digital harmony

naïve backpropagation, laziness and chord roots

The rolling-down-a-hill of backpropagation with gradient descent has proven enormously powerful. The process of identifying chords in a ternary harmony system is curiously similar, though the loss function is far more naïve and transparent. Instead of using derivatives, the algorithm simply derives the smallest earth-moving distance, at bit level. It’s rather prosaic.

But what this musical/computational process lacks in sophistication it makes up for in mechanical efficiency, and its musical yield is more than trivial. In the end, this system for categorization of chords will look like the network below, with input left and output right:

To reach this model, though, we will have to think a bit about the function of notes and chords. It is a musical topic which is easy to feel, but hard to describe. So we will advance step by step. The network/node model will prove useful in future posts. It does have a kind of neural-net look to it, and that may or may not be relevant.

notes alone

To begin with: a note alone can mean just about anything. A note tells that there has been an event whose resulting sound frequency is consistent and thus identifiable.

If we consider a single note in the twelve-note world of ternary harmony, it may seem to have any one of 49 possible functions. Below is a map of these possible harmonic roles (C is yellow: each horizontal strip represents a potential key and function) along with an audio rendering of all of those keys. There are more than might be usefully listed or meaningfully considered by ear:

This is not a practical list, at least not from a human standpoint. It is worth pointing out, though, that since only a single logic operation (&) is required to parse these possibilities, it’s pretty efficient even in the worst case.

note groups (chords)

Groups of notes tell us much more than single notes, and not only because they come in quantity. They radically limit the possibilities for definition of context (much as a few data points can indicate a whole demographic and set of behaviors). Chords, when they consist of notes separated enough not to directly interfere with each other, allow their components to remain aurally distinct. Listeners can understand them as a set of distinct identities working together. Chords are singular and plural.

Below you can see how the potential harmonic function of a chord is far more narrowly defined than that of a single note (shown in the graph above). It still gives a fair number of possibilities to consider, but it clearly provides a more focused set of possibilities. The C major triad below has 18 possible functions. (Some of these are duplicates in related keys: C Major is a tonic key, for example, but can also serve as a temporary dominant key related to F Major. This matters only in long-term memory, which we will come to later.)

A ‘seven’ chord, which uses four notes, limits the potential keys still further, to 8 (three of which are duplicates, as above). This gives a quite stable sense of tonality:
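The narrowing described above, from a single note to a triad to a seventh chord, can be sketched with nothing more than the & test. This is a rough illustration, not the code behind the figures. It assumes a bit layout that the text does not spell out here (twelve bits in circle-of-fifths order, F leftmost), the helper names are hypothetical, and it counts major keys only, so its totals are much smaller than the 49, 18 and 8 functions quoted above, which include other key types.

```python
# Assumed layout (an inference, not the author's code): twelve bits in
# circle-of-fifths order, F leftmost (most significant).
NOTES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "Eb", "Bb"]
C_MAJOR = 0b111111100000  # F C G D A E B


def mask(*names):
    """Pack note names into one 12-bit integer."""
    bits = 0
    for name in names:
        bits |= 1 << (11 - NOTES.index(name))
    return bits


def rotate_right(bits, places):
    """Circular shift within twelve bits (one place = up a fifth)."""
    places %= 12
    return ((bits >> places) | (bits << (12 - places))) & 0xFFF


def major_keys_containing(chord):
    """Tonics of the major keys whose seven bits include every chord bit."""
    found = []
    for shift in range(12):
        key = rotate_right(C_MAJOR, shift)
        if chord & key == chord:  # the single & test
            # the tonic sits one place in from the left edge of the window
            found.append(NOTES[(shift + 1) % 12])
    return found


single_note = major_keys_containing(mask("C"))
triad = major_keys_containing(mask("C", "E", "G"))
seventh = major_keys_containing(mask("G", "B", "D", "F"))
print(len(single_note), single_note)
print(len(triad), triad)
print(len(seventh), seventh)
```

In this reduced model the single note C fits seven major keys, the C major triad fits three (C, G and F), and the G dominant seventh fits only C major. The totals differ from the text, but the direction of the narrowing is the same.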

Chords alone can imply many functions, consequences, and tensions of the notes they contain, with surprising specificity.

But although chords do supply critical context for notes, they also can imply a deeper context. The context which chords require comes from remembering the past and predicting the future.

This brings us to the idea of key.

key as a hidden state

A key provides orientation for memory and prediction, but it is not directly heard at any given moment. It is only real insofar as it allows prediction and exists in memory. You might say that the idea of a key constitutes a ‘hidden state’, whose contents and meaning show up in practice and over time. Musically, this makes sense: we require not only sounds, but a field for anticipation, prediction, tension and release. Single tones are literally monotonous, and clusters of tones are figuratively monotonous. But a sense of direction and possibility is a real part of musical reality – we find these qualities between the idea of a note and the idea of a key.

Keys do not make sense when you hear their contents all at once. Here, for example, is C Major as a cluster. It’s not outright cacophonous, but does tend toward being a sound-cloud:

The pentatonic scale (aka 6/9 chord) seems to live right at the limit of what we can discern in simultaneous sounds:

What is crucial to know is that these keys – prime-numbered in their parts and distinct from one another – exist in memory and prediction. We experience them over time. They are not physical sounds.

naïve backpropagation: finding the root and degree of a chord

Given a group of notes, though, how do we find which note is the most important? It turns out that if we assume a seven-note or five-note scale (which of course includes smaller scales), we can find the root of a chord in a few steps by reaching back into the hidden states.

For each of the possible keys in which a chord might live (shown in the long lists above), we can test how efficiently the group of notes can exist within it.

Let us take the example of a C major triad (as usual). Given three notes, one of them must be the ‘root’. So let us try building in thirds from G, using the key of C major:

You can see that it takes six steps to find C again. Not very good. Next, try building in thirds from E:

Just as bad: six moves away. And starting from C?:

Three steps. Lovely. The most efficient algorithm yields the perceived root – and this discovery can be cleanly implemented using binary logic operations.
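The search just described can be sketched in a few lines. This is a minimal illustration with hypothetical helper names, not the author's implementation: the key is written as seven note names in fifths order, a diatonic third is taken to be a jump of four places in that list (wrapping at the edge), and the cost is counted in moves rather than notes, so the numbers may be off by one from the step counts in the text while the ranking stays the same.

```python
C_MAJOR = ["F", "C", "G", "D", "A", "E", "B"]  # the key, in fifths order


def moves_to_cover(chord, start, scale=C_MAJOR):
    """Stack thirds from `start` and count the moves needed before
    every note of the chord has been met."""
    if not set(chord) <= set(scale):
        raise ValueError("chord does not fit inside this scale")
    remaining = set(chord) - {start}
    position = scale.index(start)
    moves = 0
    while remaining:
        position = (position + 4) % 7  # up a diatonic third
        moves += 1
        remaining.discard(scale[position])
    return moves


def laziest_root(chord, scale=C_MAJOR):
    """The chord note from which the stack of thirds is shortest."""
    return min(chord, key=lambda note: moves_to_cover(chord, note, scale))


triad = ["G", "E", "C"]
costs = {note: moves_to_cover(triad, note) for note in triad}
print(costs)
print(laziest_root(triad))

# A partial voicing: A-flat dominant 7 without its fifth, inside D-flat major.
D_FLAT_MAJOR = ["Gb", "Db", "Ab", "Eb", "Bb", "F", "C"]
print(laziest_root(["C", "Gb", "Ab"], D_FLAT_MAJOR))
```

Starting from G or E costs six moves each, while starting from C costs two, so C is returned as the root. The same search returns Ab for the three-note dominant voicing.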

Additionally, this backprop-lite technique is quite flexible, allowing for the understanding of partial voicings. Here, for example, is the process of finding the root of a dominant 7 chord without a 5, on A-flat:

Add a flat 9 and it still gets found:

This turns out to be a process which operates with considerable precision even at the speed of audio analysis.

key as a prediction area

Once you know all the functions that a chord might serve, and what note serves as its root, you can investigate which function it does serve.

Unfortunately, there is no way of really knowing this (even within the assumption that musical tensions exist). And even inside this ternary system, there are several kinds of possible surprise, the clearest of which is that the chord serves as a hinge, fulfilling one purpose toward the past, and another the future. A chord can clearly exist in two keys at once — one for the past, and one for the future.

But assuming a key allows us to hear and predict does have some reality to it. Keys allow us to encounter many combinations of notes without making any significant alteration to underlying expectations. For example: if we hear a C Major chord, and we hear a G Major chord, we figure the two of them can come together to define a single undifferentiated bandwidth called the Key of C Major:

A ii-V-I progression is even more complete, with no note of C Major left unheard:

Last, you can begin to see (and systematically measure) not only how keys fit within a tonality, but how a prediction space can bend without breaking, through the addition of strategic single notes:

So what we hear at any moment fits into some larger pattern over time. The question becomes not only ‘how far is one chord from another?’, but ‘how far is one prediction space from another?’.

And in this way, we may be able to measure some parameters of tension, release, memory, and prediction in harmony.


scale- and chord-building

One advantage of this ternary system is that a lot of the crazy folds and skips of scales and arpeggios – even in quite advanced tonalities – turn out to be pretty simply algorithmic.

Here, for example, is what you get when you take every second bit in the F-to-B group of bits, starting on C. You will hear the whole scale as a series, then each individually, as pictured:

As you can see, and hear, every half-step in the scale comes from the space folding back on itself.

Similarly, here is what you get when you skip to every fourth bit. It will also sound familiar:

Just as in the scale, every minor interval in the arpeggio is generated by the space folding back on itself.

This seems like a lot of information for just a few bits, but it works because the information is in the system of the bits themselves.
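The two strides described above can be sketched as follows. The sketch assumes that the "F-to-B group of bits" is the seven-note window F C G D A E B, written here as a list of names rather than raw bits; the function name is hypothetical.

```python
WINDOW = ["F", "C", "G", "D", "A", "E", "B"]  # assumed order of the seven bits


def stride(start, step, window=WINDOW):
    """Visit every `step`-th place in the window until the start returns.
    Each entry is (note, folded); `folded` is True when reaching that
    note meant wrapping past the right-hand edge of the window."""
    size = len(window)
    position = window.index(start)
    visited = [(start, False)]
    for _ in range(size):
        target = position + step
        visited.append((window[target % size], target >= size))
        position = target % size
    return visited


scale = stride("C", 2)
arpeggio = stride("C", 4)
print([note for note, _ in scale])
print([note for note, folded in scale if folded])
print([note for note, _ in arpeggio])
print([note for note, folded in arpeggio if folded])
```

Every second place gives C D E F G A B C, with folds landing on F and C, the two notes reached by half-step. Every fourth place gives C E G B D F A C, with folds landing on G, D, F and C, the four notes reached by a minor third.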

building scales: a closer look

Each regular stride across the space will bring a very familiar musical structure. These combinations of individual notes begin to indicate modal areas. Chords and scales serve as intermediaries between the singleness of notes and the multiplicity of keys.

Here, for example, is a scale forming in major (the reverse of what is pictured above):

And here is a scale forming, similarly, in minor. Remember that the Eb is effectively taking the place of E, so the same ‘staircase’ pattern is in effect:

building chords: stable simultaneity

Thirds are far enough from one another in frequency not to seem like conflicting versions of the same note. So simultaneity of two voices (polyphony) becomes unmistakeable with piles of thirds.

Furthermore, this system of building chords works very nicely in minor. Here, for example, is V in A Harmonic minor, which nicely takes into account the flat 2:

modes as patterns within the patterns:

Of course, there’s a great deal going on within each key. Modes, which start on different notes of a single scale pattern, are not differentiable from each other when taken as sets (as you can see below). There is, however, an audible importance to the beginning of a pattern. These, too, are algorithmically definable.

Although it may not be directly evident from a first hearing, all of the scales in the audio above have exactly the same notes (adjusted for register). They do have different functions and balances, though, dependent solely upon which note they start with. These are the modes, and they are real enough to hear, even when they may in some ways look indistinguishable from one another.

modes in minor

Naturally, the same applies for modes of minor scales. Don’t let the fancy names at the bottom left throw you: it’s exactly the same pattern as in major. The Eb at the far right just folds into the former place of E, and the same algorithm applies: if you play the note in every second column, you will hear a scale. Different starting points yield different modes.

arpeggios: non-simultaneous chords

The impression of a key is just as strong with 3-note arpeggios as it is with entire scales. Arpeggios bring tremendous flexibility. They are sets of notes which imply larger sets of notes, and allow our ears to predict some sort of completeness beyond them. Ornament becomes possible; small surprises can appear in the gaps; a great deal ends up being ‘heard’ without ever being audible.

Here are some arpeggiated examples in both harmonic and melodic minor:

on the algorithmic generation of these models

Every one of the sequences above (and infinite others) can be generated using the absolute minimum of particular and wave representation. The sounds are sine waves based on frequencies; the images are twelve bits per row.

These harmonic sequences show a surprising amount of intrinsic information, given their small resources. The tensions inside these chords are non-trivial generative sources for musicians and listeners.

I won’t vouch for the musical value of these systems in and of themselves, but they do reflect patterns that appear commonly in both notated and improvised music – and not only in so-called Western music.

Lastly, it is worth mentioning that the total information necessary to generate each of the graph/sound pairs above is something like this:


This makes a total of 32 bytes for some fairly complex information. (This sentence is 32 bytes long.) Your average jazz standard would have an information germ of about 80 bytes, give or take, in this format.

Moreover, these numbers not only identify bit combinations in a taxonomic or indexed way, like barcodes; they define a (probable) function and direction in musical time and space. They open up possibilities for memory and for extrapolation. This kind of encoding is, to put it briefly, open for interpretation, and prepared to take off in many directions.


ternary harmonic system in color

Up to now, I’ve been showing these bit-groups in black and white, like a piano. But that doesn’t do justice to the differences within the 12-bit bandwidth. An octave makes a sort of spectrum. To begin, here is the circle of fifths, visible and audible.

This serves as a kind of identity matrix, serving specifically to mark the presence or absence of powers of three.

The Circle of Fifths as Groups of Seven Notes

Each row in the graph below contains the notes of a key, represented by seven bits. Modulation, in this system, is possible with a circular bit-shift. However, keys are large categories: when you hear the scales next to each other, they are a bit hard to distinguish, though I suppose it’s clear there’s a kind of downward spiral.
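Modulation by circular bit-shift can be sketched like this. The bit layout is an assumption (twelve bits in circle-of-fifths order, F leftmost, so that C Major occupies the first seven places), and the function names are hypothetical.

```python
NOTES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "Eb", "Bb"]
C_MAJOR = 0b111111100000  # assumed layout: F C G D A E B, F leftmost


def rotate(bits, places):
    """Circular shift within twelve bits.
    Positive `places` moves toward the sharp side, negative toward the flat side."""
    places %= 12
    return ((bits >> places) | (bits << (12 - places))) & 0xFFF


def names(bits):
    """Read a 12-bit pattern back as note names, left to right."""
    return [NOTES[i] for i in range(12) if bits & (1 << (11 - i))]


g_major = rotate(C_MAJOR, 1)
f_major = rotate(C_MAJOR, -1)
print(f"{g_major:012b}", names(g_major))
print(f"{f_major:012b}", names(f_major))
```

One shift to the right trades F for F# (G Major); one shift to the left trades B for Bb (F Major). Twelve shifts in either direction return to the starting key.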

But what’s more important is what each key is capable of containing, and how it defines not only what notes to play, but what notes one might predict even without having heard them.

Sets of Related Keys

Each major key, in this system and in conventional harmony, has a set of related keys. Here they are defined by moving one bit seven places to the left (flat-side: subdominant, parallel minor) or right (sharp-side: dominant, melodic minor, relative harmonic minor).

Interestingly, moving one bit seven places takes the same amount of ‘work’ as moving seven bits one place each, as in the circle of fifths, above.

Perhaps still more interestingly, this group of scales allows all twelve notes to be used within a single key area, as often happens in musical reality.
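Here is a hedged sketch of the related keys as single-bit moves. The bit layout (circle-of-fifths order, F leftmost) is assumed, and so is the pairing of each named key with a particular bit: that pairing is my reading of the description above, not something confirmed by the author's code.

```python
NOTES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "Eb", "Bb"]
C_MAJOR = 0b111111100000  # assumed layout: F C G D A E B, F leftmost


def move_bit(bits, position, places):
    """Move the bit at `position` (0 = leftmost) by `places`, wrapping."""
    source = 1 << (11 - position)
    target = 1 << (11 - (position + places) % 12)
    if not bits & source or bits & target:
        raise ValueError("source bit must be set and target bit clear")
    return (bits & ~source) | target


# Assumed pairing of names and moves (one bit, seven places each).
MOVES = {
    "dominant": (0, +7),                 # F -> F#
    "melodic minor": (1, +7),            # C -> C#
    "relative harmonic minor": (2, +7),  # G -> G#
    "subdominant": (6, -7),              # B -> Bb
    "parallel minor": (5, -7),           # E -> Eb
}

related = {name: move_bit(C_MAJOR, pos, places) for name, (pos, places) in MOVES.items()}

union = C_MAJOR
for bits in related.values():
    union |= bits

for name, bits in related.items():
    print(f"{name:>24}: {bits:012b}")
print(f"{'all together':>24}: {union:012b}")
```

Under these assumptions the dominant is the same pattern that a one-place circular shift of all seven bits would give, and the union of C Major with its related keys fills all twelve bits.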

Dominant Sequence around the Circle of Fifths

Here is a sequence of dominant 7th chords. Each gives the full outline of a key, but is entropic enough to be heard either as a series of individual notes or as a sequence of chords.

Dominant Sequence around the Related Keys

Here’s a similar sequence to the previous one, but with a shortcut. It finds its way home in 7 steps rather than twelve, without ever losing tonal orientation. (The evenly-spaced diminished chord in the fourth row allows the slip. ‘Harmonic Major’ is a bit tricky to hear, but it plays a role here, as it does in Cole Porter’s Night and Day.)

Sequence of chords within one key

And here is what happens when you stay in one key: the bandwidth narrows. The prediction space finds solid footing, and orientation is clear.

Sequence of chords within a minor key

A minor key, if it exists on its own, is a major key with a bit-wrinkle in it. Here, for example, the essential bandwidth of C Major remains (the range from F to B), but E becomes Eb, creating an audible change of balance.

Harmonic minor is similar. It replaces G with G#. But a curious thing happens: the G# re-orients the C Major bandwidth around A. It is almost as though one hears a new 7-tone bandwidth, running from D to G#:

So that’s a quick tour. Patterns emerge. These patterns don’t make music in themselves, but they do make some of the ingredients out of which music can be made.


  1. Each key is identifiable as a distinct 7-bit bandwidth in a 12-bit space.
  2. Related keys are definable using only simple logic operations. Minor keys are defined by the increased entropy in their organization.
  3. Because it is a bit-combination, each scale is a maximally efficient logical filter, measuring inclusion of a frequency in a key area. Categorization of musical sounds within harmony can thus occur with almost no computational overhead.
  4. Categorization of keys brings a distinct, intrinsic musical identity aligned with many existing traditions of notation – not to mention a bar-code identity.
  5. A prime numbered set of pitches allows for unique algorithmic generation of patterns-within-patterns. This will prove useful for building chords and categorizing chords, as we will see in the next post.
  6. A small backpropagation (with earth-moving distance serving as its naive loss function) can calculate the probable context of any given group of identified notes in either an audio or notation context. That will be for… two posts later.


shepard tones

Just as an experiment, here are a few chord progressions given as Shepard tones. I was curious to see/hear how they would affect the perception of voice leading. (The intonation is just: powers of 3, sequential.)

C Major: ii-V-I

ii-V-I with Shepard tones
ii-V-I with pure sine waves

C minor ii-V-i:

ii-V-i with Shepard tones
ii-V-i with pure sine waves

A harmonic minor ii-V-i:

ii-V-i (harmonic minor) with Shepard tones
ii-V-i (harmonic minor) with pure sine waves

It begins to introduce questions of voice leading and of register… and orchestration. That will all come later.

In any case, perhaps it is a useful spreading of the powers of three back into the powers of two, for the ears.
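For readers who want to try the idea, here is a minimal sketch of one common way to build a Shepard tone: octave copies of a single pitch, weighted by a bell curve over log-frequency. It is not the code used for the audio above. The function names and all parameter values (center frequency, width, audible range) are assumptions chosen for illustration.

```python
import math


def shepard_partials(pitch_hz, center_hz=440.0, width_octaves=1.5,
                     low_hz=20.0, high_hz=20000.0):
    """Octave copies of one pitch across the audible range, each paired
    with an amplitude that falls off with distance from `center_hz`."""
    freq = pitch_hz
    while freq / 2 >= low_hz:  # walk down to the lowest audible octave
        freq /= 2
    partials = []
    while freq <= high_hz:
        distance = math.log2(freq / center_hz)  # in octaves
        amplitude = math.exp(-0.5 * (distance / width_octaves) ** 2)
        partials.append((freq, amplitude))
        freq *= 2  # powers of two change register, not identity
    return partials


def sample(partials, seconds):
    """Value of the summed sine waves at one instant."""
    return sum(amp * math.sin(2 * math.pi * freq * seconds)
               for freq, amp in partials)


c_partials = shepard_partials(261.63)  # an assumed frequency for middle C
for freq, amp in c_partials:
    print(f"{freq:10.2f} Hz  amplitude {amp:.3f}")
```

Because every octave of the note is present at once, register stops being a cue, which is what makes these tones useful for listening to voice leading on its own.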


chord progressions

A few simple patterns within the ternary harmony system can yield a surprising wealth of familiar diatonic chord progressions with close to zero computational effort.

To keep things simple, I’m going to do just one sequence, reachable with only a bit shift operation. All the major/minor folds of thirds fall naturally into the right place.

And it sounds familiar (in pure ratios of 3:2 relative to C Major):

diatonic cycle in C Major. (just intonation)

Just for the sake of argument, here’s the same thing in equal temperament. It’s largely the same, perhaps a bit duller:

diatonic cycle in C Major (equal temperament)
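To put a number on the difference between the two versions, here is a small sketch comparing a chain of pure 3:2 fifths with equal-tempered fifths of exactly 700 cents. The starting note (F) and the note order are assumptions, and the function names are hypothetical.

```python
import math

NOTES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "Eb", "Bb"]


def cents(ratio):
    """Size of a frequency ratio in cents (1200 per octave)."""
    return 1200 * math.log2(ratio)


def deviation_from_equal(fifths):
    """How many cents a note `fifths` pure fifths from the start sits
    above its equal-tempered counterpart."""
    return fifths * (cents(3 / 2) - 700)


print(f"pure fifth: {cents(3 / 2):.3f} cents")
for n, name in enumerate(NOTES):
    print(f"{name:>2}: {deviation_from_equal(n):+6.2f} cents")
```

Each pure fifth is a little under two cents wider than a tempered one, so the gap grows step by step along the chain and passes twenty cents by the eleventh fifth.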

What stands out is that the chord progressions taken together form the entire scale.

The memory of all seven becomes a prediction space with a prime number of options. (This is further reinforced by the fact that many musical instruments give a strong overtone at the fifth.)
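A bit shift of this kind can be sketched as follows. I am assuming the seven-bit window F C G D A E B (F leftmost) and a leftward circular shift of one place per chord. The sequence behind the audio above may run in a different order, so treat this as one illustration of the principle rather than a reconstruction. Names are hypothetical.

```python
WINDOW = ["F", "C", "G", "D", "A", "E", "B"]  # C Major in fifths order
C_TRIAD = 0b0110010  # C, G and E within the seven-bit window


def rotate7(bits, places):
    """Circular left shift within seven bits."""
    places %= 7
    return ((bits << places) | (bits >> (7 - places))) & 0x7F


def names(bits):
    """Read a seven-bit pattern back as note names, left to right."""
    return [WINDOW[i] for i in range(7) if bits & (1 << (6 - i))]


progression = [rotate7(C_TRIAD, step) for step in range(8)]
heard = 0
for chord in progression:
    heard |= chord  # the memory of everything played so far
    print(f"{chord:07b}", names(chord), f"memory {heard:07b}")
```

Read as triads, the eight rows spell C, F, B diminished, E minor, A minor, D minor, G and C again. The major, minor and diminished qualities come only from where the pattern folds around the edge of the window, and the accumulated memory fills all seven bits well before the cycle ends.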

It turns out that a regular ‘stride’ over the prediction space forms important musical categories — key, scale, arpeggio — in an absolutely algorithmic way. The efficiency of this description in machine language will allow us eventually to seamlessly join the world of music notation to the world of audio.


musical harmony as ternary computing

This is a theory, but it works in practice, and yields surprisingly specific and rich musical results.

We know that binary numbers represent the presence or absence of powers of two:

19 (base 10) = 10011 (base 2)
  = (1 × 2^4) + (0 × 2^3) + (0 × 2^2) + (1 × 2^1) + (1 × 2^0)
  = 16 + 2 + 1

We might also use binary numbers to represent the presence or absence of powers of three:

85 (base 10) = 10011 (base 3)
  = (1 × 3^4) + (0 × 3^3) + (0 × 3^2) + (1 × 3^1) + (1 × 3^0)
  = 81 + 3 + 1

Using threes in these positions is a step toward ternary computing, using digits 0, 1, and 2.
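The two readings of the same digit string can be checked in a couple of lines. The function name is hypothetical; Python's built-in `int(text, base)` does the same job and is used here only as a cross-check.

```python
def positional_value(digits, base):
    """Sum each digit times the matching power of the base."""
    return sum(int(d) * base ** i for i, d in enumerate(reversed(digits)))


as_binary = positional_value("10011", 2)   # 16 + 2 + 1
as_ternary = positional_value("10011", 3)  # 81 + 3 + 1
print(as_binary, as_ternary)
print(as_binary == int("10011", 2), as_ternary == int("10011", 3))
```

The digits are identical in both cases. Only the weight attached to each position changes, which is what lets a plain binary pattern stand for a selection of powers of three.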

It also turns out to give a very fruitful representation of musical scales. Twos, in that scheme, turn out to be redundant (or probabilistic) when all is reduced to a single octave. So the system is surprisingly complete. It can be used to analyze audio and music notation, or to create music dynamically.

Circle of Fifths: 3^n/2^x

Instead of thinking of these powers of three as a sum (or dot product), as in a number system, we could think of them as populating a collection:

10011 = [(1)×3^4], [(1)×3^1], [(1)×3^0]

For musical purposes, it is useful to think of this as a collection of multiples of a base frequency:

10011 = [(freq)×3^4], [(freq)×3^1], [(freq)×3^0]

The circle of fifths is just such a sequence of twelve consecutive powers of three, from 3^0 to 3^11. Thus, if the above set of numbers were applied to a base frequency of middle C (as ‘freq’, above), the ones would correspond to C, G, E: a major triad. The sum of the waves at these frequencies sounds like this:

A major triad as powers of three, octave reduced.

The ratios of the complete circle of fifths pattern (starting at F) look like this in a single octave:

F:  3^0/2^0,
C:  3^1/2^1,
G:  3^2/2^3,
D:  3^3/2^4,
A:  3^4/2^6,
E:  3^5/2^7,
B:  3^6/2^9,
F#: 3^7/2^11,
C#: 3^8/2^12,
G#: 3^9/2^14,
Eb: 3^10/2^15,
Bb: 3^11/2^17

Our conventional musical notes correspond to powers of three, divided by whatever power of two is necessary to bring the frequency down into the right octave. Twos define register: if their exponent changes, their octave changes.
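The table of ratios above can be regenerated with a short loop: take each power of three and halve it until it lands inside a single octave. This is a sketch with a hypothetical function name; the note names are simply copied from the table.

```python
from fractions import Fraction

NOTES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "Eb", "Bb"]


def octave_reduced(power_of_three):
    """Return (ratio, twos): 3^n divided by 2^twos so that 1 <= ratio < 2."""
    ratio = Fraction(3 ** power_of_three)
    twos = 0
    while ratio >= 2:
        ratio /= 2
        twos += 1
    return ratio, twos


for n, name in enumerate(NOTES):
    ratio, twos = octave_reduced(n)
    print(f"{name:>2}: 3^{n}/2^{twos} = {ratio} = {float(ratio):.4f}")
```

Using exact fractions keeps the ratios free of rounding error; multiplying any of them by a base frequency gives the pitch in the octave above that base.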

In practical reality, this register change (which doesn’t change the name-identity of a note) may be important. It seems a sort of trivial reduction, but in the end the matter of register may help us identify the separateness of tones within a chord. The overtones of a pitch played on a real instrument multiply quickly beyond the first octave, but if there is a big spike within an octave of a perceived tone, it’s likely that it is the result of a secondary source.

Back to binary

In logical terms, then, we can derive the following: a tonal ‘note’ in the system is defined as the presence or absence of a power of three (with presence/absence marked by 1/0). A power of 2 changes the octave, but not the identity, of the note.

We can thus represent a group of notes as a single binary number, showing the presence or absence of the power of three relative to some core frequency.

Using this system, then, we can represent a C Major scale like this:

The above pattern serves as a kind of identity matrix, through which all other combinations of notes can be filtered, re-organized, and re-conceived.

Two things to notice: 1) this is a rearrangement of the white and black keys on the piano and 2) the order is reversed here to match the piano, ascending left to right. (C Major would be 000001111111 if the numbers went up from left to right.) I’ve put the highest-exponent-value note at the right; numbers would increase to the left…

Maybe that was a mistake from musical habit.

For finding mathematical patterns (and there are many to find), it might be best to consider it the other way around…

The Audio End

We only come across the chromatic scale insofar as we consider powers of three to be musically meaningful. And it is fair to say that it is meaningful: the ratio of 3:2 lives powerfully in real-world pitches. Here it is on a violin. The lower band is C and neighboring pitches. The upper band is G, the frequency at a ratio of 3:2:

A violin playing a middle C

On Intonation and Tuning

There’s no point insisting on the finer points of intonation at this moment. In fact, I am confident in saying, as a highly competent professional violinist, that intonation is a moving target, with ever-shifting possibilities for ‘rightness’.

It is possible, for example, to tune dominant seventh chords to the overtone series, with a low major third, and a very low seventh. It sounds good – it’s not ‘out of tune’ – but it’s quite static, and not terribly useful for polyphony. The harmonic series functions most powerfully in the area of timbre, giving focus and color rather than dynamism and note-predictions. Chords can function (and can be suggestively tuned) far more flexibly in this binary system, where consonance is defined not only by the harmonic series, but also, simultaneously, by the entropy and patterning of its powers of three.


I write and pursue this musical system for two reasons.

  1. The presence of a large amount of conventional harmony at the root of the computer suggests a durable representation in the digital age. What can show up in binary can continue to exist digitally, because it is maximally efficient in the medium/mechanism.
  2. This system suggests a kind of predictive computing, as I will suggest in upcoming posts. Perhaps it can serve as a simple, tangible example of the basic processes of networks, memory, and even backpropagation. Evidence of its usefulness can be seen in the physical disturbance that every harmony change can cause a player of an instrument.

That is to say, it seems to be real enough.


sine wave blues

I would post this as ‘blues’ only because I know it’s not blues and never should be categorised as such.

It’s just some sort of tightened symbolic rendering, some mathemagical/numerological shadow.

But here in our digital igloos, before our computers, respecting our ‘social distance’, we play with dots and waves, alone. Perhaps, at least, there is some glue in this, for us all to use later, when there are instruments and gatherings.

In any case: for beginning to play along with harmony, there is hardly a better starting point than a simple blues progression. Here’s a little low resolution, lockdown blues, with a digital absence of style as backup.

I’ve tried to keep it in a kind of digital lo-res, trying to be aesthetically clear that we’re working at the bottom of the computer, in binary, and making decisions by hand, in some kind of reality.

The whole of it derives from three numbers: 0, 1, and -1, applied to a travelling harmony state.

Below is a graph of the waveform (barely recognisable as a waveform, in the green-blue strip) of the sinewave-piled blues progression, repeated ten times (GGGG-CCGG-DCGG). Below that is a harmonic analysis of the waveform, based on a Fourier analysis. All is based in C Major (the red at the bottom), centered around G dominant (orange).

When I add a bass line and analyse the more complex waveform, the analysis turns to the flat side, centering itself, rather properly, in a B-flattish area. It’s an alternate analysis, equally valid:

If I set one of the hyperparameters higher, to filter out less of the signal, the basic structure returns, even if the extension analysis (the line at the top) disappears:

So to a large extent, the basic structure remains, even in a more complex system. Even my own live blues show through, somewhat:

Thankfully, it’s not so absolutely clear.

Harmony can indicate direction and tension in a near-total absence of timbre and dynamics. Harmony is something that folds us forward.

What we add on our own, in timbre and dynamics, is infinitely complicated, barely succumbing to any analysis, but still showing some structure. At least we know we’re on the flat side, and finding pillars of G major7. In the end, the variation within the analysis is akin to the many directions — the many potential continuities — we hear when we listen closely. It’s just at the edge of categorisation, where it ought to be.


playing along

These days, we musicians are left with a question: what is left?

practise muted violin on minor vi-ii-V-I

What is left when everything is audio? What can performance mean in this infinite, icy context? What can be the message from our fingers, our own digits, our own breath, when touch seems at best superfluous and at worst outright dangerous?

Even as the idea of performance changes radically – and I can see neither limit nor brake in the current trend away from former conventions – live performance will remain a crucial point of contact not only amongst people, but also between people and the tactile, living world.

musical questions and answers

The question even of a digitized message is not only how does it make you feel, but how does it ask you to respond? This second question makes us build music. Music is a thing which asks for a response in kind.

Here is an answer:

Autumn Leaves for violin and sine waves

To the following question from the computer:

[‘0x6c0’, ‘0x6c0’,  ‘0x590’,  ‘0x590’,  ‘0x330’,  ‘0x330’,  ‘0x660’,  ‘0x660’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x6c0’,  ‘0x6c0’,  ‘0x590’,  ‘0x590’,  ‘0x330’,  ‘0x330’,  ‘0x660’,  ‘0x660’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x6c0’,  ‘0x6c0’,  ‘0x590’,  ‘0x590’,  ‘0x330’,  ‘0x330’,  ‘0x660’,  ‘0x660’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x20b’,  ‘0xd80’,  ‘0x82c’,  ‘0x660’,  ‘0x660’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x6c0’,  ‘0x6c0’,  ‘0x590’,  ‘0x590’,  ‘0x330’,  ‘0x330’,  ‘0x660’,  ‘0x660’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x6c0’,  ‘0x6c0’,  ‘0x590’,  ‘0x590’,  ‘0x330’,  ‘0x330’,  ‘0x660’,  ‘0x660’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x6c0’,  ‘0x6c0’,  ‘0x590’,  ‘0x590’,  ‘0x330’,  ‘0x330’,  ‘0x660’,  ‘0x660’,  ‘0x4d0’,  ‘0x4d0’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x20b’,  ‘0xd80’,  ‘0x82c’,  ‘0x660’,  ‘0x660’,  ‘0xb2’,  ‘0xb2’,  ‘0x262’,  ‘0x262’,  ‘0x262’,  ‘0x262’] 
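For anyone curious what these codes contain, here is a small decoding sketch. It assumes each code is a 12-bit pattern in circle-of-fifths order with F as the leftmost bit, which is my inference from the earlier posts rather than a documented format; the function name is hypothetical and this is not a harmonypartition call.

```python
NOTES = ["F", "C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "Eb", "Bb"]


def decode(code):
    """Turn a hex string into the note names whose bits are set,
    reading the twelve bits left to right (assumed layout)."""
    bits = int(code, 16)
    return [NOTES[i] for i in range(12) if bits & (1 << (11 - i))]


for code in ["0x6c0", "0x590", "0x330", "0x660", "0x4d0", "0xb2", "0x262"]:
    print(f"{code:>6}  {int(code, 16):012b}  {decode(code)}")
```

Under that assumption every code in the opening phrase holds four notes, and the first one, 0x6c0, comes out as C, G, A and E, which would make it an A minor seventh chord.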

What I hope to be able to give with harmonypartition is not a technique for automation, but rather a language for prediction which allows multiple, flexible and complementary solutions. If we learn to play over chords, we will learn how to accommodate multiple voices, and we can play with each other, with infinite variety, in timbre, timing, register…

The computer can help with this, perhaps.

what is (or was) a concert?

Consider a concert, in the abstract. Concerts were always call-and-response at their core, regardless of genre (Wikipedia definition of call-and-response notwithstanding).

One group of people, identifiable as a ‘group’ by costume or location, would make sounds together in an identifiable way, while another group of people listened and watched. As the concert proceeded, the audience also made sounds, usually of approval or encouragement. Sometimes this came as ‘applause’: handmade pink noise.

The computer sends echoes, but that is not sufficient: we need responses, not echoes. A response is not a right or wrong answer: a response is something which answers affirmatively that someone is indeed out there.

Simulation of a response is perhaps the only wrong answer to a musical question. Simulation… or absence, which are weirdly similar to one another.

When we play with each other, live, we constantly respond, with infinite complexity. We applaud someone else’s music when we respond to it; we give it a history, a trajectory, a social meaning. We miss a part of it, we take a part of it; we transform it, we preserve it.

It is a miracle that it can be remembered at all, let alone taken further. It was all a memory, even when it happened.


tonality => memory

The question that keeps appearing before me is about what changing keys does to your memory. I’d like to build a framework to seek a more particular answer.

long-term and medium-term tonal memory: ii vs. V/V

To begin down this path, I’m going to start with the vi-ii-V-I sequence I’ve been going on about in the previous posts.

I’d like to contrast vi-ii-V-I with a vi-V/V-V-I sequence, which is only one note away, but makes a real difference in perceived brightness and in orientation.

Here they are to compare as audio files. First, a clean vi-ii-V-I:

Then, something similar, but with V/V instead of ii (the second chord is brighter):

Notice the change to orange in the medium-term memory, below the yellow

The analysis here shows the V/V changing something in the medium-term memory below the yellow box (look closely for the orange stripe below the yellow). It also does not disrupt the longest-term memory (consistently red, C Major, at the very bottom).

memory of major in relative minor

The relative minor is also one parameter away, and we can see that as relative minor, it also contains a memory of C Major (red) in the long-term memory, though the medium-term memory settles in around D minor:

This is a preliminary change in tonality: something that leaves a near-audible trace in the medium-term memory. The change could be as short as a V/V, or could settle into a related key.

Here it settles into A minor, which is where the major sequence starts — put them together and you have something like ‘Autumn Leaves’, plus a bit of a new image of how Autumn Leaves descends in harmony as well as melody.

song structure: Autumn Leaves

So here are the first seven chords of Autumn Leaves analysed as audio:

And here as analysed from within the system’s clean taxonomy.

There are some minor differences in the long-term memory of these two versions, which is fantastic. The audio analysis shows that the music must find its way from a ‘real’, unprepared entry point; the prescribed, systematic version knows exactly that it is in C Major. Our ears are somewhere in between, or around, these two possibilities.

This flexibility, which allows the same music to be analysed in multiple ways simultaneously (by different ears or different lines of thought), is what allows for flexibility in other musical parameters: dynamics, ornaments, tension.

Thus, two chords or sequences can sound exactly the same, but longer-term memory can always indicate other possibilities for memory and prediction.

It might be called ‘room for interpretation’ – though interpretation is a wider field. But above all it indicates an irreducible volatility, full of potential, at the bottom-most level of computational analysis.

Somewhere, far above this analysis, lurks Cannonball Adderley. And the guys playing Autumn Leaves on street corners, giving ears and lives a bit of flow as they go.


midi => harmony

harmonypartition works simply and cleanly with MIDI files.

Using the music21 package, harmonypartition is able to analyze a MIDI file in exactly the same way it analyzes an audio file.

Results from MIDI are cleaner than those from audio – though not necessarily better, since resonance and continuity play such a key role in analysis, performance, and understanding. But MIDI is an important and stable reference in music notation, and provides invaluable information on many levels.

analysing a MIDI file

Below is an analysis of the same progression as on the previous pages: vi-ii-V-I in C Major, from a MIDI file:

bin_a, kpdve_a = pt_analyzeMIDI.analyze_notation_file("vi_ii_V_I.mid", beats_per_slice=2)

This analysis (like all analyses in harmonypartition) yields two crucial pieces of information. The first is a binary ‘punch card’ of pitches. The second is an analysis of probable function. These can be seen below. First, in binary (with analysis given below):

Second, coded by proximity and most probable (or, more to the point, laziest possible) harmonic function:

harmonypartition analysis, with chords (large blocks) in C Major (bottom stripe)

This analysis is very close to what derives from the Fourier analysis of a live performance of the same music, here played on a lightly out-of-tune piano:

vi-ii-V-I live on funky piano

further examples: MAESTRO dataset

Then, to take a leap away from the vi-ii-V-I, here is a somewhat more sophisticated example: Prelude and Fugue in A Minor, WTC II, BWV 889 (Johann Sebastian Bach), from the MAESTRO dataset. First an analysis of the audio recording:

Audio/Harmony analysis: Prelude and Fugue in A Minor, WTC II, BWV 889 (Johann Sebastian Bach) — using librosa stft

And here an analysis of the MIDI version of the same performance:

MIDI/Harmony analysis: Prelude and Fugue in A Minor, WTC II, BWV 889 (Johann Sebastian Bach) — using music21 for chord extraction.

These analyses could use some smoothing and statistical comparison, but all in all, it’s not bad for pure binary logic – which seems like a good, rooted place to have a meeting point.