BTW, I suspect you'll need the prediction-assisted SP for this to work well, so Felix's Comportex would be worth starting with.
On Sun, Jun 14, 2015 at 2:16 PM, Fergal Byrne <[email protected]> wrote:

> Great talk, Charlie, thanks for coming and sharing your passion for this stuff. I'm about as unmusical as any human, which is still pretty musical, so it's great to learn from someone who's been thinking about this so much.
>
> Here's a kind of outline for a campaign to do the MIDI analysis. Let me know if this makes sense. I have some ideas about the audio-to-MIDI step, but they need this working first.
>
> From both a theory and a practice point of view, we're concerned with what work needs to be done subcortically (the encoder) and what can be learned cortically (in the HTM). Nature will hopefully have made similar decisions, based on a similar tradeoff. Encoders have to be evolved (or programmed, in our case), which is very expensive compared to learning in cortex (or just shovelling data into HTM). But you can't just shove raw data into HTM, because it needs some semantic encoding to extract structure from. You also aren't going to persuade Nature to evolve encoders for music in anticipation of it, so there's a strict upper bound on the hand-engineering of encoders.
>
> The analogy with vision is important. It suggests we need a hierarchy (including the encoders) which at the bottom is topologically mapped to the MIDI channels (which are a proxy for cochlear frequency bands). To first order, let's ignore velocity values and just pretend it's a binary stream. To simplify further (I believe your preprocessor does this), we should limit temporal resolution to fixed bins, and thus eliminate very fine differences in onsets and offsets across channels from the performance, which will then more closely resemble the composition.
>
> OK, now the musical information is contained in the position and timing of onsets and (less importantly and usefully) offsets. Assuming we're streaming, we won't know when the offsets are until they happen.
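The fixed-bin simplification above can be sketched in a few lines of Python. The event format, bin size and function names here are illustrative assumptions, not anything agreed in the thread:

```python
# Quantise MIDI note events into fixed time bins, producing one binary
# "piano roll" row per bin (pitch x sounding-or-not; velocity ignored).
# Event format (pitch, onset_sec, offset_sec) and bin size are assumptions.

BIN_SIZE = 0.125   # seconds per bin, e.g. a 16th note at 120 bpm
NUM_PITCHES = 128  # MIDI pitch range 0-127

def to_piano_roll(events, bin_size=BIN_SIZE):
    """events: list of (pitch, onset_sec, offset_sec) tuples."""
    if not events:
        return []
    num_bins = int(max(off for _, _, off in events) / bin_size) + 1
    roll = [[0] * NUM_PITCHES for _ in range(num_bins)]
    for pitch, onset, offset in events:
        start = int(onset / bin_size)
        end = max(start + 1, int(offset / bin_size))  # at least one bin
        for b in range(start, min(end, num_bins)):
            roll[b][pitch] = 1
    return roll

# Example: middle C (60) held for 0.3 s, E (64) from 0.25 s to 0.5 s
roll = to_piano_roll([(60, 0.0, 0.3), (64, 0.25, 0.5)])
```

Any sub-bin differences in onset/offset timing simply collapse into the same bin, which is exactly the "performance noise" the binning is meant to discard.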
> So a simple encoder would have a bit per channel per non-off state: onset, on, offset. We usually give these each a small width so they're potentially seen by lots of columns; let's say 8 replicated bits per channel per state. We can reduce this to 4 or 2 at the extremes if those channels are seldom used in real MIDI files. So the encoding would be, say, 64 * 3 * 8 + 32 * 3 * 4 + 32 * 3 * 2 = 1536 + 384 + 192 = 2112 bits.
>
> You'd then put a topological HTM layer (L4) on top of this, with each column seeing inputs from, say, an octave up and down. The SDRs this would produce after learning would include localised representations of individual chords in each stage of their existence, so the columns for the middle-C major chord onset will appear when that chord is first played, followed by columns for middle-C major "on", then middle-C major "off", and then none of the above. A temporal pooling layer (L2/3) above this will show "middle-C major" during the on-time of the chord, and can learn that melody using sequence memory.
>
> To track multiple voices, you may be able to use this one region. The sequences of chords in L2/3 might be predictable enough to hold together in parallel, but you'll still get a union of their SDRs. Or you might need another region on top to separate the voices.
>
> You can do key identification simply by counting up the L2/3 chords (per voice) used in recent bars of the music. The chords which occur over a short period give information about the probable key. Some of the L2/3 columns will automatically indicate this because they happen to be temporally pooling over exactly the set of chords for one key. A classifier or higher region trained to predict just the key (over many melodies, each in the same key) will give you the key.
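A quick sketch of that bit layout, using the 64/32/32 channel grouping and 8/4/2 widths above. The indexing scheme (contiguous blocks per channel, per state) is just one plausible choice, not something fixed by the proposal:

```python
# Sketch of the proposed note-state encoder: each channel gets a block of
# replicated bits per state (onset / on / offset). Channel counts and bit
# widths follow the email's sizing; the index layout is an assumption.

STATES = ("onset", "on", "offset")

# (number_of_channels, replicated_bits_per_channel_per_state)
GROUPS = [(64, 8), (32, 4), (32, 2)]

TOTAL_BITS = sum(n * len(STATES) * w for n, w in GROUPS)  # 2112

def encode(channel, state):
    """Return the set of active bit indices for one (channel, state) pair."""
    offset = 0
    for n_channels, width in GROUPS:
        if channel < n_channels:
            base = offset + (channel * len(STATES) + STATES.index(state)) * width
            return set(range(base, base + width))
        offset += n_channels * len(STATES) * width
        channel -= n_channels
    raise ValueError("channel out of range")

# The full input SDR for one time bin is the union over sounding notes:
sdr = encode(0, "onset") | encode(60, "on")
```

Taking the union of active (channel, state) pairs per time bin gives the spatial pooler the kind of sparse, semantically overlapping input it needs.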
> To identify a melody, you'd need to add another region above this analysis level, which would take chord SDRs from the lower region's L2/3 into L4 and temporally pool over them to produce melody SDRs.
>
> To learn composer, style or mood, you'd provide the "melody identifier" region with encodings of the composer, style or mood as L1 input, along with the melody data from this level. You'd feed the output of the higher region in as L1 predictive input to the L2/3 layer and have all regions learn together. This would potentially give you a generative model which might reproduce Bach- or Beethoven-style music.
>
> Regards,
>
> Fergal Byrne
>
> On Sat, Jun 13, 2015 at 8:19 PM, Matthew Lohbihler <[email protected]> wrote:
>
>> Perhaps it's the opposite, and explains tinnitus. Being someone who has this, I can confirm that the pitch is precisely constant.
>>
>> On 6/13/2015 2:34 PM, Tim Boudreau wrote:
>>
>> Interesting idea, but for it to be correct, shouldn't we have observed cases of highly pitch-specific deafness - i.e. you lose 435-445 Hz, but 446 Hz is fine?
>> -Tim
>>
>> On Sat, Jun 13, 2015 at 2:30 AM, Matthew Taylor <[email protected]> wrote:
>>
>>> Interesting reading:
>>> http://hyperphysics.phy-astr.gsu.edu/hbase/sound/place.html
>>>
>>> Sent from my MegaPhone
>>
>> --
>> http://timboudreau.com
>
> --
>
> Fergal Byrne, Brenter IT @fergbyrne
>
> http://inbits.com - Better Living through Thoughtful Technology
> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>
> Founder of Clortex: HTM in Clojure - https://github.com/nupic-community/clortex
> Co-creator @OccupyStartups Time-Bombed Open License http://occupystartups.me
>
> Author, Real Machine Intelligence with Clortex and NuPIC
> Read for free or buy the book at https://leanpub.com/realsmartmachines
>
> e:[email protected] t:+353 83 4214179
> Join the quest for Machine Intelligence at http://numenta.org
> Formerly of Adnet [email protected] http://www.adnet.ie
