Hi,

Thanks for that. I'm actually in the middle of trying to revise one of the
core assumptions I was making in the approach I described here initially.

What I was doing, which is what a lot of the neural and other learning
systems were/are doing (and what that one looks like it may be doing at
first glance), was learning general priors on which program operations
tend to go well together (if + null?, map + lambda, etc.).
What bothered me, although I couldn't figure out how to use it, was that we
have the live environment to work with at every step of synthesis, rather
than just the general language grammar & syntax.

This comes from the fact that the relational interpreter is still an
interpreter, and not just a program synthesizer that generates programs to
be run later. You could think of it in terms of not needing to know that
you are nested inside an (if _ ...), because you know that the return value
has to be true or false, which summarizes the context needed to figure out
how to turn your input into a true or a false. I'm still in the
not-sure-my-mental-model-is-accurate phase of testing this, but I have a
preliminary implementation of something that will hopefully pull all
available information about inputs and outputs as far down the search tree
as possible, so that a learning task can be formulated on the basis of that
data; nothing conclusive yet.

Anyway, that might be something to think about when trying to come up with
something to do with learning in this space. We may actually have more
pieces on the table than a lot of other synthesis projects, so we may need
to do things they don't to make use of them.

Cheers,
Evan

On Sat, Sep 2, 2017 at 3:49 PM, Timothy Washington <[email protected]>
wrote:

> Hey Evan,
>
> Wrt algorithms for searching the symbol space, I've been investigating
> RNNs and have come across the notion of a Neural Programmer.
>
>    - https://distill.pub/2016/augmented-rnns/#neural-programmer
>
>
> Not sure if this helps your approach, but I thought I'd throw it out there
> anyway. I'm going to try to dive into it in the coming weeks. And here
> are some of the references from the above link, for you guys to take a look
> at.
>
>    - https://arxiv.org/pdf/1511.04834.pdf
>    - https://openreview.net/pdf?id=ry2YOrcge
>    - https://github.com/tensorflow/models/tree/master/neural_programmer
>    - https://github.com/mokemokechicken/keras_npi
>
>
> Hth
> Tim
>
> On Wednesday, 14 June 2017 15:19:38 UTC-4, Evan Donahue wrote:
>>
>> Sure, so here is the trace for the program that generates a list of two
>> quoted values:
>>
>> '(eval 3 list proper 2 nonempty 1 eval ((1 quote) (2 proper 2 nonempty
>> ((1 eval 1 quote) (2 proper 1)))))
>>
>> I am using human readable tags for the functions and numbers for the
>> conde branches, I think in the same way you are. You will notice the tree
>> has 3 leaves, one for each eval'd element of the list and one for the final
>> nil. Because I'm feeding it to a simple rnn right now, I flatten this to a
>> simple sequence:
>>
>> "((eval) (eval 3) (list) (proper) (proper 2) (nonempty) (nonempty 1)
>> (eval) (eval 1) (quote) (nonempty 1) (eval) (eval 1) (quote) (nonempty 1)
>> (eval) (eval 1) (quote))"
>>
>> Where each inner list should be taken together as an opaque token. I'm
>> also working in part from that grammar autoencoder paper, although I am
>> currently building these traces on the fly and without a full reference
>> grammar, which makes following the precise method kind of tricky. I may
>> bite the bullet and extract the grammar as a pre-step before starting the
>> process, but we'll see.
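
As a rough illustration of that flattening step, here is a Python sketch of turning a tree-structured trace into a depth-first token sequence. The tree shape and token names are placeholders of my own; the actual traces pair each function tag with its conde branch number as shown above, which this sketch does not reproduce.

```python
# Sketch: flatten a tree-structured execution trace into a depth-first
# sequence of tokens. Leaves are opaque tokens; a nested list holds the
# sub-traces of the branches hanging off that point.

def flatten_trace(trace):
    tokens = []
    for item in trace:
        if isinstance(item, list):
            # nested branches: recurse into each sub-trace in order
            for branch in item:
                tokens.extend(flatten_trace(branch))
        else:
            tokens.append(item)  # opaque token, kept in order
    return tokens

# A toy trace with two branches hanging off the 'list' node:
tree = ["eval", "list", [["quote", "1"], ["proper", "nil"]]]
flatten_trace(tree)  # ['eval', 'list', 'quote', '1', 'proper', 'nil']
```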
>>
>> I'll be interested to hear if you think up good ways to encode the data,
>> since I'd like to be able to feed that information to the rnn and use it to
>> further bias the search. I haven't thought too much about it yet, but one
>> thing that has occurred to me is that maybe it would be possible to
>> repurpose the traces generated by the interpreter to encode the
>> input/output data. Eg, given input '(1 2) and output '(2) generate the
>> canonical program that produces this data, eg (cons (list (quote 1) (quote
>> 2)) (list (quote 2))) and run the interpreter on it to generate a trace
>> that can be encoded by whatever you are using to encode normal program
>> traces. This would still leave the actual atomic elements (numbers,
>> symbols...) unspecified, but maybe you could work them in with a vector
>> representation that reserved say 3 bits for a 1 hot encoding of the type
>> (number, symbol, boolean) and the rest for a type specific encoding (binary
>> encoding for numbers, one hot encodings for distinct symbols within a given
>> execution of the program, etc). Depending on what one was doing, I'm also
>> interested in some of the stuff that has gone out to the mailing list about
>> vectorized 'unification' so if symbols are meaningful knowledge within a
>> domain rather than just opaque gensyms for specifying a scheme function,
>> maybe actually learning embeddings for the symbols could be part of the
>> execution. Lots of partial thoughts on this, but nothing too concrete yet.
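
To make that atomic-element encoding concrete, here is a small Python sketch: three bits of one-hot type tag plus a fixed-width type-specific payload. The payload width, the type set, and all names here are my own assumptions, just one way the scheme described above could look.

```python
# Sketch: encode an atomic value as a fixed-width bit vector.
# First 3 bits: one-hot type tag (number, symbol, boolean);
# remaining bits: type-specific payload (binary for numbers,
# per-execution one-hot slot for symbols). Widths are illustrative.

PAYLOAD_BITS = 8

def encode_atom(value, symbol_index):
    """symbol_index maps each distinct symbol seen in this execution
    of the program to a small integer (a per-run one-hot slot)."""
    if isinstance(value, bool):          # check bool before int!
        type_tag = [0, 0, 1]
        payload = [1 if value else 0] + [0] * (PAYLOAD_BITS - 1)
    elif isinstance(value, int):
        type_tag = [1, 0, 0]
        payload = [(value >> i) & 1 for i in range(PAYLOAD_BITS)]
    else:                                # treat anything else as a symbol
        type_tag = [0, 1, 0]
        slot = symbol_index.setdefault(value, len(symbol_index))
        payload = [1 if i == slot else 0 for i in range(PAYLOAD_BITS)]
    return type_tag + payload

syms = {}
encode_atom(5, syms)    # number: one-hot type + binary payload
encode_atom("x", syms)  # symbol: one-hot slot assigned lazily
```

Symbols get their one-hot slots lazily, so the indices are only meaningful within a single execution, matching the per-execution scoping of symbols described above.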
>>
>> I'm intrigued by where you're going with lookup, because that is also
>> something I'd love to have a good answer for. Ideally it would be nice to
>> be able to have a large library of functions in the environment without
>> totally sacrificing our ability to use them. I've been wondering if some of
>> the finite domain constraint stuff could be repurposed to do a more direct
>> lookup computation, possibly along the lines you are suggesting, but I
>> haven't made too much progress in fleshing that out so far.
>>
>> Cheers,
>> Evan
>>
>> On Wed, Jun 14, 2017 at 9:54 AM, Smock Jonathan <[email protected]>
>> wrote:
>>
>>> Do you have an example trace that you can paste? That's really
>>> interesting. My trace/path is just a sequence, not a tree (shown at the
>>> very bottom).
>>>
>>> Btw, for those interested, I've been trying to figure out how to take
>>> trees of data and turn them into a feature vector/embedding. Does anyone
>>> have good resources for this? The best I can find so far is this, which
>>> uses both a context-free grammar and an autoencoder to generate molecules:
>>> https://www.youtube.com/watch?v=ar4Fm1V65Fw. My goal was to take data
>>> structures and turn them into feature vectors and then use those to make
>>> feature vectors for functions. Hopefully having feature vectors for
>>> functions gets us out of the scalability problem for `lookup` - i.e.
>>> instead of trying all functions, we have an intuition about what kind of
>>> function we need/want and then do a lookup in all existing functions by
>>> feature vector for nearest neighbors (or use feature vector to create a
>>> heuristic through the interpreter).  I think my first step is to try and
>>> replicate what he's doing in that video though, which will take me some
>>> time to figure out.
>>>
>>> Anyway, here's my substitution store (Clojure):
>>> {:st/path []}
>>>
>>> I only add to the path for disjunctions (btw, I realized yesterday that
>>> Greg said he only reorders conjunctions in Barliman, which I have to think
>>> about more).
>>>
>>> My implementation very much follows the microKanren approach (b/c that's
>>> what I understand ...). Here's the simplified version of my disjunction:
>>>
>>> (defn path-conj [st direction]
>>>   ;; record which disjunction branch we are descending into
>>>   (update st :st/path conj direction))
>>>
>>> (defn disjunction [& goals]
>>>   (fn [st]
>>>     (->> goals
>>>          (map-indexed (fn [idx g]
>>>                         ;; run each goal on a state tagged with its index
>>>                         (g (path-conj st idx))))
>>>          vec
>>>          ;; interleave-all (a utility fn, e.g. from medley) round-robins
>>>          ;; the result streams, analogous to miniKanren's mplus
>>>          (apply interleave-all))))
>>>
>>> `map-indexed` gives each goal in the disjunction a number/index, and we
>>> record that in the path as we descend into that branch. So, when we're
>>> done, our path might look like [1 1 0], meaning we took the 1th branch,
>>> then the 1th branch of the next disjunction, and finally the 0th branch of
>>> the last disjunction. I considered labeling conde's or-branches too, so
>>> it's neat that you're doing something like that.
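
For anyone following along without the Clojure, here is a tiny Python analogue of that path bookkeeping. It is a toy sketch of my own (not miniKanren API): it omits the interleaving that `interleave-all` does in the real version and just concatenates branch results, but it shows how the [1 1 0] path accumulates.

```python
# Sketch: record disjunction branch indices in a path, mirroring the
# path-conj/disjunction idea above. Interleaving is omitted; the real
# version round-robins the branch streams.

def disj(*goals):
    """A goal that tries each sub-goal, tagging the state's path
    with the index of the branch being descended into."""
    def run(state):
        results = []
        for idx, goal in enumerate(goals):
            tagged = {**state, "path": state["path"] + [idx]}
            results.extend(goal(tagged))
        return results
    return run

fail = lambda st: []      # goal with no answers
unit = lambda st: [st]    # goal that succeeds once, unchanged

# Nested disjunctions: the succeeding goal sits on branches 1, 1, 0,
# so the surviving state carries the path [1, 1, 0].
goal = disj(fail, disj(fail, disj(unit)))
goal({"path": []})
```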
>>>
>>> My initial goal in adding the path was to be able to stop the
>>> computation at some arbitrary depth, serialize the substitution store,
>>> deserialize it somewhere else, and then resume where it left off. I
>>> realized after doing that that I could simply serialize the path alone, as
>>> long as I'm willing to do the work of rebuilding the substitution store as
>>> I navigate through to that path to resume.
>>>
>>>
>>> On Tue, Jun 13, 2017 at 6:37 PM, William Byrd <[email protected]> wrote:
>>>
>>>> Thanks Evan!
>>>>
>>>> Yes, Sunday's Advanced Hangout would be a great time to talk about
>>>> this implementation, and for anyone interested in learning for
>>>> miniKanren to share notes.
>>>>
>>>> --Will
>>>>
>>>>
>>>> On Tue, Jun 13, 2017 at 4:02 PM, Apocalypse Mystic
>>>> <[email protected]> wrote:
>>>> > Hello,
>>>> >
>>>> > Since it seems that there is growing activity around applying machine
>>>> > learning to minikanren at various levels, Will asked if I would share
>>>> the
>>>> > approach I have been taking. To the extent that there is broader
>>>> interest in
>>>> > mk machine learning, it probably makes sense for interested parties
>>>> to be in
>>>> > touch.
>>>> >
>>>> > Background:
>>>> >
>>>> > Currently, I have what seems like a complete implementation for at
>>>> least one
>>>> > approach to integrating neural nets with mk search. However, there
>>>> remain
>>>> > some open questions, such as what to train it on and how to evaluate
>>>> it once
>>>> > trained. Moreover, it seems likely that there are entirely other
>>>> approaches
>>>> > that may or may not be related to this one. In any case, this approach
>>>> > should be fairly easy to implement with minor changes to any standard
>>>> > minikanren implementation, so if anyone is interested in using any
>>>> part of
>>>> > it to try to integrate it with an existing implementation of a given
>>>> mk
>>>> > program, such as the reference implementation of Barliman, I can
>>>> describe it
>>>> > in more detail. This email will just be a basic overview.
>>>> >
>>>> > The goal of this approach is to derive a general way to turn a
>>>> minikanren
>>>> > program into a machine learning problem, so that any application
>>>> developed
>>>> > in minikanren can also implicitly be trained. Although I had
>>>> > Barliman/program synthesis in mind as a prime example, I did not use
>>>> any
>>>> > Barliman-specific knowledge, so that the system would be applicable to
>>>> > Barliman as well as to other programs. For the sake of explanation,
>>>> however,
>>>> > I will describe the particular setup in terms of how it works in
>>>> Barliman.
>>>> >
>>>> > Overview:
>>>> >
>>>> > As a simplification, I decided to set up the problem in three stages:
>>>> > 1) Barliman would be run like normal, and each returned package would
>>>> > contain, in addition to the substitution and constraint store, an
>>>> "execution
>>>> > trace" representing the choices made in constructing this particular
>>>> > package. Similar programs should have similar execution traces, which
>>>> forms
>>>> > the basis of learning. Given a corpus of programs, this would take
>>>> the form
>>>> > of running Barliman once on each fully ground program, so that we can
>>>> > recover the trace representing that program. Given no corpus, we can
>>>> use
>>>> > Barliman to just generate a bunch of programs and use those for
>>>> training,
>>>> > although in this case we need to make sure we have a way to
>>>> distinguish good
>>>> > and bad programs so that we can set the rewards appropriately.
>>>> > 2) Outside of minikanren, a machine learning model (in my case, a
>>>> simple
>>>> > RNN) is trained on the program traces, which are tree structures,
>>>> although I
>>>> > flatten them to sequences for training.
>>>> > 3) The model is fed back into subsequent runs of the minikanren
>>>> program, and
>>>> > its predictions are used to guide the search.
>>>> >
>>>> > The overall effect is that the model represents a bias over the search
>>>> > space, but does not update live during the search. There are a number
>>>> of
>>>> > reasons for this, all of which have to do with my inability to arrive
>>>> at a
>>>> > satisfactory interpretation of what success and failure mean in the
>>>> middle
>>>> > of a given search, but I suspect that, depending on the problem, a
>>>> model
>>>> > that updates live during the search would probably be a valid
>>>> approach and
>>>> > possibly easily reachable from this implementation, although for the
>>>> moment
>>>> > I am not pursuing it. I will note however that this "bias" can be a
>>>> fairly
>>>> > flexible concept per-search. Eg, one could imagine a bias over all
>>>> scheme
>>>> > programs in Barliman that down-weighted lambdas full of unused
>>>> arguments and
>>>> > other frequently vestigial components of synthesized programs, but
>>>> one could
>>>> > also imagine a bias tuned to the specific input/output constraints,
>>>> so that
>>>> > it weighted cons more highly if the output was a list vs a number,
>>>> etc.
>>>> >
>>>> > Implementation:
>>>> >
>>>> > Since I implemented this in Smalltalk, there are probably a few
>>>> > implementation details that will differ in the lisp/scheme variants.
>>>> I will
>>>> > try to give a general sketch of what I think such an implementation
>>>> would
>>>> > look like. Roughly, it will require two main changes to the basic mk
>>>> > implementation:
>>>> >
>>>> > 1) Forms that delay the search, like fresh and conde, should have
>>>> unique
>>>> > global id's, and add them to the trace of the packages that they
>>>> process.
>>>> >
>>>> > This may be easier in the scheme/lisp implementations, where you have
>>>> a
>>>> > clear macro expansion time and a module boundary that can contain a
>>>> global
>>>> > counter, although maybe not. The goal is that each package, as it
>>>> passes
>>>> > through the goals, will accumulate a history of all the goals it has
>>>> seen.
>>>> > Eg, an answer that represents a variable lookup on the third variable
>>>> > in the environment might look something like a less human-readable
>>>> > version of:
>>>> > '(eval-conde-2 lookup-fresh lookup-fail-conde lookup-fail-conde
>>>> > lookup-success-conde). It is important that the same symbol be given
>>>> > by the "same" goal within a recursive step, so taking the second
>>>> > branch in the main eval conde should always add eval-conde-2 to the
>>>> > trace.
>>>> >
>>>> > One extra potential complication here is the handling of conjunctions
>>>> of
>>>> > more than one conde. This may not be necessary for a basic
>>>> implementation.
>>>> > If we think of the interpreter as defining an implicit grammar over
>>>> the
>>>> > execution traces of the program, eg:
>>>> >
>>>> > Eval => Quote
>>>> > Quote => quoted-value
>>>> > Eval => Lookup
>>>> > Lookup => lookup-failure Lookup
>>>> > Lookup => lookup-success
>>>> >
>>>> > Then we can recognize that the trace may not, in general, be a list.
>>>> For an
>>>> > interpreter with only lookup and quote, it will be, because these
>>>> rules are
>>>> > both regular, and so a sequence of trace id's uniquely defines the
>>>> "parse"
>>>> > of grammar rules used by the interpreter to arrive at its final
>>>> package.
>>>> >
>>>> > Once we add in rules like list:
>>>> >
>>>> > ProperList => empty-list
>>>> > ProperList => destructure-list Eval ProperList
>>>> >
>>>> > We can see that the "grammar" may potentially be context free. Of
>>>> course, in
>>>> > reality, even calling it context free is a modeling approximation. In
>>>> some
>>>> > minikanren programs, this will never come up (either because all the
>>>> "rules"
>>>> > are regular, or the "cfg" is unambiguous given a sequential trace
>>>> anyway),
>>>> > and in some machine learning models, this may not really matter, if
>>>> the
>>>> > model is only using information from the sequence and not from the
>>>> full
>>>> > hierarchy. Still, all that is needed to capture the extra information
>>>> is
>>>> > that fresh/cond forms that contain multiple condes should create bind
>>>> forms
>>>> > that essentially take the sequential traces from each conde and merge
>>>> them
>>>> > together up to the end of their common prefix to create a tree
>>>> structured
>>>> > trace. Eg: '(eval-conde-3 proper-list-fresh proper-list-nonempty-conde
>>>> > ((eval-conde2 ...) (proper-list-fresh ...)))
>>>> >
>>>> > 2) The interleaving streams of closures should be reified as structs
>>>> so that
>>>> > the order of the search can be controlled by the trained model.
>>>> >
>>>> > This is a straightforward translation of the stream logic into a
>>>> reified
>>>> > form of the same stream logic. Every time the package is extended
>>>> with a new
>>>> > trace id, the model is run on the current trace and the package is
>>>> tagged
>>>> > with a score. When packages are mplused together, mplus checks which
>>>> branch
>>>> > has the max score, and interleaves that branch first, caching the
>>>> score in
>>>> > the new mplus struct for further checking higher up the tree. If the
>>>> scores
>>>> > are the same, it interleaves like normal, so the initial run should
>>>> use a
>>>> > dummy model that just scores everything with a 0. It is also possible
>>>> to
>>>> > choose probabilistically, etc instead of using the max.
>>>> >
>>>> > I've definitely left out some implementation details that would be
>>>> useful
>>>> > for actually making this work, but I think I've said enough already,
>>>> so I
>>>> > can go into more detail if anyone wants to hear it. Just let me know.
>>>> Also,
>>>> > I think there is a plan to talk about this more in the advanced
>>>> hangout on
>>>> > Sunday? Will, correct me if I'm wrong. So that would be a good time to
>>>> talk
>>>> > about it if this is interesting to anyone.
>>>> >
>>>> > Currently, I'm refactoring the above code a bit, and then I'm going
>>>> to work
>>>> > on figuring out what kind of training makes sense (corpus-based,
>>>> > self-training via unbound synthesis, etc), and how to evaluate a
>>>> trained
>>>> > search.
>>>> >
>>>> > Cheers,
>>>> > Evan
>>>> >
>>>> > --
>>>> > You received this message because you are subscribed to the Google
>>>> Groups
>>>> > "Barliman Editor" group.
>>>> > To unsubscribe from this group and stop receiving emails from it,
>>>> send an
>>>> > email to [email protected].
>>>> > To post to this group, send email to [email protected].
>>>> > To view this discussion on the web visit
>>>> > https://groups.google.com/d/msgid/barliman-editor/CAEE2%2B5A
>>>> YrW-m2%2BvO6QZ6Uu2EJkyPOtePEXto9g2NSaxL-8g1Zg%40mail.gmail.com.
>>>> > For more options, visit https://groups.google.com/d/optout.
>>>>
>>>>
>>>
>>>
>> --

-- 
You received this message because you are subscribed to the Google Groups 
"minikanren" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/minikanren.
For more options, visit https://groups.google.com/d/optout.
