Hello,

Since it seems that there is growing activity around applying machine
learning to minikanren at various levels, Will asked if I would share the
approach I have been taking. To the extent that there is broader interest
in mk machine learning, it probably makes sense for interested parties to
be in touch.

Background:

Currently, I have what seems like a complete implementation of at least
one approach to integrating neural nets with mk search. However, some
open questions remain, such as what to train it on and how to evaluate
it once trained. Moreover, it seems likely that there are entirely
different approaches that may or may not be related to this one. In any
case, this approach should be fairly easy to implement with minor changes
to any standard minikanren implementation, so if anyone is interested in
integrating any part of it with an existing mk program, such as the
reference implementation of Barliman, I can describe it in more detail.
This email will just be a basic overview.

The goal of this approach is to derive a general way to turn a minikanren
program into a machine learning problem, so that any application developed
in minikanren can also implicitly be trained. Although I had
Barliman/program synthesis in mind as a prime example, I did not use any
Barliman-specific knowledge, so that the system would be applicable to
Barliman as well as to other programs. For the sake of explanation,
however, I will describe the particular setup in terms of how it works in
Barliman.

Overview:

As a simplification, I decided to set up the problem in three stages:
1) Barliman would be run as normal, and each returned package would
contain, in addition to the substitution and constraint store, an
"execution trace" representing the choices made in constructing that
particular package. Similar programs should have similar execution traces,
which is the basis for learning. Given a corpus of programs, this would
take the form of running Barliman once on each fully ground program, so
that we can recover the trace representing that program. Given no corpus,
we can use Barliman to just generate a bunch of programs and use those for
training, although in that case we need to make sure we have a way to
distinguish good programs from bad ones so that we can set the rewards
appropriately.
2) Outside of minikanren, a machine learning model (in my case, a simple
RNN) is trained on the program traces, which are tree structures, although
I flatten them to sequences for training.
3) The model is fed back into subsequent runs of the minikanren program,
and its predictions are used to guide the search.
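To make stage 2 a bit more concrete: flattening a tree-structured trace
into a sequence for the RNN might look like the following sketch. (Python
purely for illustration — my implementation is in Smalltalk — and the trace
ids and tree shape here are hypothetical.)

```python
def flatten_trace(trace):
    """Flatten a tree-structured execution trace into a flat sequence
    of goal ids (depth-first) for sequence-model training."""
    out = []
    for item in trace:
        if isinstance(item, list):
            out.extend(flatten_trace(item))
        else:
            out.append(item)
    return out

# A hypothetical tree trace: a shared prefix, then two merged branch suffixes.
tree = ["eval-conde-2", "lookup-fresh",
        [["lookup-fail-conde"], ["lookup-success-conde"]]]
flat = flatten_trace(tree)
# -> ["eval-conde-2", "lookup-fresh", "lookup-fail-conde", "lookup-success-conde"]
```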

The overall effect is that the model represents a bias over the search
space, but does not update live during the search. There are a number of
reasons for this, all of which have to do with my inability to arrive at a
satisfactory interpretation of what success and failure mean in the middle
of a given search. I suspect that, depending on the problem, a model that
updates live during the search would be a valid approach, and possibly
easily reachable from this implementation, although for the moment I am
not pursuing it. I will note, however, that this "bias" can be a fairly
flexible concept per search. E.g., one could imagine a bias over all Scheme
programs in Barliman that down-weighted lambdas full of unused arguments
and other frequently vestigial components of synthesized programs, but one
could also imagine a bias tuned to the specific input/output constraints,
so that it weighted cons more highly if the output was a list vs a number,
etc.

Implementation:

Since I implemented this in Smalltalk, there are probably a few
implementation details that will differ in the lisp/scheme variants. I will
try to give a general sketch of what I think such an implementation would
look like. Roughly, it will require two main changes to the basic mk
implementation:

1) Forms that delay the search, like fresh and conde, should have unique
global ids and should add them to the traces of the packages they process.

This may be easier in the scheme/lisp implementations, where you have a
clear macro expansion time and a module boundary that can contain a global
counter, although maybe not. The goal is that each package, as it passes
through the goals, will accumulate a history of all the goals it has seen.
Eg, an answer that represents a variable lookup on the third variable in
the environment might look something like a less human-readable version
of: '(eval-conde-2 lookup-fresh lookup-fail-conde lookup-fail-conde
lookup-success-conde). It is important that the same symbol be given by the
"same" goal within a recursive step, so taking the second clause in the
main eval conde should always add eval-conde-2 to the trace.

One extra potential complication here is the handling of conjunctions of
more than one conde. This may not be necessary for a basic implementation.
If we think of the interpreter as defining an implicit grammar over the
execution traces of the program, eg:

Eval => Quote
Quote => quoted-value
Eval => Lookup
Lookup => lookup-failure Lookup
Lookup => lookup-success

then we can recognize that the trace may not, in general, be a list. For an
interpreter with only lookup and quote, it will be, because these rules are
both regular, and so a sequence of trace id's uniquely defines the "parse"
of grammar rules used by the interpreter to arrive at its final package.

Once we add in rules like list:

ProperList => empty-list
ProperList => destructure-list Eval ProperList

We can see that the "grammar" may potentially be context free. Of course,
in reality, even calling it context free is a modeling approximation. In
some minikanren programs, this will never come up (either because all the
"rules" are regular, or the "cfg" is unambiguous given a sequential trace
anyway), and in some machine learning models, this may not really matter,
if the model is only using information from the sequence and not from the
full hierarchy. Still, all that is needed to capture the extra information
is that fresh/conde forms that contain multiple condes should create bind
forms that essentially take the sequential traces from each conde and merge
them together up to the end of their common prefix to create a
tree-structured trace. Eg: '(eval-conde-3 proper-list-fresh
proper-list-nonempty-conde ((eval-conde-2 ...) (proper-list-fresh ...)))
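A sketch of that prefix-merge, assuming sequential traces are plain lists of
ids (Python for illustration only; the ids here are hypothetical):

```python
def merge_traces(traces):
    """Merge sequential traces from sibling condes into one tree:
    keep the common prefix flat, then branch into the distinct suffixes."""
    prefix = []
    for items in zip(*traces):
        if all(x == items[0] for x in items):
            prefix.append(items[0])
        else:
            break
    k = len(prefix)
    suffixes = [t[k:] for t in traces if len(t) > k]
    return prefix + ([suffixes] if suffixes else [])

merged = merge_traces([
    ["eval-conde-3", "proper-list-fresh", "eval-conde-2", "quoted-value"],
    ["eval-conde-3", "proper-list-fresh", "proper-list-fresh", "empty-list"],
])
# -> ["eval-conde-3", "proper-list-fresh",
#     [["eval-conde-2", "quoted-value"], ["proper-list-fresh", "empty-list"]]]
```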

2) The interleaving streams of closures should be reified as structs so
that the order of the search can be controlled by the trained model.

This is a straightforward translation of the stream logic into a reified
form of the same stream logic. Every time the package is extended with a
new trace id, the model is run on the current trace and the package is
tagged with a score. When packages are mplused together, mplus checks which
branch has the max score, and interleaves that branch first, caching the
score in the new mplus struct for further checking higher up the tree. If
the scores are the same, it interleaves as normal, so the initial run
should use a dummy model that scores everything 0. It is also possible to
choose probabilistically, etc., instead of always taking the max.
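A toy sketch of that reified, score-aware mplus (hypothetical Answer/MPlus
structs in Python; the real implementation is in Smalltalk and the stream
logic is of course richer than this):

```python
class Answer:
    """A leaf of the reified stream: one package plus its model score."""
    def __init__(self, package, score):
        self.package = package
        self.score = score

class MPlus:
    """A reified mplus node that orders its branches by cached model score."""
    def __init__(self, left, right):
        # Interleave the higher-scoring branch first; on a tie, keep the
        # original order, so a dummy model scoring everything 0 recovers
        # plain interleaving search.
        if right.score > left.score:
            left, right = right, left
        self.left, self.right = left, right
        # Cache the winning score for further checks higher up the tree.
        self.score = left.score

node = MPlus(Answer("pkg-a", 0.2), Answer("pkg-b", 0.9))
node.left.package   # -> "pkg-b" (explored first)
node.score          # -> 0.9
```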

I've definitely left out some implementation details that would be useful
for actually making this work, but I think I've said enough already, so I
can go into more detail if anyone wants to hear it. Just let me know. Also,
I think there is a plan to talk about this more in the advanced hangout on
Sunday? Will can correct me if I'm wrong. So that would be a good time to
talk about it if this is interesting to anyone.

Currently, I'm refactoring the above code a bit, and then I'm going to work
on figuring out what kind of training makes sense (corpus-based,
self-training via unbound synthesis, etc), and how to evaluate a trained
search.

Cheers,
Evan

-- 
You received this message because you are subscribed to the Google Groups 
"minikanren" group.
Visit this group at https://groups.google.com/group/minikanren.
