Hello,

Since it seems that there is growing activity around applying machine learning to minikanren at various levels, Will asked if I would share the approach I have been taking. To the extent that there is broader interest in mk machine learning, it probably makes sense for interested parties to be in touch.
Background:

Currently, I have what seems like a complete implementation of at least one approach to integrating neural nets with mk search. However, some open questions remain, such as what to train it on and how to evaluate it once trained. Moreover, it seems likely that there are entirely different approaches that may or may not be related to this one. In any case, this approach should be fairly easy to implement with minor changes to any standard minikanren implementation, so if anyone is interested in integrating any part of it with an existing mk program, such as the reference implementation of Barliman, I can describe it in more detail. This email is just a basic overview.

The goal of this approach is to derive a general way to turn a minikanren program into a machine learning problem, so that any application developed in minikanren can also implicitly be trained. Although I had Barliman/program synthesis in mind as a prime example, I did not use any Barliman-specific knowledge, so the system should be applicable to Barliman as well as to other programs. For the sake of explanation, however, I will describe the particular setup in terms of how it works in Barliman.

Overview:

As a simplification, I decided to set up the problem in three stages:

1) Barliman is run like normal, and each returned package contains, in addition to the substitution and constraint store, an "execution trace" representing the choices made in constructing that particular package. Similar programs should have similar execution traces, which forms the basis of learning. Given a corpus of programs, this would take the form of running Barliman once on each fully ground program, so that we can recover the trace representing that program.
Given no corpus, we can use Barliman to just generate a bunch of programs and use those for training, although in this case we need a way to distinguish good and bad programs so that we can set the rewards appropriately.

2) Outside of minikanren, a machine learning model (in my case, a simple RNN) is trained on the program traces. The traces are tree structures, although I flatten them to sequences for training.

3) The model is fed back into subsequent runs of the minikanren program, and its predictions are used to guide the search.

The overall effect is that the model represents a bias over the search space, but does not update live during the search. There are a number of reasons for this, all of which have to do with my inability to arrive at a satisfactory interpretation of what success and failure mean in the middle of a given search. I suspect that, depending on the problem, a model that updates live during the search would be a valid approach, and possibly easily reachable from this implementation, although for the moment I am not pursuing it. I will note, however, that this "bias" can be a fairly flexible concept per search. E.g., one could imagine a bias over all scheme programs in Barliman that down-weighted lambdas full of unused arguments and other frequently vestigial components of synthesized programs, but one could also imagine a bias tuned to the specific input/output constraints, so that it weighted cons more highly if the output was a list rather than a number, etc.

Implementation:

Since I implemented this in Smalltalk, a few implementation details will probably differ in the lisp/scheme variants. I will try to give a general sketch of what I think such an implementation would look like. Roughly, it requires two main changes to the basic mk implementation:

1) Forms that delay the search, like fresh and conde, should have unique global ids, and should add them to the trace of the packages that they process.
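To make this first change concrete, here is a minimal Python sketch of the idea (the actual implementation is in Smalltalk, and everything here — Package, traced, the toy goals — is my own illustrative naming, not code from it): a package carries a trace alongside its substitution, and wrapping a goal gives it a unique global id that it stamps onto every package that flows through it.

```python
# Hypothetical sketch of trace accumulation; all names are illustrative.
from itertools import count

_goal_ids = count()  # stands in for a module-level / expansion-time counter


class Package:
    """A package: substitution plus the history of goal ids it has seen."""
    def __init__(self, subst=None, trace=()):
        self.subst = dict(subst or {})
        self.trace = tuple(trace)


def traced(name, goal):
    """Give `goal` a unique global id and record it in each result's trace."""
    gid = f"{name}-{next(_goal_ids)}"

    def wrapped(pkg):
        stamped = Package(pkg.subst, pkg.trace + (gid,))
        yield from goal(stamped)

    return wrapped


# Two toy "goals" to run packages through:
def succeed(pkg):
    yield pkg


def bind_var(var, val):
    def goal(pkg):
        yield Package({**pkg.subst, var: val}, pkg.trace)
    return goal
```

Running a package through two traced goals then leaves a trace like `("lookup-fresh-0", "lookup-success-conde-1")` on the result, which is the "history of all the goals it has seen" in miniature.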
This may be easier in the scheme/lisp implementations, where you have a clear macro expansion time and a module boundary that can contain a global counter, although maybe not. The goal is that each package, as it passes through the goals, accumulates a history of all the goals it has seen. E.g., an answer that represents a variable lookup on the third variable in the environment might look something like a less human-readable version of:

'(eval-conde-2 lookup-fresh lookup-fail-conde lookup-fail-conde lookup-success-conde)

It is important that the same symbol be given by the "same" goal within a recursive step, so taking the second conde in the main eval conde should always add eval-conde-2 to the trace.

One extra potential complication here is the handling of conjunctions of more than one conde. This may not be necessary for a basic implementation. If we think of the interpreter as defining an implicit grammar over the execution traces of the program, e.g.:

Eval => Quote
Quote => quoted-value
Eval => Lookup
Lookup => lookup-failure Lookup
Lookup => lookup-success

then we can recognize that the trace may not, in general, be a list. For an interpreter with only lookup and quote it will be, because these rules are both regular, and so a sequence of trace ids uniquely defines the "parse" of grammar rules used by the interpreter to arrive at its final package. Once we add rules like list:

ProperList => empty-list
ProperList => destructure-list Eval ProperList

we can see that the "grammar" may potentially be context-free. Of course, in reality, even calling it context-free is a modeling approximation. In some minikanren programs this will never come up (either because all the "rules" are regular, or because the "cfg" is unambiguous given a sequential trace anyway), and in some machine learning models it may not really matter, if the model only uses information from the sequence and not from the full hierarchy.
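Since the traces are trees but the RNN consumes sequences, some flattening step is needed. Here is one hypothetical way to do it in Python (this is my own sketch, not the author's code; the bracket tokens "(" / ")" for preserving hierarchy in the flat form are an assumption):

```python
# Hypothetical sketch: depth-first flatten a tree-structured trace into a
# token sequence, marking subtree boundaries with "(" / ")" tokens so a
# sequence model can in principle still see the hierarchy.
def flatten_trace(trace):
    tokens = []
    for item in trace:
        if isinstance(item, (list, tuple)):
            tokens.append("(")
            tokens.extend(flatten_trace(item))
            tokens.append(")")
        else:
            tokens.append(item)
    return tokens
```

E.g., a tree trace like `["eval-conde-3", "proper-list-fresh", "proper-list-nonempty-conde", [["eval-conde-2"], ["proper-list-fresh"]]]` flattens to a single token list with explicit subtree brackets around the two merged branches. Dropping the bracket tokens instead would give the purely sequential view mentioned above, for models that only use sequence information.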
Still, all that is needed to capture the extra information is that fresh/conde forms containing multiple condes should create bind forms that take the sequential traces from each conde and merge them together up to the end of their common prefix, producing a tree-structured trace. E.g.:

'(eval-conde-3 proper-list-fresh proper-list-nonempty-conde ((eval-conde-2 ...) (proper-list-fresh ...)))

2) The interleaving streams of closures should be reified as structs so that the order of the search can be controlled by the trained model. This is a straightforward translation of the stream logic into a reified form of the same stream logic. Every time the package is extended with a new trace id, the model is run on the current trace and the package is tagged with a score. When packages are mplused together, mplus checks which branch has the max score and interleaves that branch first, caching the score in the new mplus struct for further checking higher up the tree. If the scores are the same, it interleaves like normal, so the initial run should use a dummy model that just scores everything with a 0. It is also possible to choose probabilistically, etc., instead of using the max.

I've definitely left out some implementation details that would be useful for actually making this work, but I think I've said enough already, so I can go into more detail if anyone wants to hear it. Just let me know. Also, I think there is a plan to talk about this more in the advanced hangout on Sunday? Will can correct me if I'm wrong. That would be a good time to talk about it if this is interesting to anyone.

Currently, I'm refactoring the above code a bit, and then I'm going to work on figuring out what kind of training makes sense (corpus-based, self-training via unbound synthesis, etc.), and how to evaluate a trained search.

Cheers,
Evan

--
You received this message because you are subscribed to the Google Groups "minikanren" group.
Visit this group at https://groups.google.com/group/minikanren.
