Re: [Caml-list] Optimizing symbolic processing code

Hugo Ferreira Sat, 17 Jan 2009 01:28:58 -0800

Andrej,

First and foremost thanks for taking the time to answer.


Andrej Bauer wrote:

After being so bad spirited in my last message, I decided to make it
up by doing something positive. I have added to the PL Zoo a mini

prolog interpreter, see http://andrej.com/plzoo/ .


Interesting. I had visited this page for a quick look and never
realized (or don't remember) seeing any Prolog interpreter.

It is very slow and


Maybe be slow, but it is very clear and concise.

I am sure a decent implementation would speed it up by an order of
magnitude (at least a 100 fold).


Indeed this may be possible.

I wonder how your implementation
compares to mine.


I will go through your code in the order it is available:

1) First thing I noticed is that you have very few type of terms.
   In my case I add parsing and unification of integers, strings,
   lists, negation (not), etc. I also added some extras because
   I use the same parser to manipulate first order logic
   sentences used  by a AI planner.

2) Your parsing doesn't construct a symbol table. So this means
   that all comparisons are string based. In my case all
   comparisons are integer based i.e: everything from a predicate
   symbol to a constant is an integer.

3) Your variable bank is based on a list. This means that any
   look-up requires linear time. I use arrays for this. This
   has effects on unification.

4) Related to the above I use a Union-Find implementation
   (See http://www.lri.fr/~filliatr/ for the implementation
    I use) to bind variables. I have also experimented with
   another data structure for this, but this implementation
   is simple and fast although mutable.

5) Your unification algorithm looks like a standard (at-worst)
   quadric order unifier, which is not too bad. However you
   use your linear list of substitutions (3+4). What is more
   your occurs check is done on every variable - term binding.
   I on the other hand, use a near-linear algorithm using fast
   union-find and do a occurs check only at the end for only
   those terms bound to variable and again using the U-F data
   structure.

6) Ok, this one is the one that seems to be killing my application.
   You use a very simple database, its basically a list of
   assertions. Any look-up is linear in respect to the number of
   assertions, which means that resolution of a goal is exponential
   in respect to the number of assertions and number of goals. I
   use a discriminant trie whose search is linear in respect to
   the number of goal (as opposed to the assertions).

7) Before unification you take care of performing variable
   renaming. This has to be repeated every time you use a
   predicate. I use clauses represented in a canonical form.
   Renaming for me is simply a matter of bumping a counter of
   variables in the variable bank and attaching this offset to
   the terms in question. I also "reuse" variables because of
   the way the variable bank is implemented.

8) Your algorithm is a very clean implementation of SLDNF.
   You keep a stack of sorts and allow one to continue search
   for the next goal. One of my implementations did this however
   keeping track of the position in a trie resulted in complicated
   code. I now use simple folds over the data structure. A
   function is invoked whenever a solution is found. If only one
   solution is required then a quick exit is performed via an
   exception. In your case your implementation is simpler because
   it uses only lists.

Ok, in respect to your first response:

> Judging from what your responses, the most probable explanation for
> inefficiency is that you implemented your prolog interpreter badly.

Possibly yes, but from what I have explained above you can see I have
taken pains to have a decent intepreter.

> It would help a lot if you just showed us your code.

True. But the problem is I did not know what was killing performance,
hence the request for more information on how to diagnose the problem.
Only then would I analyse further and ask for help with more details.

I have compiled and executed the code with profiling. "grpof" shows
that about 8-10 % of the time is spent on folding over the trie (I
use map folds and finds, why are map folds taking so long?). In other
words it is not an issue on unification or the resolution function but
the search in the trie. I also find calls to caml currying functions.
This seems to point me to [1] for solutions.

I am going to make some additional experiments in order to diagnose
the problem further.

One simple question: is Ocaml matching fast enough that I need
not worry with:

a) the number of variants in a type
b) comparisons of the sort:

and bind_all_var_to_any f h ps t delay acc =
  (* k a b -> b *)
  let scan k e acc =
    match e with
    | Leaf _ -> acc
    | Node(Pred.Rel _,ps',vs',jps',jvs') ->
            (* jump over ptedicate *)
            fold_all f jps' jvs' t delay acc

    | Node(_,ps',vs',jps',jvs') ->
            fold_all f ps' vs' t delay acc
   in
   Node_map.fold (scan) ps acc


Once again, appreciate any comments on the above.

Regards,
Hugo F.

[1] http://ocaml.janestreet.com/?q=node/30

Best regards,

Andrej

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Re: [Caml-list] Optimizing symbolic processing code

Reply via email to