Andrej,
First and foremost thanks for taking the time to answer.
Andrej Bauer wrote:
After being so bad spirited in my last message, I decided to make it
up by doing something positive. I have added to the PL Zoo a mini
prolog interpreter, see http://andrej.com/plzoo/ .
Interesting. I had visited this page for a quick look and never
realized (or don't remember) seeing any Prolog interpreter.
It is very slow and
Maybe be slow, but it is very clear and concise.
I am sure a decent implementation would speed it up by an order of
magnitude (at least a 100 fold).
Indeed this may be possible.
I wonder how your implementation
compares to mine.
I will go through your code in the order it is available:
1) First thing I noticed is that you have very few type of terms.
In my case I add parsing and unification of integers, strings,
lists, negation (not), etc. I also added some extras because
I use the same parser to manipulate first order logic
sentences used by a AI planner.
2) Your parsing doesn't construct a symbol table. So this means
that all comparisons are string based. In my case all
comparisons are integer based i.e: everything from a predicate
symbol to a constant is an integer.
3) Your variable bank is based on a list. This means that any
look-up requires linear time. I use arrays for this. This
has effects on unification.
4) Related to the above I use a Union-Find implementation
(See http://www.lri.fr/~filliatr/ for the implementation
I use) to bind variables. I have also experimented with
another data structure for this, but this implementation
is simple and fast although mutable.
5) Your unification algorithm looks like a standard (at-worst)
quadric order unifier, which is not too bad. However you
use your linear list of substitutions (3+4). What is more
your occurs check is done on every variable - term binding.
I on the other hand, use a near-linear algorithm using fast
union-find and do a occurs check only at the end for only
those terms bound to variable and again using the U-F data
structure.
6) Ok, this one is the one that seems to be killing my application.
You use a very simple database, its basically a list of
assertions. Any look-up is linear in respect to the number of
assertions, which means that resolution of a goal is exponential
in respect to the number of assertions and number of goals. I
use a discriminant trie whose search is linear in respect to
the number of goal (as opposed to the assertions).
7) Before unification you take care of performing variable
renaming. This has to be repeated every time you use a
predicate. I use clauses represented in a canonical form.
Renaming for me is simply a matter of bumping a counter of
variables in the variable bank and attaching this offset to
the terms in question. I also "reuse" variables because of
the way the variable bank is implemented.
8) Your algorithm is a very clean implementation of SLDNF.
You keep a stack of sorts and allow one to continue search
for the next goal. One of my implementations did this however
keeping track of the position in a trie resulted in complicated
code. I now use simple folds over the data structure. A
function is invoked whenever a solution is found. If only one
solution is required then a quick exit is performed via an
exception. In your case your implementation is simpler because
it uses only lists.
Ok, in respect to your first response:
> Judging from what your responses, the most probable explanation for
> inefficiency is that you implemented your prolog interpreter badly.
Possibly yes, but from what I have explained above you can see I have
taken pains to have a decent intepreter.
> It would help a lot if you just showed us your code.
True. But the problem is I did not know what was killing performance,
hence the request for more information on how to diagnose the problem.
Only then would I analyse further and ask for help with more details.
I have compiled and executed the code with profiling. "grpof" shows
that about 8-10 % of the time is spent on folding over the trie (I
use map folds and finds, why are map folds taking so long?). In other
words it is not an issue on unification or the resolution function but
the search in the trie. I also find calls to caml currying functions.
This seems to point me to [1] for solutions.
I am going to make some additional experiments in order to diagnose
the problem further.
One simple question: is Ocaml matching fast enough that I need
not worry with:
a) the number of variants in a type
b) comparisons of the sort:
and bind_all_var_to_any f h ps t delay acc =
(* k a b -> b *)
let scan k e acc =
match e with
| Leaf _ -> acc
| Node(Pred.Rel _,ps',vs',jps',jvs') ->
(* jump over ptedicate *)
fold_all f jps' jvs' t delay acc
| Node(_,ps',vs',jps',jvs') ->
fold_all f ps' vs' t delay acc
in
Node_map.fold (scan) ps acc
Once again, appreciate any comments on the above.
Regards,
Hugo F.
[1] http://ocaml.janestreet.com/?q=node/30
Best regards,
Andrej
_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs