On 19/11/2010, at 10:10 AM, Erick Tryzelaar wrote:

> On Thu, Nov 18, 2010 at 2:06 PM, john skaller
> <skal...@users.sourceforge.net> wrote:
>> 
>> I've tried hash-consing: slows the program down 2x.
> 
> I've heard that can happen. 

> Which mentions getting good results with memoization. I was hoping
> we'd have similar opportunities.

We do, and they're used: the ticache caches types of functions,
which is the primary type needed to compute the return types of
other functions and do overload resolution. We also cache the
environments.

If we wanted to do more caching, the proper way to do it is
to put the bound expressions, types and statements IN
the unbound symbol table.

Originally I did this, but I removed it for two reasons: first,
it prevents separating bound and unbound types because
of Ocaml's lack of type recursion. The two types have to
be declared along with ALL the others in a single file.

But this cannot be done because of another serious fault
in Ocaml: functors require a closed type as an argument.
Unfortunately our type terms include functorised types
inside them.. which is impossible in Ocaml, at least without
some heavy duty tricks.

The problem is you have to instantiate functor manually,
unlike C++ templates which you just "use" and the compiler
collects the instantiations (same in Felix). And these instantiations
cannot occur inside a type .. and .. and .. recursions which means
you cannot use a functorised t inside any of those types if t is
one of the types you're defining.

All in all, Ocaml has a weak type system compared to Felix.
Pretty weird for an academic made programming language
specifically designed to be strongly typed, and where the main
research focus is on the type system.

The problem is that researchers use weak theoretical results
and try to improve them, whereas I just wrote the code to do
what had to be done, if I couldn't make it fly I threw that concept out.

Originally Felix had functors, which are trivial to implement:
they're just bodies inside a scope with some parameters,
and instantiation is done just by filling in the parameters 
and binding the result. Trivial, except by reusing the body
of the functor with different arguments the type changed
which prevented caching the bound types inside the body
in a closed form.

The workaround would be to copy the body every time but I got
lazy, copying symbol definitions is difficult. So I just gave up
caching in the unbound table.

But then I gave up functors too. So the caching could go back
where it belongs if we could work around the deficiencies
in Ocaml's type system.

Only .. now we have type classes and I don't know what impact
that would have.

Also, as it is we could probably add functors back in .. :)

Not sure. Much of the cost in Felix is that EVERY expression
is associated with its type. In most compilers, you don't cache
the types with every expression. You lookup the primitive
leaves and recalculate the types every time you need them.

Frankly, this would probably be better, because when you're doing
things like reductions folding etc, you have to do it for every type
associated with every sub-expression.

Also the way Felix represents type recursion isn't conventional:
instead of

        rec t . 1 + int * t  // aka fix t . ... aka (1 + int * t) as t

for a list, Felix acually uses

        1 + int * fix -2

where the -2 means "put the fixpoint binder up two levels".
This can be hard to work with, you can't just pattern match a type,
because the subcomponents become inconsistent. You have to
unfold the type once before matching to prevent this.

With a normal binder, you know when you need to unfold because
you hit the binder during matching.

The disadvantage of the binder is that it introduces a new named
type variable (which is not your usual type variable).
This make it hard to say, compare types because you have to alpha
convert and keep track of the fixpoint variable names so that you unify
two distinct names if they're both from matching fixpoint binders:
with the Felix encoding there is no need for this: fix -2 matches fix -2.

Anyhow, I have "watched" the lookup routine to see if I can spot any
places where there is repetition which would benefit from caching.

There are certainly such places. For example we could cache the 
bind-expression. The problem is that the key here is an expression
and comparing them for equality for hash-consing is more expensive
than just doing the binding, considering that most searches of the
cache will fail. Also you have to store the environment along with
the expression term.


The objective now is to forget about it: by binding the whole library
and storing the bound symbol tables on disk, the cost of binding
the library is effectively zero.

The big cost is in the optimisation, because that processes
"the whole program" and the user cant reduce the cost.



--
john skaller
skal...@users.sourceforge.net





------------------------------------------------------------------------------
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today
http://p.sf.net/sfu/msIE9-sfdev2dev
_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language

Reply via email to