On Tue, 2007-10-02 at 11:33 +0200, Tom wrote:
> I am writing this mail to you only. 

replying also to Felix list since the information may be useful
there (but not to Ocaml list!)


> But, now I read that you have abandoned heap and use the stack
> instead? How so? Or am I misinterpreting your message?

The abstract model uses the heap for a spaghetti stack.

The implementation does too, it is not abandoned.

Functions always did use the machine stack 
for return addresses for C compatibility, however the rest
of the data (local variables etc) goes on the heap by default.

Procedures use the heap for the return address (it is part 
of the frame object).

However the *optimiser* tries to calculate when it can
go with no stack at all -- by inlining a call, for example --
or when it is safe to use the machine stack .. as well
as other optimisations.

The physical execution model includes

        * heap call and heap apply
        * stack call and stack appply
        * C function call and apply
        * goto


For example in Ackermann's function, the function is reduced
from heap object to stackable, and then down to a plain old
C function, with the tail-recursive calls done using a goto.

So this Felix:

fun ack(x:int,y:int):int =>
  if x == 0 then y + 1
  elif y == 0 then ack(x-1, 1)
  else ack(x-1, ack(x, y-1))
  endif
;

Turns into this C++ code:

//C FUNC <4781>: ack
int FLX_REGPARM ack( int x, int y){
    start_5719:;
      ifnot(x == 0 ==1) goto _5714;
      return y + 1 ;
    _5714:;
      ifnot(y == 0 ==1) goto _5715;
      y = 1;
      x = x - 1 ;
      goto start_5719;
    _5715:;
      y = ack(x, y - 1 );
      x = x - 1 ;
      goto start_5719;
}

And this is almost 2x faster than the C code:

int Ack(int M, int N) {
  if (M==0) return N +1;
  else if(N==0) return Ack(M-1,1);
  else return Ack(M-1, Ack(M,N-1));
}

with gcc 4.1.x on AMD64.  The latest SVN Felix also provides
gcc branch prediction hints automatically for loops.

you will note 2 of the 3 calls in Ackermann are tail calls,
and Felix reuses the frame by implementing the call with
assignments and a goto. The recursive call actually calls
the function, but it is a C function so it is fast.

Without tail optimisation, this function is not computable
beyond about y=8, ditto if the heap is used: the overhead
is just too high. The performance is linear in the
number of machine words stored per call, with different
overheaps for heap, C++ class on stack, and C function calling
protocols.

It seems by luck, the Felix C code minimises the number of
words on the stack: the minimum for a dumb compiler is 3 words:
x,y and the machine return address. On AMD64 the assembler is
hard to read because gcc unrolls the recursion 3 times,
for both the Felix and C versions -- however it does a better
job with the Felix generated code.

The reason the machine word count is critical here is that
the amount of cache used is NWORDS * recursion depth.

So .. the design is to use a simple heap based spaghetti stack,
but to perform as many optimisations as possible to reduce
the need to actually use it.

Another example:

        x = 1,2,3

in theory, the rhs is a tuple object which is assigned to x.
But the optimiser notes this assignment, which semantically
assigns the components in parallel, can safely be done
sequentially, and implements this instead:

        x[0]=1; x[1]=2; x[2]=3;

This works even with

        x,y,z = y,x,z;

Felix calculates how many temporaries are needed and a good 
order of assignment, so the result is semantically equivalent
to a parallel assignment, but faster.

Parallel assignment optimisation is also used in tail recursive
call optimisation, including in the Ackermann function example.

Another example: Felix currently has an aggressive "de-optimisation"
that converts functions to procedures. It does this in preparation
for allowing service calls which are needed for channel IO.
Functions can't do channel I/O at the moment, *because* they use
the machine stack for the return address -- and service calls
are implemented by yielding control to the driver with a 
C++ 'return this' statement.


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language

Reply via email to