My feeling is that the current implementations of stacks are not
adequate:
- the control stacks store too many registers at once;
- the generic stack is typed, so it is slow;
- none of these stacks provide any support for register
  spilling/reload: there is no opcode to get or set the
  n-th element of a stack.

So, I propose to replace them by a single stack.  Here is a design
proposal.  What do you think about it?

-- Jerome

Description
===========

A stack frame has the following structure:

    ---+---+-------------+--------------+---+---+---
       |(1)|      (2)    |     (3)      |(4)|(5)|
    ---+---+-------------+--------------+---+---+---
            ^
            |
            stack pointer

  (1) frame description (1 stack entry)
  (2) function arguments
  (3) local data
  (4) return address (1 entry)
  (5) saved stack pointer (1 entry)
      (point to the beginning of (2))

The stack pointer points to the beginning of the function arguments of
the topmost frame.

The frame description is a pointer to a structure containing:
- the number of arguments of each type,
- the type of the local data
- the size of the frame
- the amount of stack needed by the function (including the space
  needed for them arguments of the functions that may be called from
  this function)
- possibly some other information, if needed

We have the following operations (I use some pseudo C code to explain
them):
- Function call: call
  P0 contains the function closure, and I0 a pointer to a description
  of the arguments.
      size = ((frame_desc *) st[-1])->size;
                                 /* get the current frame size */
      st[size - 3] = pc + 2;     /* save the return address */
      st[size - 2] = st;         /* save the stack pointer */
      st += size;                /* move to the next stack frame */
      pc = ((closure *)P0)->pc;  /* jump to the function address */
- Function tail call: tailcall
  P0 contains the function closure, and I0 a pointer to a description
  of the arguments.
      pc = ((closure *)P0)->pc;  /* jump to the function address */
- Function initialization: frame <desc>
  <desc> points to a frame description.
      /* Check the arguments number and types (compare the
         information in I0 and <desc>) */
      /* Check whether there is enough room on the stack for the local
         data and the arguments of the functions this function may
         call; if not, then allocate a new stack chunk */
      /* Fill the frame with GC-safe values */
      st[-1] = <desc>            /* store the frame description */
- Function return : ret
      pc = st[-3];               /* jump to the return address */
      st = st[-2];               /* restore the stack pointer */
- Stack read access : load <reg>, <pos>
      <reg> = st[<pos>]
- Stack write access : store <pos>, <reg>
      st[<pos>] = <reg>

GC
==
The GC can walk the stack using the saved stack pointer:
   st = st[-2];  /* Move to the previous stack frame */
It can use the frame description (st[-1]) to know which stack entry
need to be followed.

When performing a function call, we may need to put some arguments on
the stack, beyond the end of the frame.  Also, for tail calls, we are
going to overwrite some function arguments and maybe some local data.
We should be careful that the GC is not triggered at this point.
For instance, we can first write the function arguments to the local
data, and only afterwards move them to their final location.

A better design would be to have a table which indicates for each
return address which entries of the corresponding frame need to be
followed by the GC.  Then, we don't need to store frame descriptions
on the stack, nor to initially fill the frame with GC-safe values.

Debugging
=========
We could have a debugging version of the opcodes, that make some
sanity checks:
  - we don't write out of the frame;
  - we don't write a value of some type and then reload it with
    another type;
  ...
The opcode version could be selected during bytecode fixup.

Floats
======
On a lot of architectures, they are twice as large as integers and
pointers.  So, we have two options:
- make the stack entries large enough for floats (this wastes memory)
- use to stack entries to store floats (we then need to be careful
  with memory alignement)

Reply via email to