I may be approaching semi-radical territory here with this idea.

I've read all the FAQs and reasons on why we choose a register
architecture versus a stack architecture. However, I recently thought of a
combination idea which, (although it was probably discovered in the
70s sometime,) I think provides the best of both worlds, and would like
to propose it before I shove it off to the dust-bin.

Problems with register architecture:

with caller-save, we're saving 0.5KB (4types*32registers*4bytes) of data
per function call, which might add up with deeply-nested functions.

If we want more than 32 elements, we need to start doing stack-pushing to
get around limitations. One thing that I wanted to do in a regex
implementation was use the full set of registers to store off certain
points in the compiled regex. With 32 registers, longer regexes will
require stack pushing at a certain point, and that will make a certain
transition of the regex slower than the other portions of the regex.


Problems with stack architecture:

time spent pushing/popping ops (also happens with parrot registers if we
use >32 elements)

stack grows at runtime


Now, my proposal is simply that instead of hardcoding to 32 elements, we
allow the function to determine the number of elements in each register
type. This gives us:

each function uses a minimal of space, since it only uses as many
registers as it requires

leaf accessor functions have 0 or 1 registers for the most part, so we
don't need to allocate a whole new set of registers for them.
Caller-save is just pushing the current register frame onto the stack,
and allocating a properly-sized register frame for the current leaf
function, which is very small. Should be efficient.

the register stack only gets used for functions. For the most part,
functions allocate all the space they know they'll need, and that's it.
(Of course, I suppose there are probably legit reasons for functions to
use the register stack frames, however).

We can use more than 32 elements. This means a 120-node regex can allocate
120 int registers for it's operation and execution. (Not saying that it's
1:1, more like 1:1 for non-greedy regex ops only, but still....)

No need to worry about register liveness/allocation to make things work.
We now only need to do it if we want to make things run faster by using
*less* than one register per C/perl stack variable.


It's probably a bit late in the parrot development cycle to be propsing an
idea like this, but I suppose since this idea couldn't have been
implemented until we had functions which could store register
requirement
information, this might actually be a good time to suggest it. :)


Thoughts or comments? Thanks,
Mike Lambert

Reply via email to