At 11:17 AM 10/29/2001 -0500, Gregor N. Purdy wrote:
> > > > *) The first five registers (I0-I4, S0-S4, P0-P4, N0-N4) are scratch
> > > > and do not have to be preserved by the callee
> > >
> > >Still thinking about this... We are reducing the overall number of reg
> > >copies going on by adding these special cases. I just wish we had an
> > >approach that was both uniform (simple, no special cases) and fast too.
> >
> > You, and me, and about a zillion other people. Generally speaking the
> > choices are fast, uniform, and scalable. Choose two.
>
>Hmmmm. I tried reading section 29 (Subroutine Linkage) of the MMIXware
>book (pages 32-34) for inspiration, but I didn't see how anything there
>could help us. MMIX has 256 logical general-purpose 64-bit registers.
>That's a handy reg size since a reasonable float can sit in there as
>well as an unreasonable int. The local-marginal-global register
>distinction used by MMIX is interesting, but I think it might lose its
>appeal with 4 distinct typed register files.
>
>Knuth does make the statement:
>
>     These conventions for parameter passing are admittedly a bit
>     confusing in the general case, and I suppose people who use them
>     extensively might sometime find themselves talking about "the
>     infamous MMIX register shuffle." However, there is good use for
>     subroutines that convert a sequence of register contents like
>     (x, a, b, c) into (f, a, b, c) where f is a function of a, b,
>     and c but not x. Moreover PUSHGO and POP can be implemented with
>     great efficiency, and subroutine linkage tends to be a significant
>     bottleneck when other conventions are used.
>
>It's that last sentence that got my attention... But, I still don't
>know if we could make use of any of those ideas. I can imagine
>having separate L and G for each register file, and otherwise
>following the same procedure, but I suspect we'd be unhappy with
>the MMIX conventions without having a larger number of registers.

I'll have to snag that manual next time I'm around a good bookstore. I've 
not read it as of yet, and Knuth generally has good things to say.

A split between local, marginal, and global registers would be an 
interesting thing to do, and I can see it making the code more elegant. I 
worry about it making things more complex, though, especially with us 
already having multiple register types. (We'd essentially double or triple 
the number of register types, and to some extent blow cache even more than 
we do now. Might be a win in other ways, though. I'll have to ponder a bit.)

>BTW, how did you choose 32 for the number of regs?

Picked it out of the air. :)

Seriously, I wanted a power-of-two number, I wanted the resulting size of a 
register file to be equal to or smaller than your average page size (512 
bytes for most folks IIRC), and I wanted to be able to encode the register 
number and type in a single byte if it turned out that the overhead of 
decoding was smaller than the speed hit we took from the extra bus 
bandwidth wasted by using a full 32-bit word for each parameter.

So, the two-bit type limits us to 64 registers max, and that seemed a bit 
too big in the general case. 16 was too few by a bit (most of my compiler 
books say that's not quite enough for most code, and you'll end up with 
overflow to the stack to handle temps), so that left 32. Still a bit big in 
some cases, especially considering we have four full sets of registers, but 
we'll see how that goes.
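
To make the arithmetic concrete, here is a minimal sketch of the single-byte 
encoding, in Python for brevity. The bit layout (type in the top two bits), 
the ordering of the four register files, and the names encode_reg and 
decode_reg are all assumptions for illustration, not anything decided for 
Parrot.

```python
# Hypothetical layout: top 2 bits pick the register file,
# low 6 bits hold the register number.
TYPE_BITS = 2
NUM_BITS = 8 - TYPE_BITS      # 6 bits left over for the register number
MAX_REGS = 1 << NUM_BITS      # 64: the ceiling a two-bit type imposes
NUM_REGS = 32                 # the size actually chosen

# Assumed ordering of the four register files (illustrative only).
REG_TYPES = ("I", "N", "S", "P")


def encode_reg(reg_type, number):
    """Pack a register type and number into a single byte."""
    if not 0 <= number < NUM_REGS:
        raise ValueError("register number out of range: %d" % number)
    return (REG_TYPES.index(reg_type) << NUM_BITS) | number


def decode_reg(byte):
    """Unpack a byte produced by encode_reg into (type, number)."""
    return REG_TYPES[byte >> NUM_BITS], byte & (MAX_REGS - 1)
```

With this layout decode_reg(encode_reg("S", 4)) gives back ("S", 4), and the 
cost of decoding is one shift and one mask per operand.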

>Yeah. I'm trying very hard not to put anything really sophisticated
>into jakoc (at least not yet). Right now I can still tweak things
>reasonably well. If I add much more complexity, I'm going to have
>to actually write a real compiler, and if I write a real compiler I
>probably won't be able to resist the temptation to turn Jako into the
>language I *really* wish I had, and that would be a bigger project.

And this would be a bad thing because? (Well, besides the demands on what 
little free time you might have now, but that's not our problem... :)

> > > > *) The callee is responsible for making sure the stack is cleaned off.
> > >
> > >So, in the case of zero args, do we still push a zero on the stack to
> > >make a proper frame? I think yes...
> >
> > If the function is listed as taking a variable number of args, yes.
> > Functions marked as taking no args at all don't get anything put on the
> > stack.
>
>I'm thinking yes because of stack unwinding. Don't we need to have
>parity between return addresses on their stack and frames of args
>on their stack?

Sort of. The only place we really need to have it is for the exception 
handling, which needs to quickly unwind the register stacks, but I'm 
thinking we'll push the addresses of the current register files when we 
push an exception handler, and restore them (along with the stack) when we 
catch an exception.
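
A rough sketch of that idea, in Python for brevity. The Interp class, its 
fields, and the push_handler/throw names are hypothetical; object references 
stand in for the addresses of the register files, and the stack is restored 
by trimming it back to its saved depth.

```python
class Interp:
    """Toy interpreter state; every name here is hypothetical."""

    def __init__(self):
        # One current frame per register file; a call may swap in a new one.
        self.regs = {t: [None] * 32 for t in "INSP"}
        self.stack = []       # general-purpose save/arg stack
        self.handlers = []    # pushed exception handlers

    def push_handler(self, target):
        # Remember which register frames are current (a shallow copy keeps
        # references to the frames themselves) and how deep the stack is.
        self.handlers.append((target, dict(self.regs), len(self.stack)))

    def throw(self):
        # Unwind in one step: restore the saved frames and trim the stack.
        target, saved_regs, depth = self.handlers.pop()
        self.regs = saved_regs
        del self.stack[depth:]
        return target
```

The point of saving the frames with the handler is that catching an exception 
does not have to walk back through every intervening call.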

>Oh wait. We're popping (restoring) those off the stack
>on subroutine entry, so in general the arg stack should be empty
>most of the time, right?

I don't know that it'll be empty all the time, as it's a general-purpose 
save stack in addition to being an arg stack. There may be some overflow 
data on it, and some languages may choose to use it for most of their work 
because they find it fits their models better.

>Adding to that the fact that most of the
>time our args and results will be passed in regs, and I guess I
>can see that we won't need it. Except for the variable arg and variable
>result case, which is what Jako does today.

Keen. Parrot-level vararg stuff will probably be reasonably uncommon, but 
we'll see how it goes.
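
For the vararg case discussed above, a minimal sketch in Python. It assumes 
the count is pushed last so it sits on top of the arguments; that ordering 
and the names call_varargs and sum_all are illustrative only.

```python
def call_varargs(stack, callee, args):
    # Caller side: push the arguments, then the count,
    # even when the count is zero.
    stack.extend(args)
    stack.append(len(args))
    return callee(stack)


def sum_all(stack):
    # Callee side: pop the count, then exactly that many arguments,
    # so the stack is left as the caller found it.
    count = stack.pop()
    total = 0
    for _ in range(count):
        total += stack.pop()
    return total
```

Anything already on the stack (saved values, overflow data) is untouched, 
since the callee only removes what the count tells it to.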

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
