Uri Guttman <[EMAIL PROTECTED]> writes:
>  NI> i.e. 
>  NI>      R4 = frame[N]
>  NI> is same cost as
>  NI>      R4 = per_thread[N]
>  NI> and about the same as
>  NI>      extern REGISTER GlobalRegs4     
>  NI>      R4 = GlobalRegs4;
>
>well, if there is no multithreading then you don't need the per_thread
>lookup. 

Well:
 (a) I thought the plan was to design threads in from the beginning this time.
 (b) I maintain that the cost is about the same as global variables anyway.

The case for (b) is as follows.
On RISC hardware

    R4 = SomeGlobal;

becomes two instructions:

    loadhigh SomeGlobal.high,rp 
    ld rp(SomeGlobal.low),R4

The C compiler will try to factor out the loadhigh instruction, leaving
you with an indexed load. In most cases 

    ld rp(RegBase.low+4),R4

is just as valid and takes the same number of cycles, and there is normally
a form like

    ld rp(rn),R4

which allows indexing by a variable amount.
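
To make the three addressing forms concrete, here is a minimal C sketch.
All names (vm_reg, vm_frame, load_global, load_fixed, load_indexed) and the
16-register frame size are hypothetical, made up for illustration, and are
not from any actual design. What a given compiler emits will vary; the point
is only that each function should reduce to roughly one load once the base
address is in a machine register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; an illustration, not a proposed API. */
typedef long vm_reg;

enum { VM_NREGS = 16 };            /* assumed frame size */

vm_reg GlobalRegs4;                /* a VM register kept as a plain global */

struct vm_frame {
    vm_reg reg[VM_NREGS];          /* VM registers held in memory */
};

/* Global: the address is a link-time constant (loadhigh + ld on RISC). */
vm_reg load_global(void)
{
    return GlobalRegs4;
}

/* Base + constant offset: the ld rp(RegBase.low+4),R4 form. */
vm_reg load_fixed(const struct vm_frame *frame)
{
    return frame->reg[4];
}

/* Base + variable index: the ld rp(rn),R4 form. */
vm_reg load_indexed(const struct vm_frame *frame, size_t n)
{
    assert(n < VM_NREGS);          /* guard against indexing off the frame */
    return frame->reg[n];
}
```

Compiling this with optimisation and reading the assembler output is an
easy way to check the claim on whatever hardware you care about.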


On CISC machines, either there is an invisible RISC underneath (e.g. Pentium)
which behaves as above, or you get something akin to the PDP-11, where
indirection reads a literal address via the "program counter".

    move [pc+n],r4

In such cases 

    move [regbase+n],r4 

is going to be just as fast - the issue is the need for a (real machine)
register to hold 'regbase'.

>and the window base is not accounted for. you would need 2
>indirections, the first to get the window base and the second to get the
>register in that window. 

No - you keep the window base "handy" and don't keep re-fetching it,
same way you keep "program counter" and "stack pointer" "handy".

Getting      
   window[N] 
is the same cost as 
   next = *PC++; 

My point is that to avoid keeping too many things "handy", window base
and stack pointer should be the same (real machine) register.
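
As a rough sketch of what I mean, here is a toy dispatch loop in C. The
opcodes (OP_LOADI, OP_ADD, OP_HALT), their operand layout, and the function
name run are all invented for the example. The loop keeps just two values in
locals, pc and window, and window is simply the stack pointer passed in, so
a window access and an opcode fetch are each one load off a pointer the
compiler can hold in a register.

```c
#include <assert.h>

typedef long vm_word;

/* Hypothetical opcodes: LOADI reg, imm / ADD dst, a, b / HALT */
enum { OP_LOADI, OP_ADD, OP_HALT };

/* Run 'code' with the register window located at 'sp'; returns window[0]. */
vm_word run(const vm_word *code, vm_word *sp)
{
    const vm_word *pc = code;   /* kept "handy" in a local */
    vm_word *window = sp;       /* window base and stack pointer are one value */

    for (;;) {
        vm_word op = *pc++;     /* next = *PC++ */
        switch (op) {
        case OP_LOADI: {
            vm_word r = *pc++;
            window[r] = *pc++;  /* window[N]: one indexed store */
            break;
        }
        case OP_ADD: {
            vm_word d = *pc++;
            vm_word a = *pc++;
            vm_word b = *pc++;
            window[d] = window[a] + window[b];
            break;
        }
        case OP_HALT:
            return window[0];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}
```

Whether the compiler really keeps both pointers in machine registers depends
on the target and on register pressure, which is exactly why I want as few
such "handy" values as possible.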

>i am just saying register windows don't seem to
>be any win for us and cost an extra indirection for each data access. my
>view is let the compiler keep track of the register usage and just do
>individual push/pops as needed when registers run out.

That makes sense if (and only if) virtual machine registers are real 
machine registers. If virtual machine registers are in memory then 
accessing them "on the stack" is just as efficient (perhaps more so)
than at some other "special" location. And it avoids the need for 
memory-to-memory moves to push/pop them when we "spill".
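
A small C sketch of the difference, under the assumption that VM registers
live in memory either way. Everything here (NREGS, regfile, and the four
function names) is hypothetical. Scheme A keeps a fixed register file and
copies it to and from the stack around a call; scheme B treats the top of
the stack as the register file, so a call only moves the base pointer.

```c
#include <assert.h>
#include <string.h>

enum { NREGS = 4 };                 /* assumed number of VM registers */
typedef long vm_reg;

/* Scheme A: registers at a fixed "special" location, spilled by copying. */
static vm_reg regfile[NREGS];

vm_reg *call_with_spill(vm_reg *sp)
{
    assert(sp != NULL);
    memcpy(sp, regfile, sizeof regfile);   /* memory-to-memory push */
    return sp + NREGS;
}

vm_reg *return_with_fill(vm_reg *sp)
{
    assert(sp != NULL);
    sp -= NREGS;
    memcpy(regfile, sp, sizeof regfile);   /* memory-to-memory pop */
    return sp;
}

/* Scheme B: registers are the stack; a call just moves the window base. */
vm_reg *call_with_window(vm_reg *window)
{
    assert(window != NULL);
    return window + NREGS;                 /* caller's registers stay put */
}

vm_reg *return_with_window(vm_reg *window)
{
    assert(window != NULL);
    return window - NREGS;
}
```

In both schemes the caller's values survive the call; scheme B just gets
there without the copies. This ignores argument passing and overlapping
windows, which a real design would have to address.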
     
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
