On Wed, Apr 25, 2012 at 11:00 AM, Gabriel Michael Black <
[email protected]> wrote:

> Quoting Nilay Vaish <[email protected]>:
>
>  On Wed, 25 Apr 2012, Ali Saidi wrote:
>>
>>
>>>
>>>  On April 25, 2012, 7:37 a.m., Steve Reinhardt wrote:
>>>>
>>>>> I believe I understand why this is necessary,
>>>>>
>>>>
>>>> Steve Reinhardt wrote:
>>>>   Oops, clicked "publish" instead of "save" by accident...
>>>>
>>>>   My concern is that we are adding an array to every instruction, and a
>>>> level of indirection to every register access, when the vast majority of
>>>> the time this is an identity mapping.  And "vast majority" will be 100% for
>>>> many ISAs.  I don't know how significant the performance impact will be,
>>>> but I'm confident that it will not be in the right direction.
>>>>
>>>>   I don't have a specific proposal to avoid this, but I don't want to
>>>> see this code go in lightly without consideration of the performance 
>>>> impact.
>>>>
>>>
>>> I second this concern.
>>>
>>>
>>> - Ali
>>>
>>
>>
>> Two things that I want to say in response --
>>
>> a. With respect to memory usage, it would go down if we do what Gabe
>> suggested in his review i.e. introduce members for each of the registers
>> instead of using an array. This would also remove the indirection. I'll get
>> performance numbers from some of the regression tests to carry out a
>> performance comparison.
>>
>> b. The aim of this set of patches is to remove the RAW dependence on the
>> condition code register in x86. The current approach is to figure out at
>> run time whether or not the CC register needs to be read. We require this
>> dynamic map to read the registers correctly. In the ARM ISA description,
>> something similar has been done. At run time, if a certain condition is
>> true, the CC register is read; otherwise register 0 is read. IIRC, no
>> instruction can write to reg 0 in a RISC ISA. Hence, there is no RAW
>> dependence. Can we do this in x86 as well? Maybe with a register that is
>> only visible to the microcode.
>>
>
I agree, at a high level, that keeping the existing _srcRegIdx[] array and
just dealing with the fact that it may have gaps in it seems like an
alternative that's at least worth exploring.

We could also have a special register index like -1 that just means invalid
so that we don't rely on piggybacking on some potentially ISA-specific
characteristics to make it work.
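
To make the "-1 means invalid" idea concrete, here is a minimal sketch. Every
name in it (RegIndex, InvalidRegIndex, SketchInst, setSource,
countLiveSources) is hypothetical and chosen for illustration; none of it is
taken from the gem5 source. It only shows a fixed-size source array with gaps
that consumers skip.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not gem5's StaticInst.
using RegIndex = std::int16_t;
constexpr RegIndex InvalidRegIndex = -1;  // sentinel: no register in this slot
constexpr int MaxSrcRegs = 4;

struct SketchInst {
    // Fixed-size source list; unused or skipped slots hold the sentinel.
    std::array<RegIndex, MaxSrcRegs> srcRegIdx{
        InvalidRegIndex, InvalidRegIndex, InvalidRegIndex, InvalidRegIndex};
};

// Fill one slot, leaving the others as gaps.
inline void setSource(SketchInst &inst, int slot, RegIndex idx) {
    assert(slot >= 0 && slot < MaxSrcRegs);
    inst.srcRegIdx[slot] = idx;
}

// Count the sources that actually create a dependence, skipping the gaps.
inline int countLiveSources(const SketchInst &inst) {
    int live = 0;
    for (RegIndex idx : inst.srcRegIdx) {
        if (idx != InvalidRegIndex)
            ++live;
    }
    return live;
}
```

The cost is one compare per slot in any code that walks the source list, in
exchange for not adding a second array or an extra level of indirection.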


>  The special handling of the zero register is actually somewhat annoying.
> The CPU has to check for it and treat it specially, and, last I recall, the
> way it holds its zero value is to be rewritten with zero over and over and
> over to squash any errant updates. I would like to get rid of this
> mechanism if possible, so my gut reaction is not to build more on top of
> it. That said, the zero register behavior isn't going away so it has to
> keep working somehow anyway. That may be a viable way to avoid adding
> storage to the StaticInst classes.


With Alpha, the decoder generates a no-op for any instruction whose
destination is the zero register, regardless of the opcode.  In theory,
this should make rewriting the zero value every cycle unnecessary.  I don't
know if some cases were missed, or if it's still needed for other ISAs, but
that's the direction I'd want to go to get rid of that aspect.  We could
always just replace the overwrite with an assertion and see how far we get.
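
A rough sketch of that direction, under simplifying assumptions: the decoder
turns any write to the zero register into a no-op, and the register file
asserts instead of re-zeroing. The types and functions here (DecodedInst,
decode, RegFile, execute) are hypothetical and are not gem5's real decoder or
register file; the only architectural fact assumed is that Alpha's integer
zero register is R31.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model; not gem5's actual decoder or register file.
constexpr int ZeroReg = 31;   // Alpha's integer zero register is R31
constexpr int NumRegs = 32;

enum class Op { Nop, Add };

struct DecodedInst {
    Op op;
    int dest;
    int srcA;
    int srcB;
};

// Decode step: any instruction that targets the zero register becomes a no-op.
inline DecodedInst decode(Op op, int dest, int srcA, int srcB) {
    if (dest == ZeroReg)
        return DecodedInst{Op::Nop, ZeroReg, ZeroReg, ZeroReg};
    return DecodedInst{op, dest, srcA, srcB};
}

struct RegFile {
    std::uint64_t regs[NumRegs] = {};

    void write(int idx, std::uint64_t value) {
        // Stands in for the repeated re-zeroing: a write to the zero
        // register here would mean the decoder missed a case.
        assert(idx != ZeroReg && "write to zero register reached the regfile");
        regs[idx] = value;
    }
};

inline void execute(const DecodedInst &inst, RegFile &rf) {
    if (inst.op == Op::Add)
        rf.write(inst.dest, rf.regs[inst.srcA] + rf.regs[inst.srcB]);
}
```

If the assertion never fires across the regressions, that would be some
evidence that the overwrite is no longer needed for that ISA.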

On the input side, it would seem that instructions that read the zero
register could be decoded into StaticInsts that act as if it were an
immediate zero operand.
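
As an illustration only, the input side could look something like the sketch
below. The Operand struct and the decodeSource and readOperand helpers are
made-up names, not existing gem5 code, and the zero register index of 31 is
assumed for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch; these types do not exist in gem5.
constexpr int ZeroRegIdx = 31;  // assumed zero register index for this example

struct Operand {
    bool isImm;         // true: use imm, and no register dependence is created
    std::uint64_t imm;  // immediate value, meaningful only when isImm is true
    int reg;            // register index, meaningful only when isImm is false
};

// At decode time, a read of the zero register becomes an immediate zero.
inline Operand decodeSource(int regIdx) {
    if (regIdx == ZeroRegIdx)
        return Operand{true, 0, -1};
    return Operand{false, 0, regIdx};
}

// At execute time, the register file is never consulted for the zero register.
inline std::uint64_t readOperand(const Operand &op, const std::uint64_t *regs) {
    return op.isImm ? op.imm : regs[op.reg];
}
```

With this shape, whatever value happens to sit in the zero register's storage
is irrelevant to readers, since no read ever reaches it.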

So I'd think it's at least theoretically possible to make the current
zero-register implementation go away.

Also, PowerPC has funky semantics in which the zero register actually holds
a value, and whether you get that value or 0 depends on the opcode.  I'm
not sure how that's implemented in gem5, but we can't assume that the
zero-register behavior is universal.
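
For reference, the PowerPC behavior I have in mind is roughly the following:
in addi, an rA field of 0 means the literal value 0, while in add the same
field reads the real contents of GPR0. The sketch uses hypothetical names
(PpcOp, readRA) and says nothing about how gem5 actually implements it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the opcode-dependent meaning of rA == 0 on PowerPC.
enum class PpcOp { Add, Addi };

// Return the value used for the rA operand of the given opcode.
inline std::int64_t readRA(PpcOp op, int ra, const std::int64_t *gpr) {
    // addi treats rA == 0 as the constant 0, not as GPR0.
    if (op == PpcOp::Addi && ra == 0)
        return 0;
    // add (and any nonzero rA) reads the register normally, including GPR0.
    return gpr[ra];
}
```

So a scheme that hard-wires "register 0 always reads as zero" into the CPU
model would be wrong for this ISA.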

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev