Steve Reinhardt wrote:
Thanks for the email... can't say I really follow all the nuances
after a quick read, but I'm glad you're thinking about it.  Just a few
comments off the top of my head:

The common indexing scheme across all register types is something we
inherited from SimpleScalar.  It's not ideal for actually indexing
into register files, but the main benefit is that it gives a single
flat namespace for tracking register dependencies, which simplifies
things.  (At least it simplified the old FullCPU model; I assume it
helps in O3 as well.)

What do you mean by a common indexing scheme? What I'm talking about doesn't allow for index aliasing or anything like that, so I think I'm missing something. If you're talking about DepTags, I'd actually be happy to see those go. They've caused a number of very subtle bugs that were VERY annoying to find and fix. I'm thinking it would be more like a two-dimensional register space (file, index) rather than a two-dimensional space projected into one, which can lead to accidental overlaps. I can see how that might be more complicated, though.
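
A minimal sketch of the difference between a flat index and a (file, index) pair. Every name here (RegFile, RegId, flatIndex) and both base offsets are hypothetical, chosen only for illustration; none of it is existing M5 code.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical names throughout; none of this is existing M5 code.
enum class RegFile : std::uint8_t { Int, Float, Misc };

// Flat scheme: each file gets a base offset in one shared index space.
// The base values are made up for illustration.
constexpr int FloatBase = 32;
constexpr int MiscBase = 64;

inline int flatIndex(RegFile file, int index)
{
    assert(index >= 0);
    switch (file) {
      case RegFile::Int:   return index;
      case RegFile::Float: return FloatBase + index;
      case RegFile::Misc:  return MiscBase + index;
    }
    return -1; // not reached for valid enum values
}

// Two-dimensional scheme: the file stays part of the register's identity,
// so an out-of-range index in one file can never alias another file.
struct RegId
{
    RegFile file;
    int index;

    friend bool operator==(const RegId &a, const RegId &b)
    {
        return a.file == b.file && a.index == b.index;
    }

    // Ordering lets a RegId be used as a key in std::map or std::set.
    friend bool operator<(const RegId &a, const RegId &b)
    {
        return std::tie(a.file, a.index) < std::tie(b.file, b.index);
    }
};
```

With these made-up offsets, flatIndex(RegFile::Int, 40) and flatIndex(RegFile::Float, 8) both come out as 40, which is the kind of accidental overlap meant above; the corresponding RegId values stay distinct.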

While I think doing heterogeneous ISAs could be useful, I don't want
to lose performance for it, esp. in the common homogeneous case.  My
thought was that we'd always keep the ISA as a static parameter, but
then compile and link in multiple instantiations of the CPU model for
each ISA we cared about.

If it's a template parameter, I think that does basically what you're saying.

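
A rough sketch of the "ISA as a static template parameter" arrangement, assuming the ISA is a struct of compile-time constants. IsaA, IsaB, ToyCPU and the register counts are placeholders invented for this example, not M5 classes or real ISA values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical ISA descriptions: compile-time constants only, no dynamic
// state. The register counts are placeholders, not real ISA values.
struct IsaA { static constexpr std::size_t numIntRegs() { return 32; } };
struct IsaB { static constexpr std::size_t numIntRegs() { return 64; } };

// A toy CPU model parameterized on the ISA. Everything ISA-specific is
// resolved at compile time, so there are no virtual calls on this path.
template <class ISA>
class ToyCPU
{
  public:
    static constexpr std::size_t numIntRegs() { return ISA::numIntRegs(); }

    std::uint64_t readIntReg(std::size_t idx) const
    {
        assert(idx < ISA::numIntRegs());
        return intRegs[idx];
    }

    void setIntReg(std::size_t idx, std::uint64_t val)
    {
        assert(idx < ISA::numIntRegs());
        intRegs[idx] = val;
    }

  private:
    std::array<std::uint64_t, ISA::numIntRegs()> intRegs{};
};

// One explicit instantiation per ISA that should be linked into the binary.
template class ToyCPU<IsaA>;
template class ToyCPU<IsaB>;
```

Both instantiations can live in one binary, and each one is compiled as if its ISA were the only one.
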
I also don't think the ISA itself should have any state... CPUs or
other components can have ISA-specific state, and the ISA can
certainly have a lot of constants associated with it, but there
shouldn't be any dynamic state associated solely with the ISA.  Thus I
don't see where making it an object has any advantages over the status
quo.

There are two benefits I see. First, you can keep track of things statically and update them only as necessary, for things like register flattening and the x86 predecoding I mention below. The state, as in registers, isn't in the ISA, but this residual configuration information is: stuff like the old register window code I had, which used bits of the index to look into a register-paging-like structure. Second, it makes it easy for the different pieces of an instance of an ISA to communicate. Right now, they need to use accessors in the thread context and casting to get what they need. That's why I'd have had to add an accessor for the Interrupts object if I wanted the miscregfile to be able to communicate with it.

Steve


On Sat, May 24, 2008 at 12:01 AM, Gabe Black <[EMAIL PROTECTED]> wrote:
For now, I'm going to make the miscregfile have the event and cause an
interrupt when the timer goes off. This really sounds crappy to me, but I'd
have to add new functions to get at the interrupt object like there are for
the TLB. That wouldn't be hard, but I wanted to point out I'd be adding a
new function all over the place in the CPU before I went and did it in case
that doesn't sound like a great idea. I personally think there are too many
functions in too many places in the CPU which is probably unavoidable, but
I'd prefer not to add another one. Also, while I was deciding what to do
about the local APIC, I tried to boil down what exactly an ISA is in M5,
what it should and shouldn't be doing, and if that suggested anything we
should be doing differently. That's basically what follows, so if you don't
care go ahead and stop reading.


Basically, the ISA is a set of policies and a collection of state which
directs them. The policies are layered on top of and direct the mechanisms
of the CPU. This is different from how I think the ISAs have been built up
to this point, which was as a collection of objects which implemented the
ISA's policies and which were plugged into the CPU. Essentially, the
difference is that there would be a base TLB class which would contain the
behavior of all TLBs (they translate addresses, they have entries, they evict
things, blah blah), and the ISA would describe the policies it operated
under and the state, the entries and control registers, that guided it. The
TLB would call into the ISA to make decisions rather than the ISA defining a
TLB which would automatically make the "right" decisions.
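
A minimal sketch of that split, assuming a generic TLB that owns the mechanism and asks an ISA-supplied policy object for decisions. TlbPolicy, BaseTlb, ToyPolicy, and the 4 KiB page size are all hypothetical and are not taken from the M5 source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct TlbEntry
{
    std::uint64_t vpn;   // virtual page number
    std::uint64_t ppn;   // physical page number
    bool valid;
};

// Hypothetical interface the ISA would implement: decisions only.
class TlbPolicy
{
  public:
    virtual ~TlbPolicy() = default;
    virtual unsigned pageShift() const = 0;
    virtual std::size_t pickVictim(
        const std::vector<TlbEntry> &entries) const = 0;
};

// Generic mechanism shared by all ISAs: storage, lookup, and insertion.
// The policy object must outlive the TLB that refers to it.
class BaseTlb
{
  public:
    BaseTlb(const TlbPolicy &p, std::size_t size)
        : policy(p), entries(size) {}   // entries start out invalid

    void insert(std::uint64_t vaddr, std::uint64_t paddr)
    {
        const unsigned shift = policy.pageShift();
        const TlbEntry e{vaddr >> shift, paddr >> shift, true};
        for (auto &slot : entries) {
            if (!slot.valid) { slot = e; return; }
        }
        // No free slot: ask the ISA which entry to evict.
        entries[policy.pickVictim(entries)] = e;
    }

    bool translate(std::uint64_t vaddr, std::uint64_t &paddr) const
    {
        const unsigned shift = policy.pageShift();
        const std::uint64_t offsetMask = (std::uint64_t{1} << shift) - 1;
        for (const auto &slot : entries) {
            if (slot.valid && slot.vpn == (vaddr >> shift)) {
                paddr = (slot.ppn << shift) | (vaddr & offsetMask);
                return true;
            }
        }
        return false;
    }

  private:
    const TlbPolicy &policy;
    std::vector<TlbEntry> entries;
};

// A toy policy: 4 KiB pages (assumed) and "always evict entry 0".
class ToyPolicy : public TlbPolicy
{
  public:
    unsigned pageShift() const override { return 12; }
    std::size_t pickVictim(const std::vector<TlbEntry> &) const override
    {
        return 0;
    }
};
```

The virtual calls on every lookup are the boundary-crossing cost discussed further down; a template parameter in place of the TlbPolicy reference would remove them at the price of fixing the ISA at compile time.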

Another important distinction here is that the state which controls the
policies of the ISA, most of the MiscRegs, are part of the ISA, but the
other registers are not. The policy directing how those registers are used,
namely how they're indexed, is, but the actual registers themselves are not.
Also, the way they're currently indexed, as integer/floating point/misc,
should really be set up and managed by the ISA and not the CPU. The CPU would
probably want to know what type a register is so it can have its own
policies layered on top, like for instance separate pools of resources for
ints and floats, but it shouldn't force the ISA to glob those together. Even
that might be too much, because at least for X86 there are registers which
can be both floating point and integer depending on the instruction that
uses them. So really, the CPU should just expect some number of local
storage banks. In x86, those could correspond to the general (ha) purpose
registers, x87/MMX registers, XMM registers, and control register backing
storage. The control register backing storage could be broken down further
into the CR control registers, debug registers, performance counter
registers, the LAPIC control registers, the MSRs, etc, etc. Like now, the
CPU would be responsible for providing the ISA the illusion that that was
how things worked, but the o3, for example, could keep track of the actual
storage however it liked.

Then, the ISA defined control register file(s) really are just register
file(s) which have their accesses intercepted and acted on in some way. They
could conceptually live in the TLB, Interrupt object, whatever, with the CPU
actually keeping track of the "state", aka what a read would return. The
objects the ISA controls would be able to keep their own local "cached"
versions of the control state in whatever way was convenient, like for
instance as a lookup table for register windows. The objects would need to
be able to "refresh" and reread the underlying control state to update their
caching data structures in cases where the underlying storage was updated
directly, like what "NoEffect" does right now. This takes care of some odd
circumstances where you can have trouble bootstrapping the control state in,
for instance, SE mode. You have to make sure everything actually uses the
control state you're writing, but it has to do that when not all of the
state has actually been set. Really, all that needs to be "refreshed" is the
control state which is part of the ISA, not of the objects in the CPU. In
that way, you don't have to refresh the TLBs, blah blah. You only have to
make the ISA configure itself based on the (to the CPU) generic array of
values stored in some of the register files.
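
A small sketch of the cache-and-refresh idea, assuming the CPU holds the control registers as a plain array and the ISA keeps a derived lookup table. The names (ControlRegs, CwpReg, IsaState), the sizes, and the non-overlapping window layout are simplifications invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Made-up sizes and register index, for illustration only.
constexpr std::size_t NumWindows = 4;
constexpr std::size_t RegsPerWindow = 8;
constexpr std::size_t CwpReg = 0;   // hypothetical "current window" register

// To the CPU this is just a generic array of values.
using ControlRegs = std::array<std::uint64_t, 4>;

// ISA-side cached view of the control state.
class IsaState
{
  public:
    // Reread the underlying control state and rebuild the lookup table.
    void refresh(const ControlRegs &ctrl)
    {
        const std::size_t window = ctrl[CwpReg] % NumWindows;
        for (std::size_t i = 0; i < RegsPerWindow; ++i)
            table[i] = window * RegsPerWindow + i;
    }

    // Flattening is a table lookup; nothing is recomputed per access.
    std::size_t flatten(std::size_t reg) const
    {
        assert(reg < RegsPerWindow);
        return table[reg];
    }

  private:
    std::array<std::size_t, RegsPerWindow> table{};
};
```

After a direct write such as ctrl[CwpReg] = 2, flatten() keeps returning the old mapping until refresh(ctrl) is called, which is the step a "NoEffect"-style update would need to trigger.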

There also would be objects with state and policy which are dynamically
generated like faults and instructions. The faults are policy with maybe a
little bubble of controlling state in them, and the instructions are policy
(how to execute) strapped onto generic objects.

So now, all the policies and state particular to the ISA have been localized
into one place, an abstract concept called the "ISA". That might as well be
an object since it has methods and fields just like an object would. Now
that the ISA is an object, which might as well be a SimObject, it can be
plugged into a CPU semi-arbitrarily to make that CPU do x86 or Alpha or
whatever, and you get heterogeneous CPUs. In reality a lot of other stuff
would need some tweaking, but I don't think that much.

At least one downside of looking at this stuff this way is performance.
The boundary between ISA and not-ISA would be crossed a lot more
frequently, and if the ISA isn't compiled in, which would imply virtual
function calls, there would be a lot of overhead from that. There would also
be a performance boost, though, because some parts of the ISAs which are
inefficient, like the predecoder in x86 constantly figuring out what the
operand/address/stack sizes and mode are, and SPARC deciding how to flatten
register indexes, would be much more efficient. Also, the implicit downside
is that this would uproot a LOT of code and take a long time to get back to
where things are now.

Gabe

_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
