For now, I'm going to make the MiscRegFile own the timer event and cause
an interrupt when the timer goes off. This really sounds crappy to me,
but otherwise I'd have to add new functions to get at the interrupt
object like there are for the TLB. That wouldn't be hard, but I wanted to
point out I'd be adding a new function all over the place in the CPU
before I went and did it, in case that doesn't sound like a great idea.
I personally think there are too many functions in too many places in
the CPU, which is probably unavoidable, but I'd prefer not to add
another one. Also, while
I was deciding what to do about the local APIC, I tried to boil down
what exactly an ISA is in M5, what it should and shouldn't be doing, and
if that suggested anything we should be doing differently. That's
basically what follows, so if you don't care go ahead and stop reading.
Basically, the ISA is a set of policies and a collection of state which
directs them. The policies are layered on top of and direct the
mechanisms of the CPU. This is different from how I think the ISAs have
been built up to this point, which was as a collection of objects which
implemented the ISA's policies and which were plugged into the CPU.
Essentially, the difference is that there would be a base TLB class
which would contain the behavior of all TLBs (they translate addresses,
they have entries, they evict things, blah blah), and the ISA would
describe the policies it operated under and what state, the entries
and control registers, guided it. The TLB would call into the ISA
to make decisions rather than the ISA defining a TLB which would
automatically make the "right" decisions.
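
A minimal sketch of that split, assuming a hypothetical `TlbPolicy`
interface that the ISA implements and a `BaseTlb` that owns the shared
mechanism. None of these names are existing M5 classes, and the toy
policy (4 KiB pages, always evict slot 0) is only there to make the
example concrete:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical; none are existing M5 classes.
using Addr = std::uint64_t;

struct TlbEntry {
    Addr vpn = 0;        // virtual page number
    Addr ppn = 0;        // physical page number
    bool valid = false;
};

// Decisions the ISA would own.
class TlbPolicy {
  public:
    virtual ~TlbPolicy() = default;
    virtual unsigned pageShift() const = 0;
    // Choose the slot to overwrite when every entry is valid.
    virtual std::size_t chooseVictim(
        const std::vector<TlbEntry> &entries) const = 0;
};

// Mechanism shared by every ISA: storage, lookup, and insertion.
class BaseTlb {
  public:
    BaseTlb(std::size_t size, const TlbPolicy &isaPolicy)
        : entries(size), policy(isaPolicy) { assert(size > 0); }

    void insert(Addr vaddr, Addr paddr) {
        const unsigned shift = policy.pageShift();
        TlbEntry fresh;
        fresh.vpn = vaddr >> shift;
        fresh.ppn = paddr >> shift;
        fresh.valid = true;
        for (auto &e : entries) {
            if (!e.valid) { e = fresh; return; }
        }
        // Full: ask the ISA which entry to evict.
        entries[policy.chooseVictim(entries)] = fresh;
    }

    // Returns true and fills paddr on a hit, false on a miss.
    bool translate(Addr vaddr, Addr &paddr) const {
        const unsigned shift = policy.pageShift();
        const Addr offsetMask = (Addr{1} << shift) - 1;
        for (const auto &e : entries) {
            if (e.valid && e.vpn == (vaddr >> shift)) {
                paddr = (e.ppn << shift) | (vaddr & offsetMask);
                return true;
            }
        }
        return false;
    }

  private:
    std::vector<TlbEntry> entries;
    const TlbPolicy &policy;
};

// Toy policy standing in for a real ISA: 4 KiB pages, evict slot 0.
class ToyIsaTlbPolicy : public TlbPolicy {
  public:
    unsigned pageShift() const override { return 12; }
    std::size_t chooseVictim(
        const std::vector<TlbEntry> &) const override { return 0; }
};
```

The point of the shape is that `BaseTlb` never knows the page size or
the replacement rule; swapping in a different `TlbPolicy` changes the
behavior without touching the lookup code.
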
Another important distinction here is that the state which controls the
policies of the ISA, most of the MiscRegs, are part of the ISA, but the
other registers are not. The policy directing how those registers are
used, namely how they're indexed, is, but the actual registers
themselves are not. Also, the way they're currently indexed, as
integer/floating point/misc, should really be set up and managed by the
ISA and not the CPU. The CPU would probably want to know what type a
register is so it can have its own policies layered on top, like for
instance separate pools of resources for ints and floats, but it
shouldn't force the ISA to glob those together. Even that might be too
much, because at least for x86 there are registers which can be both
floating point and integer depending on the instruction that uses them.
So really, the CPU should just expect some number of local storage
banks. In x86, those could correspond to the general (ha) purpose
registers, x87/MMX registers, XMM registers, and control register
backing storage. The control register backing storage could be broken
down further into the CR control registers, debug registers, performance
counter registers, the LAPIC control registers, the MSRs, etc. Like
now, the CPU would be responsible for providing the ISA the illusion
that that was how things worked, but the o3, for example, could keep
track of the actual storage however it liked.
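
As a rough illustration of the "storage banks" idea, here is a sketch
in which the ISA hands the CPU a list of bank descriptions and a simple
CPU backs all of them with one flat array. `BankDesc`, `toyX86Banks`,
and `FlatRegStorage` are hypothetical names, the control bank size is
arbitrary, and every register is modeled as a single 64-bit value for
brevity (real XMM registers are wider):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical: the ISA describes its banks; the CPU only sees sizes.
struct BankDesc {
    std::string name;
    std::size_t numRegs;
};

// An x86-flavored layout. The "ctrl" size is an arbitrary placeholder.
inline std::vector<BankDesc> toyX86Banks() {
    return { {"gpr", 16}, {"x87_mmx", 8}, {"xmm", 16}, {"ctrl", 32} };
}

// One possible CPU-side backing store: a single flat array. A more
// elaborate CPU model could rename or pool these however it likes, as
// long as (bank, index) accesses keep working.
class FlatRegStorage {
  public:
    explicit FlatRegStorage(const std::vector<BankDesc> &banks) {
        std::size_t total = 0;
        for (const auto &b : banks) {
            bases.push_back(total);
            sizes.push_back(b.numRegs);
            total += b.numRegs;
        }
        regs.assign(total, 0);
    }

    std::uint64_t read(std::size_t bank, std::size_t idx) const {
        return regs[flat(bank, idx)];
    }

    void write(std::size_t bank, std::size_t idx, std::uint64_t val) {
        regs[flat(bank, idx)] = val;
    }

    std::size_t numRegs() const { return regs.size(); }

  private:
    // Map a (bank, index) pair onto the flat array.
    std::size_t flat(std::size_t bank, std::size_t idx) const {
        assert(bank < bases.size() && idx < sizes[bank]);
        return bases[bank] + idx;
    }

    std::vector<std::size_t> bases;
    std::vector<std::size_t> sizes;
    std::vector<std::uint64_t> regs;
};
```
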
Then, the ISA defined control register file(s) really are just register
file(s) which have their accesses intercepted and acted on in some way.
They could conceptually live in the TLB, Interrupt object, whatever,
with the CPU actually keeping track of the "state", aka what a read
would return. The objects the ISA controls would be able to keep their
own local "cached" versions of the control state in whatever way was
convenient, like for instance as a lookup table for register windows.
The objects would need to be able to "refresh" and reread the underlying
control state to update their caching data structures in cases where the
underlying storage was updated directly, like what "NoEffect" does right
now. This takes care of some odd circumstances where you can have
trouble bootstrapping the control state in, for instance, SE mode. You
have to make sure everything actually uses the control state you're
writing, but it has to do that when not all of the state has actually
been set. Really, all that needs to be "refreshed" is the control state
which is part of the ISA, not of the objects in the CPU. In that way,
you don't have to refresh the TLBs, blah blah. You only have to make the
ISA configure itself based on the (to the CPU) generic array of values
stored in some of the register files.
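
A small sketch of the "refresh" idea, under invented assumptions: the
CPU owns a generic array of control values, and a hypothetical
`ToyIsaState` object keeps a cached, decoded view of it. The "wide
mode" bit and the operand sizes are made up for illustration and are
not real x86 or M5 definitions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical register index; not a real control register number.
constexpr std::size_t kModeReg = 0;

class ToyIsaState {
  public:
    // The CPU owns the storage; the ISA only holds a reference to it.
    explicit ToyIsaState(std::vector<std::uint64_t> &ctrlStorage)
        : ctrl(ctrlStorage) { refresh(); }

    // Normal write path: update storage and the cached view together.
    void setCtrlReg(std::size_t idx, std::uint64_t val) {
        ctrl.at(idx) = val;
        refresh();
    }

    // Re-derive every cached value from the underlying storage. Call
    // this after the storage has been written directly, bypassing
    // setCtrlReg (for example while bootstrapping state).
    void refresh() {
        wideMode = (ctrl.at(kModeReg) & 1) != 0;
        operandBytes = wideMode ? 8 : 4;
    }

    unsigned cachedOperandBytes() const { return operandBytes; }

  private:
    std::vector<std::uint64_t> &ctrl;
    bool wideMode = false;
    unsigned operandBytes = 4;
};
```

With this shape, bootstrap code can fill in the whole array in any
order and call `refresh()` once at the end, instead of relying on each
individual write to leave the cached state consistent.
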
There also would be objects with state and policy which are dynamically
generated like faults and instructions. The faults are policy with maybe
a little bubble of controlling state in them, and the instructions are
policy (how to execute) strapped onto generic objects.
So now, all the policies and state particular to the ISA have been
localized into one place, an abstract concept called the "ISA". That
might as well be an object since it has methods and fields just like an
object would. Now that the ISA is an object, which might as well be a
SimObject, it can be plugged into a CPU semi-arbitrarily to make that
CPU do x86 or Alpha or whatever, and you get heterogeneous CPUs. In
reality a lot of other stuff would need some tweaking, but I don't think
that much.
At least one downside of looking at this stuff this way is performance.
The boundary between ISA and not ISA would be crossed a lot more
frequently, and if the ISA isn't compiled in, which would imply virtual
function calls, there would be a lot of overhead from that. There would
also be a performance boost, though, because some parts of the ISAs which
are inefficient now, like the predecoder in x86 constantly figuring out
what the operand/address/stack sizes and mode are and SPARC deciding how
to flatten register indexes, would be much more efficient. The other,
implicit downside is that this would uproot a LOT of code and take a
long time to get back to where things are now.
Gabe
Gabe Black wrote:
I was thinking about this more, and, bus frequency aside, it seems
like the LAPIC is basically exactly what that interrupts object I made
in the CPU is for, namely managing interrupts and feeding them one at
a time on request and according to the ISA's policy to the CPU. At the
moment that's just a C++ class for the purpose of abstraction if I
remember right, but it might be good to promote it to a full
SimObject. That limits people's ability to put it in weird places, but
they'd probably want to do that with the IO APIC (the one the devices
actually talk to) anyway. This also touches on something I've been
thinking about, namely that control registers might be more closely
associated with particular structures like the TLB or the interrupts
object or the register flattening than with the CPU itself or
controlling instructions. I was thinking it might be a good idea to
set up some sort of callback interface on the MiscRegFile so those
"devices" can keep track of those values more with an interrupt model
than a polling model, which would probably perform much better. The
actual storage could be moved over to them or not, that's more of an
implementation detail, but either way it would immediately take care
of the performance problems with SPARC's register window handling stuff.
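
One way such a callback interface could look, as a hedged sketch only:
`MiscRegFileSketch`, `WindowFlattener`, and the register index are
hypothetical and are not the existing MiscRegFile API. The flattening
rule is a toy (window number times 16, no globals, no wraparound),
meant only to show the cached value being pushed on writes rather than
re-read on every access:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch; this is not the existing MiscRegFile interface.
class MiscRegFileSketch {
  public:
    using Callback = std::function<void(std::uint64_t newVal)>;

    explicit MiscRegFileSketch(std::size_t numRegs)
        : regs(numRegs, 0), listeners(numRegs) {}

    // Register interest in writes to one register.
    void subscribe(std::size_t idx, Callback cb) {
        listeners.at(idx).push_back(std::move(cb));
    }

    // Store the value, then notify everyone watching that register.
    void write(std::size_t idx, std::uint64_t val) {
        regs.at(idx) = val;
        for (const auto &cb : listeners[idx]) cb(val);
    }

    std::uint64_t read(std::size_t idx) const { return regs.at(idx); }

  private:
    std::vector<std::uint64_t> regs;
    std::vector<std::vector<Callback>> listeners;
};

constexpr std::size_t kCwpReg = 0;        // hypothetical index
constexpr unsigned kRegsPerWindow = 16;   // toy window stride

// Caches the window base and updates it only when the window pointer
// register is written. The object must outlive the register file's use
// of the callback, since the lambda captures `this`.
class WindowFlattener {
  public:
    explicit WindowFlattener(MiscRegFileSketch &miscRegs) {
        base = static_cast<unsigned>(miscRegs.read(kCwpReg)) *
               kRegsPerWindow;
        miscRegs.subscribe(kCwpReg, [this](std::uint64_t cwp) {
            base = static_cast<unsigned>(cwp) * kRegsPerWindow;
        });
    }

    // Hot path: no register file access, just an add.
    unsigned flatten(unsigned windowedIdx) const {
        return base + windowedIdx;
    }

  private:
    unsigned base = 0;
};
```
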
Gabe
Steve Reinhardt wrote:
On Thu, May 22, 2008 at 11:45 AM, Gabe Black <[EMAIL PROTECTED]> wrote:
One problem is that this isn't a parent/child relationship. For
instance, there could be four CPUs and four local APICs all on the
same bus, and they need to know which one goes with which CPU. In
that case it would be arbitrary. [...] Also, all the local APICs
appear at the same address but are only visible to their CPU. I
was going to try to deal with that by sticking them into their own
address space multiplexed on some address bits, but I don't like
the scalability of that.
Yea, that's the kind of thing I meant by "fiddling with address
mappings". I wouldn't disagree that it's a bit hackish, but it seems
plenty scalable to me. If you do that it does solve the "all the
APICs on one local bus" problem too.
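
For concreteness, the address-mapping trick could be sketched like
this. `muxApicAddr` and `kMuxBase` are hypothetical; the only real
constant is the default x86 local APIC base of 0xFEE00000. Each CPU
rewrites accesses to the shared APIC page into a private page selected
by its ID, so several APICs can sit on one bus without colliding:

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

constexpr Addr kApicBase = 0xFEE00000;  // default x86 local APIC base
constexpr Addr kApicSize = 0x1000;      // one 4 KiB page of registers
// Hypothetical base of a range nothing else in the system maps.
constexpr Addr kMuxBase = 0x2000000000000000ULL;

// Applied by each CPU before a request leaves for the interconnect.
// Addresses outside the APIC page pass through unchanged.
inline Addr muxApicAddr(Addr addr, unsigned cpuId) {
    if (addr < kApicBase || addr >= kApicBase + kApicSize)
        return addr;
    const Addr offset = addr - kApicBase;
    return kMuxBase + static_cast<Addr>(cpuId) * kApicSize + offset;
}
```

Each APIC would then be configured to respond at
`kMuxBase + cpuId * kApicSize`, which also settles which APIC belongs
to which CPU.
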
Another idea is a bus local to the CPU which has the CPU and APIC
on it. That would have to connect to the external interconnect
somehow. That could also be a home for the page table walker and
TLB eventually if those go "off chip" and into the memory system.
The tricky part here is doing that without adding latency to non-APIC
memory accesses (and appropriately forwarding flow-control state
through from the cache, etc.). It's probably doable but it would be
nicer to have a more general solution that didn't require this (IMO).
If the interconnect isn't a bus, then you can't just stick the
APICs anywhere since they're supposed to be tightly integrated on
die. You'd need to make sure they ended up very close to their
CPUs, and having that over a network hop would be substantially
unrealistic.
Don't forget that part of the point of m5 is to be able to model
things that are "substantially unrealistic", like putting a PCI NIC
next to the L2 cache :-). I think a solution that lets you put the
APIC wherever you want is a good idea, even if most of those places
are obviously stupid.
I don't think it matters much whether the CPU can check the
interconnect frequency, although it probably can through some
mechanism somewhere like CPUID or the stuff the processor sticks
into PCI config space according to the AMD BIOS and Kernel
Developer's Guide. The timer is being set up and used for something, and I
would imagine whatever is using it might break if the frequency is
way off. Then again maybe not. I'd rather make it do what it's
supposed to do than to try to figure out and fix the problems it
may cause later.
I think the key is separating the stuff that needs to be a certain
way to get things to work from the stuff that is just done a certain
way just because that's easy/traditional/whatever. So clearly the
APIC timer needs to run at a frequency within some range, such that
the CPU can figure out what that frequency is, but I'm guessing that
as long as that frequency is reasonable it doesn't matter exactly
what its relationship is to the interconnect. So just having a
parameter that lets you set the frequency sounds fine to me; if in
the longer term we find that there really is a need to have it tied
to the interconnect frequency we can always set that up in the Python.
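
A sketch of what the frequency parameter would feed into, assuming one
simulator tick is one picosecond and a hypothetical `timerFreqHz`
configuration value. The initial count and divisor mirror the local
APIC's initial count and divide configuration registers; the function
name is made up:

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

// Assumption: one simulator tick is one picosecond.
constexpr Tick kTicksPerSecond = 1000000000000ULL;

// Ticks from arming the timer until it expires. The timer decrements
// once every `divisor` cycles of a clock running at timerFreqHz.
// Integer division truncates for frequencies that do not divide the
// tick rate evenly.
inline Tick apicTimerDelay(std::uint64_t timerFreqHz,
                           std::uint32_t initialCount,
                           unsigned divisor) {
    assert(timerFreqHz > 0 && divisor > 0);
    const Tick ticksPerCycle = kTicksPerSecond / timerFreqHz;
    return ticksPerCycle * divisor * initialCount;
}
```

For example, a 100 MHz timer clock with a divisor of 16 and an initial
count of 1000 gives 10000 ticks per cycle and a delay of 160,000,000
ticks, independent of whatever the interconnect runs at.
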
Steve
------------------------------------------------------------------------
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev