Re: [m5-dev] local APIC timer and bus frequency

Gabe Black Fri, 23 May 2008 23:52:03 -0700

For now, I'm going to make the miscregfile have the event and cause an interrupt when the timer goes off. This really sounds crappy to me, but I'd have to add new functions to get at the interrupt object like there are for the TLB. That wouldn't be hard, but I wanted to point out I'd be adding a new function all over the place in the CPU before I went and did it in case that doesn't sound like a great idea. I personally think there are too many functions in too many places in the CPU, which is probably unavoidable, but I'd prefer not to add another one. Also, while I was deciding what to do about the local APIC, I tried to boil down what exactly an ISA is in M5, what it should and shouldn't be doing, and if that suggested anything we should be doing differently. That's basically what follows, so if you don't care go ahead and stop reading.

Basically, the ISA is a set of policies and a collection of state which directs them. The policies are layered on top of and direct the mechanisms of the CPU. This is different from how I think the ISAs have been built up to this point, which was as a collection of objects which implemented the ISA's policies and which were plugged into the CPU. Essentially, the difference is that there would be a base TLB class which would contain the behavior of all TLBs (they translate addresses, they have entries, they evict things, blah blah), and the ISA would describe the policies it operated under and what state, the entries and control registers, guided it. The TLB would call into the ISA to make decisions rather than the ISA defining a TLB which would automatically make the "right" decisions.
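
A minimal C++ sketch of that split, for illustration only. The names BaseTlb, IsaTlbPolicy and ToyPolicy are made up here and are not existing M5 classes; the entry layout and the two policy questions (page size, victim choice) are assumptions chosen to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Addr = std::uint64_t;

// Hypothetical entry layout; a real ISA would add permissions, ASIDs, etc.
struct TlbEntry {
    Addr vpn;    // virtual page number
    Addr ppn;    // physical page number
    bool valid;
};

// Hypothetical policy interface: the ISA answers questions but owns no TLB.
class IsaTlbPolicy {
  public:
    virtual ~IsaTlbPolicy() = default;
    virtual unsigned pageShift() const = 0;
    virtual std::size_t pickVictim(const std::vector<TlbEntry> &entries) const = 0;
};

// Generic mechanism shared by every ISA: lookup, insert, evict.
class BaseTlb {
  public:
    BaseTlb(std::size_t size, const IsaTlbPolicy &policy)
        : entries(size), policy(policy) { assert(size > 0); }

    // Returns true and fills paddr on a hit; returns false on a miss.
    bool translate(Addr vaddr, Addr &paddr) const {
        const unsigned shift = policy.pageShift();
        const Addr vpn = vaddr >> shift;
        for (const TlbEntry &e : entries) {
            if (e.valid && e.vpn == vpn) {
                const Addr offset = vaddr & ((Addr{1} << shift) - 1);
                paddr = (e.ppn << shift) | offset;
                return true;
            }
        }
        return false;
    }

    // Fill a free slot if there is one; otherwise ask the ISA which entry to evict.
    void insert(Addr vpn, Addr ppn) {
        for (TlbEntry &e : entries) {
            if (!e.valid) { e = TlbEntry{vpn, ppn, true}; return; }
        }
        entries[policy.pickVictim(entries)] = TlbEntry{vpn, ppn, true};
    }

  private:
    std::vector<TlbEntry> entries;   // value-initialized, so all start invalid
    const IsaTlbPolicy &policy;
};

// Toy policy used only for this example: 4 KiB pages, always evict slot 0.
class ToyPolicy : public IsaTlbPolicy {
  public:
    unsigned pageShift() const override { return 12; }
    std::size_t pickVictim(const std::vector<TlbEntry> &) const override { return 0; }
};
```

The point of the shape is that BaseTlb never needs to be subclassed per ISA; swapping ToyPolicy for another policy object changes the decisions without touching the mechanism.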

Another important distinction here is that the state which controls the policies of the ISA, most of the MiscRegs, is part of the ISA, but the other registers are not. The policy directing how those registers are used, namely how they're indexed, is, but the actual registers themselves are not. Also, the way they're currently indexed, as integer/floating point/misc, should really be set up and managed by the ISA and not the CPU. The CPU would probably want to know what type a register is so it can have its own policies layered on top, like for instance separate pools of resources for ints and floats, but it shouldn't force the ISA to glob those together. Even that might be too much, because at least for x86 there are registers which can be both floating point and integer depending on the instruction that uses them. So really, the CPU should just expect some number of local storage banks. In x86, those could correspond to the general (ha) purpose registers, x87/mmx registers, xmm registers, and control register backing storage. The control register backing storage could be broken down further into the CR control registers, debug registers, performance counter registers, the LAPIC control registers, the MSRs, etc, etc. Like now, the CPU would be responsible for providing the ISA the illusion that that was how things worked, but the o3, for example, could keep track of the actual storage however it liked.
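
As a rough C++ sketch of the "some number of local storage banks" idea: the ISA hands the CPU a list of bank descriptions, and the CPU stores values by (bank, register) without knowing whether a bank is integer, floating point, or control state. BankDesc, BankedRegFile and toyX86Banks are hypothetical names, the bank sizes are illustrative, and every register is held as 64 bits purely to keep the example short (real xmm registers are wider).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one bank, supplied by the ISA.
struct BankDesc {
    std::string name;
    std::size_t numRegs;
};

// CPU-side storage: it knows only "bank b, register r", not int vs. float.
class BankedRegFile {
  public:
    explicit BankedRegFile(const std::vector<BankDesc> &descs) {
        assert(!descs.empty());
        for (const BankDesc &d : descs)
            banks.push_back(std::vector<std::uint64_t>(d.numRegs, 0));
    }

    // at() throws std::out_of_range on a bad bank or register index.
    std::uint64_t read(std::size_t bank, std::size_t reg) const {
        return banks.at(bank).at(reg);
    }
    void write(std::size_t bank, std::size_t reg, std::uint64_t value) {
        banks.at(bank).at(reg) = value;
    }
    std::size_t numBanks() const { return banks.size(); }

  private:
    std::vector<std::vector<std::uint64_t>> banks;
};

// Illustrative x86-like layout following the banks named in the text.
inline std::vector<BankDesc> toyX86Banks() {
    return { {"gpr", 16}, {"x87_mmx", 8}, {"xmm", 16}, {"control", 32} };
}
```

A model like o3 could keep the same interface and back it with a renamed physical register pool instead of plain vectors.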

Then, the ISA-defined control register file(s) really are just register file(s) which have their accesses intercepted and acted on in some way. They could conceptually live in the TLB, Interrupt object, whatever, with the CPU actually keeping track of the "state", aka what a read would return. The objects the ISA controls would be able to keep their own local "cached" versions of the control state in whatever way was convenient, like for instance as a lookup table for register windows. The objects would need to be able to "refresh" and reread the underlying control state to update their caching data structures in cases where the underlying storage was updated directly, like what "NoEffect" does right now. This takes care of some odd circumstances where you can have trouble bootstrapping the control state in, for instance, SE mode. You have to make sure everything actually uses the control state you're writing, but it has to do that when not all of the state has actually been set. Really, all that needs to be "refreshed" is the control state which is part of the ISA, not of the objects in the CPU. In that way, you don't have to refresh the TLBs, blah blah. You only have to make the ISA configure itself based on the (to the CPU) generic array of values stored in some of the register files.
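
Here is a small C++ sketch of the cache-plus-refresh idea, under stated assumptions. ControlStorage, WindowMapper, setMiscReg and setMiscRegNoEffect are names invented for this example (they only loosely echo the existing "NoEffect" accessors), and the window layout is a toy with non-overlapping 16-register windows, not SPARC's real scheme.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using MiscReg = std::uint64_t;

// Hypothetical register indices for a toy window scheme.
enum MiscRegIndex : std::size_t { REG_CWP = 0, REG_NWINDOWS = 1, NUM_MISC_REGS = 2 };

// To the CPU this is just a generic array of values.
struct ControlStorage {
    std::array<MiscReg, NUM_MISC_REGS> regs{};
};

// ISA-side object that caches a value derived from the control state.
class WindowMapper {
  public:
    explicit WindowMapper(const ControlStorage &s) : storage(s) { refresh(); }

    // Re-read the backing storage and rebuild the cached window base.
    void refresh() {
        const MiscReg n = storage.regs[REG_NWINDOWS];
        const MiscReg window = (n == 0) ? 0 : storage.regs[REG_CWP] % n;
        cachedBase = static_cast<std::size_t>(window) * REGS_PER_WINDOW;
    }

    // Fast path: uses only the cached base, never the control registers.
    std::size_t flatten(std::size_t windowedIndex) const {
        assert(windowedIndex < REGS_PER_WINDOW);
        return cachedBase + windowedIndex;
    }

  private:
    static constexpr std::size_t REGS_PER_WINDOW = 16;
    const ControlStorage &storage;
    std::size_t cachedBase = 0;
};

// Write with side effects: update storage, then let the mapper re-derive its cache.
inline void setMiscReg(ControlStorage &s, WindowMapper &m, std::size_t idx, MiscReg v) {
    s.regs[idx] = v;
    m.refresh();
}

// Storage-only write: the cache is stale until someone calls refresh().
inline void setMiscRegNoEffect(ControlStorage &s, std::size_t idx, MiscReg v) {
    s.regs[idx] = v;
}
```

For bootstrapping in something like SE mode, the storage-only writes can be done in any order and followed by a single refresh() once all of the state is in place.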

There also would be objects with state and policy which are dynamically generated like faults and instructions. The faults are policy with maybe a little bubble of controlling state in them, and the instructions are policy (how to execute) strapped onto generic objects.

So now, all the policies and state particular to the ISA have been localized into one place, an abstract concept called the "ISA". That might as well be an object since it has methods and fields just like an object would. Now that the ISA is an object, which might as well be a SimObject, it can be plugged into a CPU semi-arbitrarily to make that CPU do x86 or Alpha or whatever, and you get heterogeneous CPUs. In reality a lot of other stuff would need some tweaking, but I don't think that much.

At least one downside of looking at this stuff this way is performance. The boundary between ISA and non-ISA code would be crossed a lot more frequently, and if the ISA isn't compiled in, which would imply virtual function calls, there would be a lot of overhead from that. There would be a performance boost, though, because some parts of the ISAs which are inefficient, like the predecoder in x86 constantly figuring out what the operand/address/stack sizes and mode are and SPARC deciding how to flatten register indexes, would be much more efficient. Also, the implicit downside is that this would uproot a LOT of code and take a long time to get back to where things are now.


Gabe

Gabe Black wrote:

I was thinking about this more, and, bus frequency aside, it seems like the LAPIC is basically exactly what that interrupts object I made in the CPU is for, namely managing interrupts and feeding them one at a time on request and according to the ISA's policy to the CPU. At the moment that's just a C++ class for the purpose of abstraction if I remember right, but it might be good to promote it to a full SimObject. That limits people's ability to put it in weird places, but they'd probably want to do that with the IO APIC (the one the devices actually talk to) anyway. This also touches on something I've been thinking about, namely that control registers might be more closely associated with particular structures like the TLB or the interrupts object or the register flattening than with the CPU itself or controlling instructions. I was thinking it might be a good idea to set up some sort of callback interface on the MiscRegFile so those "devices" can keep track of those values more with an interrupt model than a polling model, which would probably perform much better. The actual storage could be moved over to them or not, that's more of an implementation detail, but either way it would immediately take care of the performance problems with SPARC's register window handling stuff.
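
One possible shape for such a callback interface, as a hedged C++ sketch rather than a proposal for the actual MiscRegFile API. CallbackMiscRegFile, subscribe and ToyInterrupts are hypothetical names, and the "priority threshold" register is just a stand-in for whatever control value a device cares about.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using MiscReg = std::uint64_t;

// Register file that pushes updates to interested objects instead of being polled.
class CallbackMiscRegFile {
  public:
    using Callback = std::function<void(std::size_t idx, MiscReg newVal)>;

    explicit CallbackMiscRegFile(std::size_t numRegs) : regs(numRegs, 0) {}

    // Register interest in writes to a single control register.
    void subscribe(std::size_t idx, Callback cb) {
        assert(idx < regs.size());
        listeners[idx].push_back(std::move(cb));
    }

    MiscReg read(std::size_t idx) const {
        assert(idx < regs.size());
        return regs[idx];
    }

    // Store the value, then notify only the subscribers of that register.
    void write(std::size_t idx, MiscReg val) {
        assert(idx < regs.size());
        regs[idx] = val;
        auto it = listeners.find(idx);
        if (it == listeners.end())
            return;
        for (const Callback &cb : it->second)
            cb(idx, val);
    }

  private:
    std::vector<MiscReg> regs;
    std::unordered_map<std::size_t, std::vector<Callback>> listeners;
};

// Hypothetical consumer: keeps its own copy of a threshold it would otherwise
// have to re-read from the register file on every interrupt check.
struct ToyInterrupts {
    MiscReg priorityThreshold = 0;
    bool accepts(MiscReg priority) const { return priority > priorityThreshold; }
};
```

A device would subscribe once at construction, for example with a lambda that copies the new value into its own field, and then its hot path never touches the register file.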
Gabe

Steve Reinhardt wrote:
On Thu, May 22, 2008 at 11:45 AM, Gabe Black <[EMAIL PROTECTED]> wrote:
    One problem is that this isn't a parent/child relationship. For
    instance, there could be four CPUs and four local APICs all on the
    same bus, and they need to know which one goes with which CPU. In
    that case it would be arbitrary. [...] Also, all the local APICs
    appear at the same address but are only visible to their CPU. I
    was going to try to deal with that by sticking them into their own
    address space multiplexed on some address bits, but I don't like
    the scalability of that.
Yea, that's the kind of thing I meant by "fiddling with address mappings". I wouldn't disagree that it's a bit hackish, but it seems plenty scalable to me. If you do that it does solve the "all the APICs on one local bus" problem too.
    Another idea is a bus local to the CPU which has the CPU and APIC
    on it. That would have to connect to the external interconnect
    somehow. That could also be a home for the page table walker and
    TLB eventually if those go "off chip" and into the memory system.
The tricky part here is doing that without adding latency to non-APIC memory accesses (and appropriately forwarding flow-control state through from the cache, etc.). It's probably doable but it would be nicer to have a more general solution that didn't require this (IMO).
    If the interconnect isn't a bus, then you can't just stick the
    APICs anywhere since they're supposed to be tightly integrated on
    die. You'd need to make sure they ended up very close to their
    CPUs, and having that over a network hop would be substantially
    unrealistic.
Don't forget that part of the point of m5 is to be able to model things that are "substantially unrealistic", like putting a PCI NIC next to the L2 cache :-). I think a solution that lets you put the APIC wherever you want is a good idea, even if most of those places are obviously stupid.
    I don't think it matters much whether the CPU can check the
    interconnect frequency, although it probably can through some
    mechanism somewhere like CPUID or the stuff the processor sticks
    into PCI config space according to the AMD kernel/BIOS developers
    guide. The timer is being set up and used for something, and I
    would imagine whatever is using it might break if the frequency is
    way off. Then again maybe not. I'd rather make it do what it's
    supposed to do than to try to figure out and fix the problems it
    may cause later.
I think the key is separating the stuff that needs to be a certain way to get things to work from the stuff that is just done a certain way just because that's easy/traditional/whatever. So clearly the APIC timer needs to run at a frequency within some range, such that the CPU can figure out what that frequency is, but I'm guessing that as long as that frequency is reasonable it doesn't matter exactly what its relationship is to the interconnect. So just having a parameter that lets you set the frequency sounds fine to me; if in the longer term we find that there really is a need to have it tied to the interconnect frequency we can always set that up in the Python.
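
To make the "just a parameter" option concrete, here is a rough C++ sketch of a timer whose rate comes from a configured frequency and not from the bus. ApicTimerParams, ticksPerCount and expiryTick are hypothetical names, and the 1 ps tick resolution is an assumption made for the arithmetic, not a statement about how the simulator is configured.

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;

// Assumption for this sketch: one simulator tick is 1 ps.
constexpr Tick TICKS_PER_SECOND = 1000000000000ULL;

// Hypothetical parameters; the frequency is set directly, not derived from a bus.
struct ApicTimerParams {
    std::uint64_t frequencyHz;  // input clock of the timer
    std::uint32_t divider;      // divide value applied to that clock
};

// Simulator ticks between successive timer decrements.
// Integer division truncates, so frequencies that don't divide 1e12 lose precision.
inline Tick ticksPerCount(const ApicTimerParams &p) {
    assert(p.frequencyHz > 0 && p.divider > 0);
    return (TICKS_PER_SECOND / p.frequencyHz) * p.divider;
}

// Tick at which a countdown started at 'now' from 'initialCount' reaches zero.
inline Tick expiryTick(const ApicTimerParams &p, Tick now, std::uint32_t initialCount) {
    return now + ticksPerCount(p) * initialCount;
}
```

With a 100 MHz setting and a divider of 16, each count takes 160000 of these assumed ticks; tying the frequency to the interconnect later would only change where frequencyHz comes from.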
Steve

------------------------------------------------------------------------

_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev