Re: [m5-dev] local APIC timer and bus frequency

Gabe Black Sat, 24 May 2008 00:05:53 -0700

Oh, and one thing I forgot: registers can be like faults, where they're little islands of the ISA. They know how to translate indexes, and they could use bitunions, which, if it works (I don't remember if it does), could be inherited from (or inherit, with some modifications) to be able to pull out the bitfields easily while having methods and whatnot. I keep thinking about how to get the parser to figure out inputs and outputs better, but really I'm thinking that might be better done explicitly and then propagated using the formats. If the parser goes from treating the ISA description as input to having the same functionality as a Python library that goes into x86.py (for instance), which I'd like to do someday as well, it could be based on Python classes and inheritance. I think generally if you have a class of instructions like IntOps, you know there will be Ra and Rb as inputs and Rc as the output all the time, and if not you did something wrong. Anyway, that's an entirely different discussion.
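
A minimal sketch of what explicit operand declarations propagated through Python inheritance might look like. Everything here (Inst, IntOp, the inputs/outputs attributes, the operands helper) is hypothetical and only illustrates the idea; it is not the ISA parser's actual interface.

```python
class Inst:
    """Hypothetical base class: operands are declared, not inferred."""
    inputs = ()
    outputs = ()


class IntOp(Inst):
    # Every integer op reads Ra and Rb and writes Rc.
    inputs = ("Ra", "Rb")
    outputs = ("Rc",)


class Add(IntOp):
    code = "Rc = Ra + Rb"


class Sub(IntOp):
    code = "Rc = Ra - Rb"


def operands(inst_cls):
    """Return the operand declaration an instruction class inherited."""
    return {"inputs": inst_cls.inputs, "outputs": inst_cls.outputs}
```

Add and Sub never mention their operands; they pick them up from IntOp, which is the "declared once per class of instructions" behavior described above.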


Gabe


Gabe Black wrote:

For now, I'm going to make the miscregfile have the event and cause an interrupt when the timer goes off. This really sounds crappy to me, but otherwise I'd have to add new functions to get at the interrupt object, like there are for the TLB. That wouldn't be hard, but I wanted to point out I'd be adding a new function all over the place in the CPU before I went and did it, in case that doesn't sound like a great idea. I personally think there are too many functions in too many places in the CPU, which is probably unavoidable, but I'd prefer not to add another one. Also, while I was deciding what to do about the local APIC, I tried to boil down what exactly an ISA is in M5, what it should and shouldn't be doing, and if that suggested anything we should be doing differently. That's basically what follows, so if you don't care go ahead and stop reading.
Basically, the ISA is a set of policies and a collection of state which directs them. The policies are layered on top of and direct the mechanisms of the CPU. This is different from how I think the ISAs have been built up to this point, which was as a collection of objects which implemented the ISA's policies and which were plugged into the CPU. Essentially, the difference is that there would be a base TLB class which would contain the behavior of all TLBs (they translate addresses, they have entries, they evict things, blah blah), and the ISA would describe the policies it operated under and the state (the entries and control registers) that guided it. The TLB would call into the ISA to make decisions rather than the ISA defining a TLB which would automatically make the "right" decisions.
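
To make the mechanism/policy split concrete, here is a rough sketch of a generic TLB that calls into an ISA object for its decisions. The names (BaseTLB, ToyISA, pick_victim, handle_miss) are made up for illustration and are not existing M5 classes.

```python
class BaseTLB:
    """Generic mechanism: holds entries, translates, evicts."""

    def __init__(self, isa, size):
        self.isa = isa
        self.size = size
        self.entries = {}  # vpn -> ppn, kept in insertion order

    def insert(self, vpn, ppn):
        if vpn not in self.entries and len(self.entries) >= self.size:
            # Policy decision: the ISA picks which entry goes.
            victim = self.isa.pick_victim(list(self.entries))
            del self.entries[victim]
        self.entries[vpn] = ppn

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, self.isa.page_size)
        if vpn not in self.entries:
            # Policy decision: the ISA decides what a miss means.
            return self.isa.handle_miss(vpn)
        return self.entries[vpn] * self.isa.page_size + offset


class ToyISA:
    """Hypothetical ISA object: only policy and controlling state."""

    page_size = 4096

    def pick_victim(self, vpns):
        return vpns[0]  # evict the oldest entry

    def handle_miss(self, vpn):
        raise LookupError("TLB miss on page %#x" % vpn)
```

A different ISA would reuse BaseTLB unchanged and only swap in its own page size, replacement choice, and miss behavior.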
Another important distinction here is that the state which controls the policies of the ISA, most of the MiscRegs, is part of the ISA, but the other registers are not. The policy directing how those registers are used, namely how they're indexed, is, but the actual registers themselves are not. Also, the way they're currently indexed, as integer/floating point/misc, should really be set up and managed by the ISA and not the CPU. The CPU would probably want to know what type a register is so it can have its own policies layered on top, like for instance separate pools of resources for ints and floats, but it shouldn't force the ISA to glob those together. Even that might be too much, because at least for x86 there are registers which can be both floating point and integer depending on the instruction that uses them. So really, the CPU should just expect some number of local storage banks. In x86, those could correspond to the general (ha) purpose registers, x87/mmx registers, xmm registers, and control register backing storage. The control register backing storage could be broken down further into the CR control registers, debug registers, performance counter registers, the LAPIC control registers, the MSRs, etc. Like now, the CPU would be responsible for providing the ISA the illusion that that was how things worked, but the o3, for example, could keep track of the actual storage however it liked.
Then, the ISA-defined control register file(s) really are just register file(s) which have their accesses intercepted and acted on in some way. They could conceptually live in the TLB, Interrupt object, whatever, with the CPU actually keeping track of the "state", aka what a read would return. The objects the ISA controls would be able to keep their own local "cached" versions of the control state in whatever way was convenient, like for instance as a lookup table for register windows. The objects would need to be able to "refresh" and reread the underlying control state to update their caching data structures in cases where the underlying storage was updated directly, like what "NoEffect" does right now. This takes care of some odd circumstances where you can have trouble bootstrapping the control state in, for instance, SE mode. You have to make sure everything actually uses the control state you're writing, but it has to do that when not all of the state has actually been set. Really, all that needs to be "refreshed" is the control state which is part of the ISA, not of the objects in the CPU. In that way, you don't have to refresh the TLBs, blah blah. You only have to make the ISA configure itself based on the (to the CPU) generic array of values stored in some of the register files.
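
A toy sketch of the cache-and-refresh idea, assuming a made-up ControlRegs store and a heavily simplified window scheme (this is not SPARC's real register windowing, and none of these names exist in M5):

```python
class ControlRegs:
    """Hypothetical backing storage: what a read would return."""

    def __init__(self):
        self.values = {}

    def read(self, name):
        return self.values.get(name, 0)

    def write_no_effect(self, name, value):
        # Updates storage directly; nobody is told about it.
        self.values[name] = value


class WindowFlattener:
    """Keeps a cached value derived from the control state."""

    NWINDOWS = 8
    REGS_PER_WINDOW = 16

    def __init__(self, regs):
        self.regs = regs
        self.refresh()

    def refresh(self):
        # Reread the control state and rebuild the cached base index.
        cwp = self.regs.read("CWP") % self.NWINDOWS
        self.base = cwp * self.REGS_PER_WINDOW

    def flatten(self, idx):
        return self.base + idx
```

After a direct write the flattener keeps using its stale cached base until refresh() is called, which is the case the "refresh" hook is meant to cover.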
There also would be objects with state and policy which are dynamically generated, like faults and instructions. The faults are policy with maybe a little bubble of controlling state in them, and the instructions are policy (how to execute) strapped onto generic objects.
So now, all the policies and state particular to the ISA have been localized into one place, an abstract concept called the "ISA". That might as well be an object since it has methods and fields just like an object would. Now that the ISA is an object, which might as well be a SimObject, it can be plugged into a CPU semi-arbitrarily to make that CPU do x86 or Alpha or whatever, and you get heterogeneous CPUs. In reality a lot of other stuff would need some tweaking, but I don't think that much.
At least one downside of looking at this stuff this way is performance. The boundary between ISA and not-ISA would be crossed a lot more frequently, and if the ISA isn't compiled in, which would imply virtual function calls, there would be a lot of overhead from that. There would be a performance boost, though, because some parts of the ISAs which are inefficient, like the predecoder in x86 constantly figuring out what the operand/address/stack sizes and mode are and SPARC deciding how to flatten register indexes, would be much more efficient. Also, the implicit downside is that that would uproot a LOT of code and take a long time to get back to where things are now.
Gabe

Gabe Black wrote:
I was thinking about this more, and, bus frequency aside, it seems like the LAPIC is basically exactly what that interrupts object I made in the CPU is for, namely managing interrupts and feeding them one at a time on request and according to the ISA's policy to the CPU. At the moment that's just a C++ class for the purpose of abstraction if I remember right, but it might be good to promote it to a full SimObject. That limits people's ability to put it in weird places, but they'd probably want to do that with the IO APIC (the one the devices actually talk to) anyway. This also touches on something I've been thinking about, namely that control registers might be more closely associated with particular structures like the TLB or the interrupts object or the register flattening than with the CPU itself or controlling instructions. I was thinking it might be a good idea to set up some sort of callback interface on the MiscRegFile so those "devices" can keep track of those values more with an interrupt model than a polling model, which would probably perform much better. The actual storage could be moved over to them or not, that's more of an implementation detail, but either way it would immediately take care of the performance problems with SPARC's register window handling stuff.
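
A rough sketch of the callback idea, in Python for brevity. The subscribe method, the register name "TPR", and the Interrupts class are all assumptions for illustration; the real MiscRegFile has no such interface.

```python
from collections import defaultdict


class MiscRegFile:
    """Hypothetical register file that notifies interested objects."""

    def __init__(self):
        self._values = {}
        self._listeners = defaultdict(list)

    def subscribe(self, reg, callback):
        self._listeners[reg].append(callback)

    def read(self, reg):
        return self._values.get(reg, 0)

    def write(self, reg, value):
        self._values[reg] = value
        # Push the new value out instead of making devices poll for it.
        for callback in self._listeners[reg]:
            callback(reg, value)


class Interrupts:
    """Keeps its own copy of the one value it cares about."""

    def __init__(self, regs):
        self.task_priority = 0
        regs.subscribe("TPR", self._on_tpr)

    def _on_tpr(self, reg, value):
        self.task_priority = value & 0xFF
```

The interrupts object never reads the register file on its fast path; it only reacts when the value changes.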
Gabe

Steve Reinhardt wrote:
On Thu, May 22, 2008 at 11:45 AM, Gabe Black <[EMAIL PROTECTED]> wrote:
    One problem is that this isn't a parent/child relationship. For
    instance, there could be four CPUs and four local APICs all on the
    same bus, and they need to know which one goes with which CPU. In
    that case it would be arbitrary. [...] Also, all the local APICs
    appear at the same address but are only visible to their CPU. I
    was going to try to deal with that by sticking them into their own
    address space multiplexed on some address bits, but I don't like
    the scalability of that.
Yea, that's the kind of thing I meant by "fiddling with address mappings". I wouldn't disagree that it's a bit hackish, but it seems plenty scalable to me. If you do that it does solve the "all the APICs on one local bus" problem too.
    Another idea is a bus local to the CPU which has the CPU and APIC
    on it. That would have to connect to the external interconnect
    somehow. That could also be a home for the page table walker and
    tlb eventually if those go "off chip" and into the memory system.
The tricky part here is doing that without adding latency to non-APIC memory accesses (and appropriately forwarding flow-control state through from the cache, etc.). It's probably doable but it would be nicer to have a more general solution that didn't require this (IMO).
    If the interconnect isn't a bus, then you can't just stick the
    APICs anywhere since they're supposed to be tightly integrated on
    die. You'd need to make sure they ended up very close to their
    CPUs, and having that over a network hop would be substantially
    unrealistic.
Don't forget that part of the point of m5 is to be able to model things that are "substantially unrealistic", like putting a PCI NIC next to the L2 cache :-). I think a solution that lets you put the APIC wherever you want is a good idea, even if most of those places are obviously stupid.
    I don't think it matters much whether the CPU can check the
    interconnect frequency, although it probably can through some
    mechanism somewhere like CPUID or the stuff the processor sticks
    into PCI config space according to the AMD kernel/BIOS developers
    guide. The timer is being set up and used for something, and I
    would imagine whatever is using it might break if the frequency is
    way off. Then again maybe not. I'd rather make it do what it's
    supposed to do than to try to figure out and fix the problems it
    may cause later.
I think the key is separating the stuff that needs to be a certain way to get things to work from the stuff that is just done a certain way just because that's easy/traditional/whatever. So clearly the APIC timer needs to run at a frequency within some range, such that the CPU can figure out what that frequency is, but I'm guessing that as long as that frequency is reasonable it doesn't matter exactly what its relationship is to the interconnect. So just having a parameter that lets you set the frequency sounds fine to me; if in the longer term we find that there really is a need to have it tied to the interconnect frequency we can always set that up in the Python.
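
As a sketch of the "just a parameter" option: the class and parameter names below are hypothetical, and the tick rate of one tick per picosecond is an assumption made only so the arithmetic has concrete numbers.

```python
TICKS_PER_SECOND = 10**12  # assumption: one simulator tick is 1 ps


class LocalApicTimer:
    """Hypothetical timer whose frequency is a plain parameter."""

    def __init__(self, frequency_hz):
        # Simulator ticks per timer clock, independent of any bus clock.
        self.period_ticks = TICKS_PER_SECOND // frequency_hz

    def ticks_until_expiry(self, initial_count, divide=1):
        # The count drops by one every `divide` timer clocks.
        return initial_count * divide * self.period_ticks
```

Tying the timer to the interconnect later would only mean computing frequency_hz from the bus clock in the configuration script instead of passing a constant.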
Steve
------------------------------------------------------------------------
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev