To make it easier to have a discussion about this proposed API, I copied it into a document over here:
https://docs.google.com/document/d/1GBpHjWtjH_AowGqb8pAk9dhWtMdukMALoXmlzwpPgOM/edit?usp=sharing

Please take a look when you have a chance.

Gabe

On Mon, Mar 18, 2019 at 6:51 PM Gabe Black <gabebl...@google.com> wrote:

> Hello again folks. I'm tempted to write up a doc which describes my
> thoughts on this, but for the sake of expediency I'm just going to send a
> "quick" email for now. (edit: This got pretty long. Maybe I should have put
> it in a doc. Too late now, maybe later?)
>
> TLM DMI socket API:
>
> When a TLM initiator sends a transaction which eventually gets serviced by
> a target, there is an attribute in the generic payload which says whether
> or not the value accessed *could* have been accessed by DMI if that had
> been requested. It seems to me that at least for some uses of this
> mechanism, this is a gating mechanism. If the other guy doesn't say yes to
> this, then the initiator won't bother to ask with heavier weight mechanisms.
>
> Then there's a get_direct_mem_ptr method which also accepts a transaction,
> and effectively returns whether there's a DMI-able region which corresponds
> to that transaction and a descriptor of that region.
>
> Finally, there's an invalidate_direct_mem_ptr method which basically
> broadcasts backwards to let anybody who may have a live DMI descriptor
> know that some region is going away, and that they should throw away their
> descriptor. Unlike get_direct_mem_ptr, which travels like an access with
> one destination, this propagates back to potentially multiple initiators
> who could have retrieved a DMI descriptor at some point.
>
>
> Proposed gem5 mechanism:
>
> There would be a new pair of protocol functions added to the master/slave
> memory port pairs, sendAtomicBackdoor/recvAtomicBackdoor.
> Only the atomic mechanism is being extended to support backdoor accesses,
> at least for right now, because you'd likely use the atomic mode and
> DMI/backdoors under the same circumstances, i.e. when you want to go fast
> and can sacrifice accuracy. The sendAtomicBackdoor function would just call
> the recvAtomicBackdoor function on its peer. The recvAtomicBackdoor
> function would be virtual, and its default implementation would be to call
> vanilla recvAtomic.
>
> The new functions are the same as regular sendAtomic/recvAtomic, except
> that they also take a reference to a pointer to a memory backdoor
> descriptor. I use the term backdoor instead of re-using DMI to avoid them
> being seen as equivalent but confusingly different, and because the term
> backdoor is already being used in gem5. The exact name is up for debate.
> If the target which services the atomic request could support backdoor
> accesses, and no link in the chain would stop working if it was
> circumvented (as a cache would), then that pointer is set to point to a
> backdoor memory descriptor. If not, then it's left pointing at nullptr.
>
> The backdoor descriptor itself has a few basic properties (still to be
> nailed down) which essentially correspond to the DMI ones, i.e. start and
> end address, pointer to the data, and access privileges. It also has a list
> of callbacks for everyone that has a copy of the descriptor returned one
> way or the other. These are registered by the caller when they successfully
> receive the descriptor, a step they can skip if they're, for instance, just
> checking if there is one but not actually storing it.
>
> Optionally, there could also be a getMemoryBackdoors function which would
> accept a start and end address (or AddrRange) and collect a vector of
> backdoor descriptors for the whole system. This could be used when setting
> up, for instance, KVM, so that you don't have to manually try to figure out
> where the memory is to add into the KVM descriptor.
> This is largely separate from/an extension of the other mechanism and can
> be added later.
>
> If a backdoor needs to be invalidated, the owner of the backdoor just
> needs to go through and call all the callbacks, and when it's done it can
> throw away the backdoor.
>
>
> Translation between the two:
>
> TLM -> gem5
>
> When a transaction comes in from TLM, the gem5 side of the bridge would
> call sendAtomicBackdoor. That will trickle through to somebody on the gem5
> side which will reply, and optionally pass back a pointer to the backdoor
> descriptor. The assumption is that if something is backdoor-able, then it
> will have already set up that descriptor and will just hand out pointers to
> it as necessary. If the backdoor pointer is set, then the dmi_allowed
> attribute will be set. Otherwise the bridge acts as before.
>
> When get_direct_mem_ptr is called, the bridge will also call
> sendAtomicBackdoor, but in that case it will set the NO_ACCESS flag in the
> request it creates (or similar) to indicate that the packet shouldn't do
> anything when it gets where it's going. When the result comes back, the
> backdoor descriptor is used to set properties in the DMI object which is
> returned to the caller, or false is returned if no backdoor is found. The
> bridge will keep the backdoor around and will install a callback in it
> which will prompt it to call invalidate_direct_mem_ptr if the backdoor is
> invalidated.
>
> gem5 -> TLM
>
> When a gem5 request comes in on recvAtomicBackdoor, the bridge will
> potentially make two calls. First it will use the normal transmission
> mechanism to perform the access. If the target indicates DMI is possible,
> then gem5 will use get_direct_mem_ptr to get a DMI data blob, which it will
> use to construct a backdoor descriptor. It will store that descriptor and
> point the backdoor pointer at it. Either way, recvAtomic will then return.
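
The proposed gem5 mechanism described above can be sketched roughly as
follows. This is only a minimal illustration under stated assumptions, not
gem5's actual API: the names sendAtomicBackdoor, recvAtomicBackdoor and
recvAtomic come from the proposal, while MemBackdoor and its members, Packet,
Tick, and the *Sketch classes are hypothetical stand-ins invented here. It
shows the default fall-through to recvAtomic, a target handing out a pointer
to a descriptor it already owns, and the owner invalidating the backdoor by
calling every registered callback.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Addr = std::uint64_t;
using Tick = std::uint64_t;
struct Packet { Addr addr; };  // hypothetical stand-in for a gem5 packet

// Hypothetical descriptor: address range, host pointer, access flags, and
// callbacks registered by everyone who keeps a copy of the pointer.
class MemBackdoor
{
  public:
    enum Flags : unsigned { Readable = 1, Writeable = 2 };
    using Callback = std::function<void(const MemBackdoor &)>;

    MemBackdoor(Addr start, Addr end, std::uint8_t *ptr, unsigned flags)
        : _start(start), _end(end), _ptr(ptr), _flags(flags)
    {
        assert(start <= end);  // end is inclusive in this sketch
    }

    Addr start() const { return _start; }
    Addr end() const { return _end; }
    std::uint8_t *ptr() const { return _ptr; }
    bool readable() const { return (_flags & Readable) != 0; }
    bool writeable() const { return (_flags & Writeable) != 0; }

    void addInvalidationCallback(Callback cb)
    {
        callbacks.push_back(std::move(cb));
    }

    // The owner calls this before throwing the backdoor away.
    void invalidate()
    {
        std::vector<Callback> pending;
        pending.swap(callbacks);  // each callback fires at most once
        for (auto &cb : pending)
            cb(*this);
    }

  private:
    Addr _start, _end;
    std::uint8_t *_ptr;
    unsigned _flags;
    std::vector<Callback> callbacks;
};

class SlavePortSketch
{
  public:
    virtual ~SlavePortSketch() = default;
    virtual Tick recvAtomic(Packet &pkt) = 0;

    // Default: behave like recvAtomic and leave the pointer untouched.
    virtual Tick recvAtomicBackdoor(Packet &pkt, MemBackdoor *&)
    {
        return recvAtomic(pkt);
    }
};

class MasterPortSketch
{
  public:
    explicit MasterPortSketch(SlavePortSketch &p) : peer(p) {}

    // Just forwards to the peer, as in the proposal.
    Tick sendAtomicBackdoor(Packet &pkt, MemBackdoor *&backdoor)
    {
        return peer.recvAtomicBackdoor(pkt, backdoor);
    }

  private:
    SlavePortSketch &peer;
};

// A backdoor-aware target; size must be nonzero in this sketch.
class SimpleMemSketch : public SlavePortSketch
{
  public:
    SimpleMemSketch(Addr base, std::size_t size)
        : storage(size),
          backdoor(base, base + size - 1, storage.data(),
                   MemBackdoor::Readable | MemBackdoor::Writeable)
    {}

    Tick recvAtomic(Packet &) override { return 10; }  // made-up latency

    Tick recvAtomicBackdoor(Packet &pkt, MemBackdoor *&bd) override
    {
        bd = &backdoor;  // hand out a pointer to the existing descriptor
        return recvAtomic(pkt);
    }

    void invalidateBackdoor() { backdoor.invalidate(); }

  private:
    std::vector<std::uint8_t> storage;  // declared first: backdoor uses it
    MemBackdoor backdoor;
};

// A target that knows nothing about backdoors.
class PlainDeviceSketch : public SlavePortSketch
{
  public:
    Tick recvAtomic(Packet &) override { return 5; }
};
```

A caller would pass in a MemBackdoor pointer initialized to nullptr, check it
after sendAtomicBackdoor returns, and register a callback only if it intends
to keep the pointer. Because the default recvAtomicBackdoor never touches the
pointer, a component that is unaware of backdoors (PlainDeviceSketch here)
falls back to plain atomic behavior without any extra code.
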
>
> Alternatively, if there's already a backdoor which covers that access for
> some reason, the bridge could just take advantage of it to do the access
> without forwarding anything on.
>
> When a call to invalidate_direct_mem_ptr comes in, the bridge will look
> through the backdoors it's accumulated. Each one that overlaps at all
> will be invalidated using the procedure described above, calling the
> callbacks and then deleting the backdoor.
>
>
> Benefits:
>
> I think there are a couple of performance benefits to doing things this
> way. First, at least on the gem5 side, there aren't a lot of temporary
> objects being set up and initialized and/or copied around. The initiator of
> an access doesn't have to check if a DMI would have been possible if it
> cares. If a memory system participant doesn't care or know about DMI or the
> new mechanism, everything automatically and transparently falls back to the
> non-backdoor-aware mechanisms.
>
>
> Downsides:
>
> Backdoors are not automatically transported by interconnects like normal
> bridges and buses, so some porting work is necessary for the mechanism to
> be useful. Expansion of the memory port API. Backdoor callbacks can't
> gracefully disappear before the backdoor they track goes away, although
> they could have a flag set which makes them do nothing and go away
> harmlessly.
>
> Gabe
>
> On Thu, Mar 14, 2019 at 4:18 PM Gabe Black <gabebl...@google.com> wrote:
>
>> I think gem5 would benefit from it for the same reason SystemC
>> simulations do, namely speeding up simulations when doing fast forwarding
>> (perhaps with a binary translating CPU... hypothetically...). It would also
>> be very nice to enable software development for not-yet-existing hardware
>> that has gem5 models available. Then gem5 users could do both software
>> development and performance evaluation in parallel and only have to build
>> models once.
>> This is very nice when large bodies of software need to be written to
>> support a bit of hardware, for instance if they have large complex
>> drivers, need application level support, etc. It would also be great not
>> to have to wait for hours for Android to boot to get to the interesting
>> part of a simulation or when debugging at a guest software level.
>>
>> Gabe
>>
>> On Thu, Mar 14, 2019 at 6:41 AM Dr. Matthias Jung <jun...@eit.uni-kl.de>
>> wrote:
>>
>>> Hi Gabe,
>>>
>>> one of the main reasons for DMI is to speed up simulations, similar to
>>> temporal decoupling in TLM LT or using the debug transport for boot
>>> loading. AFAIK DMI is mainly used in virtual platforms that target
>>> software development and not hardware architecture design space
>>> explorations, because you skip interconnects, caches etc. In commercial
>>> tools you can just switch DMI on or off and therefore you can have a
>>> nice trade-off between speed and accuracy by using the same models.
>>>
>>> Since gem5 is mainly there for computer-system architecture research,
>>> I’m not sure if the DMI feature is really required. From a TLM2
>>> perspective, even if a TLM target model included in gem5 offers DMI, the
>>> gem5 core model (initiator) does not have to use it, right? Do you have
>>> any concrete use case where you could exploit DMI?
>>>
>>> For KVM: maybe somebody with KVM experience should comment on that.
>>>
>>> Best regards,
>>> Matthias
>>>
>>> > On 28.02.2019 at 06:13, Gabe Black <gabebl...@google.com> wrote:
>>> >
>>> > Hi folks. TLM is a communication protocol/mechanism built on top of
>>> > SystemC. It supports a mechanism called DMI which stands for direct
>>> > memory interface. The idea is that an entity sending a request into
>>> > the system can ask if the target can give it a pointer it can use to
>>> > directly access that memory in the future.
>>> > The target, if it supports that sort of thing, returns a descriptor
>>> > which describes a region of memory that can be accessed in that way.
>>> > If that needs to be invalidated in the future, then there's another
>>> > mechanism the target can use to communicate back to the sender,
>>> > telling it to throw away that descriptor.
>>> >
>>> > The way this mechanism is implemented in TLM is a bit less than ideal,
>>> > since every request has a field that says whether the requester wants
>>> > to know about DMI. The target therefore has to perform an extra check
>>> > on all the requests in case someone is asking, even though that
>>> > information is useful to communicate only a very small fraction of the
>>> > time, perhaps only once during an entire simulation.
>>> >
>>> > Aside from that though, this mechanism has some nice properties.
>>> > First, it avoids having to globally identify what a memory is or where
>>> > it is for a particular simulation. A memory is just a thing on the
>>> > other end of a request that may let you get at it directly if you ask
>>> > nicely. Also, if there's something in the way that would get messed up
>>> > if you skipped over it, say a cache, it can block those requests from
>>> > getting through to targets. This could be useful for KVM for instance,
>>> > when it's collecting regions to act as RAM for the virtual machine.
>>> >
>>> > I haven't fully figured out a good way to avoid the check-every-time
>>> > problem of the SystemC mechanism, and ideally whatever I/we come up
>>> > with will be compatible enough to be bridged effectively, but I'm
>>> > thinking of some sort of explicit additional call like getAddrRanges
>>> > which would propagate through the hierarchy at specific points, either
>>> > to a specific address or as a broadcast.
>>> >
>>> > I know some folks have looked at gem5's memory system protocol and
>>> > SystemC's TLM before, for instance either to try making gem5 use TLM
>>> > natively, or for the SystemC TLM bridges. What do you think about
>>> > adding this sort of mechanism to gem5? Are there any pitfalls to
>>> > avoid, known issues to figure out, suggested avenues to explore, etc?
>>> > Please let me know. This is likely something I'm going to want to
>>> > pursue in the next few weeks.
>>> >
>>> > Gabe
>>> > _______________________________________________
>>> > gem5-dev mailing list
>>> > gem5-dev@gem5.org
>>> > http://m5sim.org/mailman/listinfo/gem5-dev