> On Nov. 6, 2015, 7:43 a.m., Nilay Vaish wrote:
> > src/cpu/trace/trace_cpu.cc, lines 1047-1048
> > <http://reviews.gem5.org/r/3029/diff/2/?file=51455#file51455line1047>
> >
> > I don't think this is correct.
>
> Stephan Diestelhorst wrote:
>     Radhika, maybe add a comment that nextExecute() will update currElement?

delta >= 0 && currElement.tick < last_tick is possible. I think the code
should be:

    assert(currElement.tick >= last_tick);
    delta = currElement.tick - last_tick;

- Nilay

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3029/#review7519
-----------------------------------------------------------

On Nov. 5, 2015, 9:08 p.m., Curtis Dunham wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3029/
> -----------------------------------------------------------
>
> (Updated Nov. 5, 2015, 9:08 p.m.)
>
>
> Review request for Default.
>
>
> Repository: gem5
>
>
> Description
> -------
>
> This patch defines a TraceCPU that replays a trace generated using the
> elastic trace probe attached to the O3 CPU model. The elastic trace is an
> execution trace annotated with data dependencies and ordering dependencies.
> The TraceCPU also replays the fixed-timestamp instruction fetch trace, which
> is likewise generated by the elastic trace probe.
>
> The TraceCPU inherits from BaseCPU, as a result of which some methods need
> to be defined. It has two port subclasses inherited from MasterPort for
> instruction and data ports. It issues the memory requests, deducing the
> timing from the trace and without performing real execution of micro-ops.
> As soon as the last dependency for an instruction is complete, its
> computational delay, also provided in the input trace, is added. The
> dependency-free nodes are maintained in a list, called 'ReadyList',
> ordered by ready time. Instructions that depend on a load stall until the
> responses for the read requests are received, thus achieving elastic replay.
> If the dependency is not found when adding a new node, it is assumed complete.
> Thus, if this node is found to be completely dependency-free, its issue time
> is calculated and it is added to the ready list immediately. This is
> encapsulated in the subclass ElasticDataGen.
>
> If ready nodes are issued in an unconstrained way, there can be more nodes
> outstanding than in the O3CPU, which results in divergence in timing.
> Therefore, the Trace CPU also models hardware resources. A sub-class to model
> hardware resources is added which contains the maximum sizes of the load
> buffer, store buffer and ROB. If resources are not available, the node is not
> issued. The 'depFreeQueue' structure holds nodes that are pending issue.
>
> The ROB size is arguably the most important of the resource limits modeled in
> the Trace CPU. The ROB occupancy is estimated using the newly added field
> 'robNum'. We need to use the ROB number because the sequence number is at
> times much higher due to squashing, and trace replay is focused on
> correct-path modeling.
>
> A map called 'inFlightNodes' is added to track nodes that are not only in
> the readyList but also load nodes that are executed (and thus removed from
> the readyList) but are not complete. The readyList determines which node to
> execute next and when, while inFlightNodes is used for resource modelling.
> The oldest ROB number is updated when any node occupies the ROB or when an
> entry in the ROB is released. The ROB occupancy is equal to the difference
> between the ROB number of the newly dependency-free node and the oldest ROB
> number in flight.
>
> If no node depends on a non load/store node, then there is no reason to track
> it in the dependency graph. We filter out such nodes but count them and add a
> weight field to the subsequent node that we do include in the trace. The
> weight field is used to model ROB occupancy during replay.
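
The ROB occupancy rule described above can be sketched roughly as follows. This is an illustration only and not code from the patch: the class RobModel, its method names, the choice of keying the map by sequence number, and the strict "<" comparison against the maximum size are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-in for the hardware-resource model.
// All names here are illustrative and are not taken from the patch.
class RobModel
{
  public:
    explicit RobModel(uint64_t max_rob_size) : maxRobSize(max_rob_size)
    {
        assert(max_rob_size > 0);
    }

    // Record a node as in flight (assumed key: its sequence number).
    void occupy(uint64_t seq_num, uint64_t rob_num)
    {
        inFlightNodes[seq_num] = rob_num;
        updateOldest();
    }

    // Drop a node once it has completed and freed its ROB entry.
    void release(uint64_t seq_num)
    {
        inFlightNodes.erase(seq_num);
        updateOldest();
    }

    // Occupancy = ROB number of the newly dependency-free node minus
    // the oldest ROB number in flight. Zero if nothing is in flight.
    uint64_t occupancy(uint64_t new_rob_num) const
    {
        if (inFlightNodes.empty() || new_rob_num < oldestRobNum)
            return 0;
        return new_rob_num - oldestRobNum;
    }

    // Assumption: a node may issue while the occupancy is below the limit.
    bool canIssue(uint64_t new_rob_num) const
    {
        return occupancy(new_rob_num) < maxRobSize;
    }

  private:
    // Assumption: sequence numbers grow in program order, so the first
    // entry of the ordered map is the oldest node in flight.
    void updateOldest()
    {
        if (!inFlightNodes.empty())
            oldestRobNum = inFlightNodes.begin()->second;
    }

    const uint64_t maxRobSize;
    uint64_t oldestRobNum = 0;
    std::map<uint64_t, uint64_t> inFlightNodes;
};
```

With a limit of 4 and an oldest in-flight ROB number of 3, a node with ROB number 6 sees an occupancy of 3 and can issue, while a node with ROB number 7 has to wait in the pending queue.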
>
> The depFreeQueue is chosen to be FIFO so that child nodes which are in
> program order get pushed into it in that order and are thus issued in
> program order, like in the O3CPU. This is also why the dependents container
> is changed from std::set to a sequential container, std::vector. We only
> check the head of the depFreeQueue, as nodes are issued in order and blocking
> on the head models that better than looping over the entire queue. An
> alternative choice would be to inspect the top N pending nodes, where N is
> the issue width. This is left for future work, as the timing correlation
> looks good as it is.
>
> At the start of an execution event, first we attempt to issue such pending
> nodes by checking if appropriate resources have become available. If yes, we
> compute the execute tick with respect to the current time. Then we proceed to
> complete nodes from the readyList.
>
> When a read response is received, sometimes a dependency on it that was
> supposed to be released when it was issued is still not released. This occurs
> because the dependent gets added to the graph after the read was sent. So the
> check is made less strict and the dependency is marked complete on read
> response instead of insisting that it should have been removed on read sent.
>
> There is a check for requests spanning two cache lines, as this condition
> triggers an assert fail in the L1 cache. If a request does span two lines,
> its size is truncated to access only up to the end of the first line and the
> remainder is ignored. Strictly-ordered requests are skipped and the
> dependencies on such requests are handled by simply marking them complete
> immediately.
>
> The simulated seconds can be calculated as the difference between the
> final_tick stat and the tickOffset stat. A CountedExitEvent that contains
> a static int belonging to the Trace CPU class as a down counter is used to
> implement multi Trace CPU simulation exit.
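
The cache-line check mentioned in the description can be illustrated with a small helper. This is a sketch under assumptions, not the patch's code: the function name truncateToLine and its parameters are hypothetical, and the line size is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: clamp a request so that it does not cross a
// cache-line boundary. Returns the size to actually access; any bytes
// beyond the end of the first line are dropped.
inline unsigned
truncateToLine(uint64_t addr, unsigned size, unsigned line_size)
{
    // Assumption: line_size is a non-zero power of two.
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

    // Byte offset of the request within its cache line.
    const uint64_t offset = addr & (line_size - 1);

    // Bytes remaining from the request address to the end of the line.
    const uint64_t bytes_left = line_size - offset;

    return size > bytes_left ? static_cast<unsigned>(bytes_left) : size;
}
```

For example, a 16-byte request at address 0x1038 with 64-byte lines starts 56 bytes into its line, so only 8 bytes remain and the size is truncated to 8. A request that already fits within one line is returned unchanged.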
>
>
> Diffs
> -----
>
>   src/cpu/trace/TraceCPU.py PRE-CREATION
>   src/cpu/trace/trace_cpu.hh PRE-CREATION
>   src/cpu/trace/trace_cpu.cc PRE-CREATION
>   src/cpu/trace/SConscript PRE-CREATION
>
> Diff: http://reviews.gem5.org/r/3029/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Curtis Dunham
>
>

_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev