Hello Avadh,

I'm working on the hybrid main memory project with Ishwar, Mutien, and others at Maryland, and I'm in charge of the caching algorithm used between DRAM and the non-volatile RAM.
A problem I've been thinking about lately is the need to warm up the cache when switching to marss's simulation mode (which we only do for critical sections of our benchmarks, to avoid extremely long simulation times). This isn't like a traditional cache warm-up period, in that it could take an extremely long time to fill the DRAM cache to a reasonable level. My concern is that events such as OS traps could cause DRAM cache misses that wouldn't occur if the simulation were run from the beginning, hurting our simulation's accuracy.

So, basically, I would like to get my cache (as well as the NVRAM controller) into exactly the state it would be in if the caching algorithm had been in use since boot. That means I need a trace of all memory operations Linux generates during boot, plus anything else that happens in our benchmarks up to the starting checkpoint.

I see two possible solutions to this problem:

1) Actually run the full simulation from the beginning of boot. I would prefer not to do this because of how long it takes to boot Linux under simulation. We do have some high-performance machines we could use, but I'm afraid it would still take a very long time to complete (and if something goes wrong, I have to run it again). We would also have to repeat it for each benchmark, since we need the accesses up to each benchmark's critical-section checkpoint.

2) My preferred method is to modify marss to write every guest physical address it accesses to a trace file while in native mode (I'm not sure how easy this is, since it might mean touching QEMU's JIT, or just running QEMU in interpretation mode only). I could then run that trace through the trace-based mode of my cache controller to bring it to the correct state, and save the state to a file for later use (a rough sketch of this replay step is included below). Another idea is to run in PTLSim mode but disable the low-level pipeline simulation to reduce the time needed to reach the checkpoint. The details of how this works don't matter to me, as long as I can derive the memory system state from the traces quickly.

My question is: has anybody attempted something like solution 2 before? If not, do you have any suggestions for how to go about implementing such a feature? Or do you have any other ideas for how I can get the traces I need?
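To make the replay half of solution 2 concrete, here is a rough, self-contained sketch of the kind of warm-up tool I have in mind. None of this is existing marss or QEMU code: the TraceRecord layout, the cache geometry, and the output format are all placeholders for whatever my controller's trace-based mode actually uses. The recording side would just append one record per guest physical access from whatever hook we add to native mode; this program then replays those records through a toy cache model and writes out the warmed-up state.

// warmup_replay.cc -- hypothetical sketch, not existing marss/QEMU code.
// Replays a trace of guest physical accesses through a toy set-associative
// DRAM cache model, then dumps the resulting tag state so a detailed
// simulation run can start from a warm cache instead of a cold one.
#include <cstdint>
#include <cstdio>
#include <vector>

// One record per guest physical access; the native-mode hook we add to
// marss/QEMU would append these to the trace file.
struct TraceRecord {
    uint64_t paddr;     // guest physical address
    uint8_t  is_write;  // 0 = read, 1 = write
};

// Placeholder geometry -- the real values would come from the DRAM cache
// controller's configuration, not from here.
static const uint64_t LINE_BYTES = 64;
static const uint64_t NUM_SETS   = 1 << 16;
static const int      WAYS       = 8;

struct Way { bool valid; bool dirty; uint64_t tag; uint64_t lru; };

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <trace.bin> <state.out>\n", argv[0]);
        return 1;
    }
    FILE *trace = fopen(argv[1], "rb");
    if (!trace) { perror("trace"); return 1; }

    std::vector<Way> cache(NUM_SETS * WAYS, Way{false, false, 0, 0});
    uint64_t tick = 0;

    TraceRecord rec;
    while (fread(&rec, sizeof(rec), 1, trace) == 1) {
        uint64_t line = rec.paddr / LINE_BYTES;
        uint64_t set  = line % NUM_SETS;
        uint64_t tag  = line / NUM_SETS;
        Way *ways = &cache[set * WAYS];
        ++tick;

        bool hit = false;
        for (int w = 0; w < WAYS; ++w) {
            if (ways[w].valid && ways[w].tag == tag) {
                ways[w].lru = tick;                  // refresh LRU stamp
                if (rec.is_write) ways[w].dirty = true;
                hit = true;
                break;
            }
        }
        if (!hit) {
            // Miss: prefer an invalid way, otherwise evict the LRU way.
            int victim = 0;
            for (int w = 0; w < WAYS; ++w) {
                if (!ways[w].valid) { victim = w; break; }
                if (ways[w].lru < ways[victim].lru) victim = w;
            }
            ways[victim] = Way{true, rec.is_write != 0, tag, tick};
        }
    }
    fclose(trace);

    // Dump the warmed-up tag/dirty state; the controller's simulation-mode
    // code would load this at the checkpoint instead of starting cold.
    FILE *out = fopen(argv[2], "wb");
    if (!out) { perror("state"); return 1; }
    fwrite(cache.data(), sizeof(Way), cache.size(), out);
    fclose(out);
    return 0;
}

The real version would of course reuse my existing controller code rather than the toy LRU model above; the sketch is only meant to show the record-then-replay flow and the state-file hand-off into the detailed simulation.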
Thanks,
Jim Stevens

_______________________________________________
http://www.marss86.org
Marss86-Devel mailing list
[email protected]
https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel