Hello Avadh,

I'm working on the hybrid main memory project with Ishwar, Mutien, and
others at Maryland and I'm in charge of the caching algorithm used between
DRAM and non-volatile RAM.

A problem I've been thinking about lately is the need to warm up the cache
when switching to marss's simulation mode (which we only do for critical
sections of our benchmarks to avoid extremely long simulation times). The
problem is that this isn't like a traditional cache warm-up period for
simulation, in that it could take an extremely long time to fill up the
DRAM cache to a reasonable level. My concern is that events such as OS
traps could cause DRAM cache misses that wouldn't occur if the simulation
had been run from the beginning, hurting our simulation's accuracy.

So, basically, I would like to get my cache state (as well as the state
of the NVRAM controller) into exactly the state it would be in if the
cache algorithm had been in use since boot. To do that, I need a trace of
all memory operations that Linux generates during boot, as well as
anything else that happens in our benchmarks up to the starting
checkpoint.

I see two possible solutions to this problem:

1) Actually run the full simulation from the beginning of boot. I would
prefer not to do this because of the amount of time it would take to boot
Linux. We do have some high-performance machines that we could do this
with, but I'm afraid it may take a very long time to complete (and if
something goes wrong, then I have to run it again). We would also have to
run this separately for each benchmark since we need the accesses up to
the critical section checkpoint.

2) My preferred method is to modify marss to write every guest physical
address access to a trace file while in native mode. I'm not sure how easy
this is, since it might involve modifying QEMU's JIT or running QEMU in
interpretation mode only. I could then run the trace through the
trace-based mode of my cache controller to bring it to the correct state,
and save that state to a file for later use. Another idea is to run in
PTLSim mode but disable the low-level pipeline simulation to reduce the
time needed to reach the checkpoint. The details of how this works don't
matter to me, as long as I can quickly get the memory traces needed to
derive the memory system state.
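
To make the replay step in option 2 concrete, here is a rough sketch of
what I have in mind. It is not our actual cache controller: the trace
format (one "R" or "W" plus a hex guest physical address per line), the
page-granularity LRU policy, the tiny geometry, and all of the names
(DramCacheModel, replay_trace, to_json) are assumptions made up for
illustration.

```python
import json
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed caching granularity in bytes (hypothetical)


class DramCacheModel:
    """Toy set-associative LRU cache keyed by guest physical page number."""

    def __init__(self, num_sets=4, ways=2):
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per set: page number -> dirty flag, LRU entry first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, op, addr):
        """Apply one read ("R") or write ("W") to the cache state."""
        page = addr // PAGE_SIZE
        lines = self.sets[page % self.num_sets]
        if page in lines:
            self.hits += 1
            dirty = lines.pop(page)        # re-inserted below as most recent
        else:
            self.misses += 1
            dirty = False
            if len(lines) >= self.ways:
                lines.popitem(last=False)  # evict the least recently used page
        lines[page] = dirty or (op == "W")

    def to_json(self):
        """Serialize the warmed-up state so it can be reloaded later."""
        return json.dumps({
            "num_sets": self.num_sets,
            "ways": self.ways,
            "sets": [list(s.items()) for s in self.sets],
        })

    @classmethod
    def from_json(cls, text):
        """Rebuild a cache from state produced by to_json()."""
        data = json.loads(text)
        cache = cls(data["num_sets"], data["ways"])
        cache.sets = [OrderedDict((p, d) for p, d in entries)
                      for entries in data["sets"]]
        return cache


def replay_trace(lines, cache):
    """Feed every "<op> <hex address>" trace line through the cache model."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        op, addr = line.split()
        cache.access(op, int(addr, 16))
    return cache


if __name__ == "__main__":
    trace = ["R 0x0000", "W 0x1000", "R 0x0000", "R 0x4000", "R 0x8000"]
    warmed = replay_trace(trace, DramCacheModel())
    print(warmed.hits, warmed.misses)
    print(warmed.to_json())
```

The point is only that the replay never touches the pipeline model: it
consumes addresses, updates the cache state, and dumps that state so the
simulation run can load it at the checkpoint.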

My question is: has anybody attempted to do something like solution 2
before? If not, do you have any suggestions for how to go about
implementing such a feature? Or do you have any other ideas on how I can
get the traces I need?

Thanks,

Jim Stevens

