Re: [gem5-dev] Review Request 2908: ruby: Fix checkpointing and restore

Timothy Jones Wed, 08 Jul 2015 13:35:02 -0700


> On July 3, 2015, 4:24 p.m., Jason Power wrote:
> > Hi Tim,
> > 
> > Sorry to come back to this patch, but I just applied it and tried to test 
> > it and ran into a problem. When restoring the original event queue in line 
> > 187 of System.cc, I get an error that the event is already on the event 
> > queue. Below is how I ran into the problem:
> > 
> > scons build/X86_MOESI_hammer/gem5.opt -j5 --default=X86 \
> >     PROTOCOL=MOESI_hammer
> > build/X86_MOESI_hammer/gem5.opt configs/example/fs.py --ruby \
> >     --cpu-type=detailed -m 4118117000 --checkpoint-at-end
> > 
> > Am I missing some other patch that is also needed in conjunction with this 
> > one?
> 
> Timothy Jones wrote:
>     Hm, this sounds a lot like one of the issues I was having, which I 
> thought I had fixed.  Could you tell me which disk image and kernel you were 
> using so I can see what's going on?
> 
> Jason Power wrote:
>     I'd be surprised if my kernel and disk were the difference here, since it 
> is a problem in checkpointing. I chose a totally random tick to checkpoint at 
> (about 30 seconds into simulation). I think the problem is checkpointing 
> after using --cpu-type=detailed. Running the same test with --cpu-type=timing 
> seems to work.
> 
> Timothy Jones wrote:
>     I agree - I didn't mean to imply that it's your kernel or disk, just that 
> I wanted to get the setup as similar as possible.  For me, you see, this is 
> all fine.  I tried with several different random tick values for 
> checkpointing too.  I used your command lines above and here is my restore 
> command line:
>     
>     build/X86_MOESI_hammer/gem5.opt configs/example/fs.py --ruby \
>         --cpu-type=detailed -m 10000000000 --restore-with detailed -r 0


Jason and I worked on this offline.  I managed to reproduce his error and have 
posted a fix in review request #2951.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2908/#review6698
-----------------------------------------------------------


On July 1, 2015, 7:45 p.m., Timothy Jones wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2908/
> -----------------------------------------------------------
> 
> (Updated July 1, 2015, 7:45 p.m.)
> 
> 
> Review request for Default and Ruby Reviewers.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> ruby: Fix checkpointing and restore
> 
> There are two problems with the existing checkpoint and restore code in ruby.
> The first is that when the event queue is altered by ruby during
> serialization, some events that are currently scheduled cannot be found (e.g.
> the event to stop simulation that always lives on the queue), causing a
> panic.  The second is that ruby is sometimes serialized after the memory
> system, meaning that the dirty data in its cache is flushed back to memory
> too late and so isn't included in the checkpoint.
> 
> These are fixed by implementing memory writeback in ruby, using the same
> technique of hijacking the event queue, but first descheduling all events that
> are currently on it.  They are saved, along with their scheduled time, so that
> the event queue can be faithfully reconstructed after writeback has finished.
> Writeback is still implemented using flushing, so the cache recorder object,
> which is created to generate the trace and manage flushing, is kept around
> and used during serialization to write the trace to disk.
> 
> 
> Diffs
> -----
> 
>   src/mem/ruby/system/CacheRecorder.cc e4f63f1d502d 
>   src/mem/ruby/system/System.hh e4f63f1d502d 
>   src/mem/ruby/system/System.cc e4f63f1d502d 
>   src/sim/eventq.hh e4f63f1d502d 
> 
> Diff: http://reviews.gem5.org/r/2908/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Timothy Jones
> 
>
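
For readers following the thread, the save-and-restore idea in the quoted
description (deschedule everything, run the writeback on the emptied queue,
then put the saved events back at their original times) can be illustrated
with a small toy model.  This is only a sketch under stated assumptions:
ToyEventQueue, writeback_with_saved_events and the event names are all
hypothetical stand-ins invented for illustration, and none of it is gem5's
actual EventQueue API or the code in the patch.

```python
import heapq
import itertools


class ToyEventQueue:
    """Minimal tick-ordered queue; a hypothetical stand-in, not gem5's EventQueue."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker for events on the same tick

    def schedule(self, tick, name):
        heapq.heappush(self._heap, (tick, next(self._order), name))

    def pending(self):
        """Return the pending (tick, name) pairs in execution order."""
        return [(tick, name) for tick, _, name in sorted(self._heap)]

    def deschedule_all(self):
        """Remove every pending event, returning each with its scheduled tick."""
        saved = self.pending()
        self._heap.clear()
        return saved

    def run(self):
        """Drain the queue and return the event names in execution order."""
        executed = []
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            executed.append(name)
        return executed


def writeback_with_saved_events(queue, flush_events):
    """Run flush events on an emptied queue, then rebuild the original queue."""
    saved = queue.deschedule_all()       # 1. save events and their ticks
    for tick, name in flush_events:      # 2. hijack the queue for writeback
        queue.schedule(tick, name)
    flushed = queue.run()
    for tick, name in saved:             # 3. reconstruct the original queue
        queue.schedule(tick, name)
    return flushed


queue = ToyEventQueue()
queue.schedule(500, "cpu_tick")
queue.schedule(10_000, "exit_simulation")  # the always-present exit event

before = queue.pending()
flushed = writeback_with_saved_events(
    queue, [(1, "flush_line_0"), (2, "flush_line_1")]
)
after = queue.pending()
print(flushed, before == after)
```

In this toy model the queue contents are identical before and after the
writeback, which is the property the description refers to as faithfully
reconstructing the event queue.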

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
