> On July 3, 2015, 4:24 p.m., Jason Power wrote:
> > Hi Tim,
> >
> > Sorry to come back to this patch, but I just applied it and tried to test
> > it and ran into a problem. When restoring the original event queue in line
> > 187 of System.cc, I get an error that the event is already on the event
> > queue. Below is how I ran into the problem:
> >
> > scons build/X86_MOESI_hammer/gem5.opt -j5 --default=X86 PROTOCOL=MOESI_hammer
> > build/X86_MOESI_hammer/gem5.opt configs/example/fs.py --ruby --cpu-type=detailed -m 4118117000 --checkpoint-at-end
> >
> > Am I missing some other patch that is also needed in conjunction with this
> > one?
>
> Timothy Jones wrote:
>     Hm, this sounds quite like one of the issues I was having and I thought I
>     had fixed. Could you tell me which disk image and kernel you were using and
>     I'll see what's going on?
>
> Jason Power wrote:
>     I'd be surprised if my kernel and disk were the difference here, since it
>     is a problem in checkpointing. I chose a totally random tick to checkpoint at
>     (about 30 seconds into simulation). I think the problem is checkpointing
>     after using --cpu-type=detailed. Running the same test with --cpu-type=timing
>     seems to work.
>
> Timothy Jones wrote:
>     I agree - I didn't mean to imply that it's your kernel or disk, just that
>     I wanted to get the setup as similar as possible. For me, you see, this is
>     all fine. I tried with several different random tick values for
>     checkpointing too. I used your command lines above and here is my restore
>     command line:
>
>     build/X86_MOESI_hammer/gem5.opt configs/example/fs.py --ruby --cpu-type=detailed -m 10000000000 --restore-with detailed -r 0

Jason and I worked on this offline. I managed to reproduce his error and have
posted a fix in review request #2951.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2908/#review6698
-----------------------------------------------------------


On July 1, 2015, 7:45 p.m., Timothy Jones wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2908/
> -----------------------------------------------------------
>
> (Updated July 1, 2015, 7:45 p.m.)
>
>
> Review request for Default and Ruby Reviewers.
>
>
> Repository: gem5
>
>
> Description
> -------
>
> ruby: Fix checkpointing and restore
>
> There are 2 problems with the existing checkpoint and restore code in ruby.
> The first is that when the event queue is altered by ruby during
> serialization, some events that are currently scheduled cannot be found
> (e.g. the event to stop simulation that always lives on the queue), causing
> a panic. The second is that ruby is sometimes serialized after the memory
> system, meaning that the dirty data in its cache is flushed back to memory
> too late and so isn't included in the checkpoint.
>
> These are fixed by implementing memory writeback in ruby, using the same
> technique of hijacking the event queue, but first descheduling all events that
> are currently on it. They are saved, along with their scheduled time, so that
> the event queue can be faithfully reconstructed after writeback has finished.
> Writeback is still implemented using flushing, so the cache recorder object,
> that is created to generate the trace and manage flushing, is kept around and
> used during serialization to write the trace to disk.
>
>
> Diffs
> -----
>
>   src/mem/ruby/system/CacheRecorder.cc e4f63f1d502d
>   src/mem/ruby/system/System.hh e4f63f1d502d
>   src/mem/ruby/system/System.cc e4f63f1d502d
>   src/sim/eventq.hh e4f63f1d502d
>
> Diff: http://reviews.gem5.org/r/2908/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Timothy Jones
>
>

_______________________________________________
gem5-dev mailing list
http://m5sim.org/mailman/listinfo/gem5-dev
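
For readers following the patch description, the "deschedule everything, run
writeback, then rebuild the queue" idea can be sketched with a toy model. This
is only an illustration under stated assumptions: ToyEventQueue,
deschedule_all and writeback_with_saved_events are hypothetical names invented
here, and nothing below is gem5's EventQueue API or the code in System.cc.

```python
import heapq


class ToyEventQueue:
    """Hypothetical stand-in for a simulator event queue (not gem5's class)."""

    def __init__(self):
        self._heap = []      # entries are (tick, insertion order, event name)
        self._seq = 0
        self.cur_tick = 0

    def schedule(self, tick, name):
        # Toy rule: an event may only be on the queue once.
        if any(n == name for _, _, n in self._heap):
            raise RuntimeError(f"event {name!r} is already scheduled")
        heapq.heappush(self._heap, (tick, self._seq, name))
        self._seq += 1

    def deschedule_all(self):
        """Remove every pending event, returning (tick, name) pairs in order."""
        saved = [(tick, name) for tick, _, name in sorted(self._heap)]
        self._heap.clear()
        return saved

    def run(self):
        """Fire all pending events in tick order; return the names fired."""
        fired = []
        while self._heap:
            tick, _, name = heapq.heappop(self._heap)
            self.cur_tick = tick
            fired.append(name)
        return fired

    def pending(self):
        return [(tick, name) for tick, _, name in sorted(self._heap)]


def writeback_with_saved_events(queue, flush_events):
    """Run flush events on an emptied queue, then restore the original events.

    flush_events is a list of (delay, name) pairs, a made-up representation
    of the work needed to write dirty cache data back to memory.
    """
    saved = queue.deschedule_all()       # save events with their scheduled time
    saved_tick = queue.cur_tick
    for delay, name in flush_events:     # "hijack" the now-empty queue
        queue.schedule(saved_tick + delay, name)
    fired = queue.run()
    queue.cur_tick = saved_tick          # rewind time to the checkpoint tick
    for tick, name in saved:             # rebuild the queue as it was
        queue.schedule(tick, name)
    return fired
```

In this toy, an event that is still on the queue when the saved events are put
back trips the "already scheduled" check, which is why the sketch empties the
queue before scheduling any flush work.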
