Hi again,

Further to the message below, I've used hg bisect to find the revision that breaks checkpointing with ruby. The first bad changeset is revision 10524, which Nilay committed in November. It fails with the panic() on the missing event that I wrote about previously.

I've scanned through the diff and can't immediately see any reason why this would break serialisation, although it does remove some of the code to serialise ruby state.

Could anyone (Nilay?) give me a hint as to why this might break checkpointing with ruby?

I've compiled with the MOESI_hammer protocol for x86, then run with this command line:

./build/X86/gem5.opt --remote-gdb-port=0 -d <outdir> configs/example/fs.py -n 1 --kernel <my-kernel> --script configs/boot/hack_back_ckpt.rcS --max-checkpoints 1 --checkpoint-dir <cptdir> --disk-image <my-disk-image> --cpu-type timing --restore-with timing --ruby

Any help would be appreciated. I don't know ruby at all, so trying to work out what's going on is slow....

Cheers
Tim

On 11/06/2015 20:48, Timothy M Jones wrote:
Hello,

Could someone tell me why we need to take the head event off the event
queue in RubySystem::serialize() in src/mem/ruby/system/System.cc?

Event* eventq_head = eventq->replaceHead(NULL);

The problem I'm getting is that when simulate() is called a few lines
later, it tries to reschedule the simulate_limit_event, but that causes
a panic because it's no longer on the event queue.  This is happening
when trying to take a checkpoint with ruby.  I can't work out from the
comments why the head event needs to be taken off in the first place.
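
To make the failure mode concrete, here is a toy model of the situation. It is only a sketch and not gem5's actual event queue: ToyEvent, ToyEventQueue and rescheduleFailsWhileHeadRemoved() are made-up names, and the assumption that replacing the head detaches every queued event is part of the model rather than something confirmed from the source.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Toy stand-ins for illustration only; these are NOT gem5's classes.
struct ToyEvent {
    std::uint64_t when = 0;
    bool scheduled = false;
    ToyEvent *next = nullptr;
};

class ToyEventQueue {
  public:
    // Insert the event into a singly linked list sorted by time.
    void schedule(ToyEvent *ev, std::uint64_t when) {
        ev->when = when;
        ev->scheduled = true;
        ToyEvent **link = &head_;
        while (*link && (*link)->when <= when)
            link = &(*link)->next;
        ev->next = *link;
        *link = ev;
    }

    // Swap the head pointer and return the old one. In this toy model
    // (an assumption) that detaches every queued event at once.
    ToyEvent *replaceHead(ToyEvent *new_head) {
        ToyEvent *old = head_;
        head_ = new_head;
        return old;
    }

    // Unlink the event and re-insert it at a new time. Throws if the
    // event cannot be found, standing in for the panic in the report.
    void reschedule(ToyEvent *ev, std::uint64_t when) {
        ToyEvent **link = &head_;
        while (*link && *link != ev)
            link = &(*link)->next;
        if (*link == nullptr)
            throw std::logic_error("event not found on queue");
        *link = ev->next;
        schedule(ev, when);
    }

  private:
    ToyEvent *head_ = nullptr;
};

// Returns true if rescheduling fails while the head is detached and
// succeeds again once the saved head has been put back.
bool rescheduleFailsWhileHeadRemoved() {
    ToyEventQueue queue;
    ToyEvent limit;                  // plays the simulate_limit_event role
    queue.schedule(&limit, 1000);

    ToyEvent *saved = queue.replaceHead(nullptr);
    assert(saved == &limit);

    bool failed = false;
    try {
        queue.reschedule(&limit, 2000);   // event still thinks it is queued
    } catch (const std::logic_error &) {
        failed = true;
    }

    queue.replaceHead(saved);             // put the original events back
    queue.reschedule(&limit, 2000);       // now the event can be found
    return failed && limit.when == 2000;
}
```

Calling rescheduleFailsWhileHeadRemoved() from a small main() returns true in this model: the reschedule attempted while the head is detached cannot find the event, and the same call succeeds once the saved head is restored.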

This is basically the reason behind the problems in this thread:

https://www.mail-archive.com/gem5-users@gem5.org/msg11701.html

Thanks
Tim


--
Timothy M. Jones
http://www.cl.cam.ac.uk/~tmj32/
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
