So, is it possible to delay the checkpointing process till all the CPUs are done with their current locked memory operations? Or do you think it is possible that such a time may never arrive, in which case switching to timing mode is not possible?
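
To make the question concrete, here is a minimal sketch of "wait until no CPU is locked, but give up after a bound". It is a toy model only: ToyCPU, locked_ticks_left and drain_until_unlocked are hypothetical names for illustration and are not gem5 classes or APIs.

```python
from dataclasses import dataclass


@dataclass
class ToyCPU:
    """Hypothetical stand-in for a CPU model; not a gem5 class."""
    locked_ticks_left: int = 0  # ticks until the current locked op finishes

    @property
    def locked(self) -> bool:
        return self.locked_ticks_left > 0

    def tick(self) -> None:
        # Advance one tick; a locked operation gets one step closer to done.
        if self.locked_ticks_left > 0:
            self.locked_ticks_left -= 1


def drain_until_unlocked(cpus, max_ticks):
    """Tick all CPUs until none is mid locked op.

    Returns the number of ticks spent, or None if max_ticks was not enough.
    """
    elapsed = 0
    while any(cpu.locked for cpu in cpus):
        if elapsed == max_ticks:
            return None  # such a time did not arrive within the budget
        for cpu in cpus:
            cpu.tick()
        elapsed += 1
    return elapsed


if __name__ == "__main__":
    cpus = [ToyCPU(0), ToyCPU(3), ToyCPU(1)]
    print(drain_until_unlocked(cpus, max_ticks=10))
```

The bound is there for the second half of the question: if CPUs keep entering new locked operations, the loop reports failure instead of waiting forever.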

--
Nilay

On Tue, 3 Apr 2012, Gabe Black wrote:

I haven't looked at the code in depth, but I think "locked" describes
whether or not you're in the middle of a locked (roughly atomic)
operation. Not keeping track of that would be bad, although transferring
that to timing mode would be tricky. On the other hand, timing mode
doesn't implement locked memory operations anyway, so...

Gabe

On 04/03/12 21:53, Nilay Vaish wrote:
What I wrote was more of a hypothesis that might explain / help debug
the problem Joel is facing. I might be incorrect, though I faintly
recall facing a similar problem myself.

I am more in favor of moving towards storing only architectural state
in the checkpoint so that we can restore from any cpu type. Any ideas
as to why we need fields like so_state, locked, status for atomic CPU?
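
As a rough illustration of what "only architectural state" could mean, the sketch below filters a checkpoint section down to fields that are not tied to one CPU model. The dict layout, the boolean form of "locked", the "pc" and "rax" keys, and the function name are all assumptions made for the example; only the three field names come from this thread.

```python
# Fields named in this thread as specific to the atomic CPU.
MODEL_SPECIFIC = {"so_state", "locked", "status"}


def architectural_only(cpu_section):
    """Return a copy of a checkpoint section without model-specific fields.

    Refuses when the CPU was in the middle of a locked operation, because
    dropping that flag would silently lose state.
    """
    if cpu_section.get("locked"):
        raise ValueError("CPU is in the middle of a locked operation")
    return {k: v for k, v in cpu_section.items() if k not in MODEL_SPECIFIC}


if __name__ == "__main__":
    # Hypothetical section contents, for illustration only.
    section = {"pc": 4096, "rax": 1, "so_state": 1, "locked": False, "status": 2}
    print(architectural_only(section))
```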

--
Nilay

On Tue, 3 Apr 2012, Beckmann, Brad wrote:

Hmm... a timing CPU restoring a checkpoint created by the atomic CPU
used to work?  What change with regard to interrupts broke it?

It would be great if we could maintain atomic -> timing checkpoint
capability.  It is really useful to use the atomic cpu to
fast-forward to an interesting point in an application, take a
checkpoint, then use that checkpoint across multiple experiments
using the timing cpu.

Brad



-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Nilay Vaish
Sent: Tuesday, April 03, 2012 6:51 PM
To: gem5 Developer List
Subject: Re: [gem5-dev] x86 Checkpoint Restore Trouble

Joel, as you may know, a system restored from a checkpoint starts
with an atomic CPU by default. I think the problem you are facing is
in initializing the timing CPU. IIRC, the interrupt controller is
moved from the atomic CPU to the timing CPU after X number of ticks.
You might want to try restoring with a timing CPU to begin with (use
option --restore-cpu-type).
But I think you will run into problems there as well if the checkpoint
was created using an atomic CPU, as the checkpointed state is a
superset of the architectural state and depends on the CPU type in
use.

You might want to create checkpoints using the timing CPU itself.

--
Nilay

On Tue, 3 Apr 2012, Joel Hestness wrote:

Hey guys,
 I've tried searching around, but I'm having trouble finding any help
on this.  Anyone have insights or pointers?  Is checkpoint restore
supposed to be working?

---------------------------------------------------
joel@vein:~/research/gem5/gem5-latest$ ./build/X86/gem5.debug ./configs/example/fs.py --take-checkpoints=5000000000,1000000000000
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Apr  3 2012 12:11:41
gem5 started Apr  3 2012 18:13:38
gem5 executing on vein
command line: ./build/X86/gem5.debug ./configs/example/fs.py --take-checkpoints=5000000000,1000000000000
warning: add_child('terminal'): child 'terminal' already has parent
Global frequency set at 1000000000000 ticks per second
info: kernel located at: /home/joel/research/gem5/full_system_files/binaries/x86_64-vmlinux-2.6.28.4-smp
     0: rtc: Real-time clock set to Sun Jan  1 00:00:00 2012
Listening for com_1 connection on port 3458
warn: Reading current count from inactive timer.
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7002
info: Entering event queue @ 0.  Starting simulation...
warn: Don't know what interrupt to clear for console.
warn: instruction 'wbinvd' unimplemented
Writing checkpoint
info: Entering event queue @ 5000000000.  Starting simulation...
^Chack: be nice to actually delete the event here
Exiting @ tick 5559293000 because user interrupt received
joel@vein:~/research/gem5/gem5-latest$ ./build/X86/gem5.debug ./configs/example/fs.py -r 1 --cpu-type=timing
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Apr  3 2012 12:11:41
gem5 started Apr  3 2012 18:15:08
gem5 executing on vein
command line: ./build/X86/gem5.debug ./configs/example/fs.py -r 1 --cpu-type=timing
warning: add_child('terminal'): child 'terminal' already has parent
Global frequency set at 1000000000000 ticks per second
info: kernel located at: /home/joel/research/gem5/full_system_files/binaries/x86_64-vmlinux-2.6.28.4-smp
     0: rtc: Real-time clock set to Sun Jan  1 00:00:00 2012
Listening for com_1 connection on port 3458
*gem5.debug: build/X86/arch/x86/utility.cc:171: void X86ISA::initCPU(ThreadContext*, int): Assertion `interrupts' failed.*
*Program aborted at cycle 0*
*Aborted*
---------------------------------------------------

 For this test, I'm using changeset 8900 with Nilay's port fix
patch here:
http://reviews.gem5.org/r/1102/
 The same assertion failure happens for ruby_fs.py (and the script
that I've derived from ruby_fs.py).

 Thanks!
 Joel


--
 Joel Hestness
 PhD Student, Computer Architecture
 Dept. of Computer Science, University of Texas - Austin
http://www.cs.utexas.edu/~hestness
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
