Hello,

Thanks for the reply. I updated my checkout of the code repository and have
been trying to get everything working again. (Apparently I had to change my
DMTCP event to USER_THREAD_RESUME.)
I noticed a bug in dmtcp/jalib/jfilesystem.cpp: the dirname function does
not work for files placed directly in the root directory, such as "/xxxx".
Checking whether lastSlash is zero, and returning a slash if it is, would
solve this issue.
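
To illustrate the kind of check I mean, here is a minimal standalone sketch.
It is not the actual code from jfilesystem.cpp: the function name
dirNameSketch and the choice to return "." for a path without any slash are
my own assumptions for illustration.

```cpp
#include <string>

// Hypothetical stand-in for the dirname helper in jalib/jfilesystem.cpp.
// The real function's name, signature, and other edge cases may differ.
std::string dirNameSketch(const std::string& path)
{
  const std::string::size_type lastSlash = path.rfind('/');

  if (lastSlash == std::string::npos) {
    // No slash at all: assume the file lives in the current directory.
    return ".";
  }
  if (lastSlash == 0) {
    // The only slash is the leading one, e.g. "/xxxx": the parent is root.
    // Without this check, substr(0, 0) would return an empty string.
    return "/";
  }
  // General case: everything before the last slash.
  return path.substr(0, lastSlash);
}
```

With this check, "/xxxx" yields "/" instead of an empty string, while a path
such as "/tmp/file" still yields "/tmp".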

Now for my own problem.
This required a lot of testing :(
But I have now figured out that whenever I checkpoint a VM, the checkpoint
does not include the /tmp folder.
This is a problem since DMTCP saves its information in that folder.

I also noticed that dmtcp_restart_script.sh has no flag to change the
temp folder, so I thought it was not possible. Only recently did I discover
that dmtcp_restart has such a flag!

When I changed the DMTCP_TMP directory, the problem was fixed.
I can now take as many DMTCP checkpoints and do as many restarts as I want
without problems.
Therefore I would like to ask whether this flag could be added to the
restart script.

I have attached a patch that can be used for this.

I hope it helps, thanks again for all your support :)

Robin



2013/3/29 Kapil Arya <[email protected]>

> Hi Robin,
>
> I tried to reproduce your bug but couldn't do it. Are you using DMTCP
> trunk? Also, are you using the new svn url:
>  svn co svn://svn.code.sf.net/p/dmtcp/code/trunk dmtcp-trunk
>
> The old one is now obsolete.
>
> In the past few days, I have pushed a few changes into the trunk. Can
> you check out the latest version and let me know if the problem still
> persists? If it does, we need to come up with a way to reproduce the
> failure so we can debug it.
>
> Kapil
>
> On Sun, Mar 24, 2013 at 4:19 PM, robin staes <[email protected]>
> wrote:
> > Hi,
> >
> > I have been working with DMTCP quite a lot recently and have encountered
> > several issues which were resolved by your excellent support.
> > Having encountered another problem I hope to benefit from this support
> > once more.
> >
> > Currently I am working on a system that should robustly checkpoint a
> > process that runs on a virtual machine.
> > This is done by checkpointing the application with DMTCP and using the
> > plugin system to checkpoint the VM itself. (This is done via a "system"
> > call, which I appreciate you have implemented!)
> > This currently works, but not completely.
> > My process:
> >
> > Start a simple python counter and let it checkpoint every 10 minutes.
> > After a checkpoint kill the VM and reboot it from the VM checkpoint.
> > This works perfectly.
> > But when the restarted process takes a new checkpoint and I want to
> > restart it, it fails.
> > The error code 99 is returned which indicates a problem within DMTCP.
> > After having recompiled to enable debug information I have traced the
> > problem to this area:
> >
> > [40000] TRACE at pid.cpp:92 in openOriginalToCurrentMappingFiles; REASON='Open dmtcpPidMapFile'
> >      pidMapFile.str() = /tmp/dmtcp-root@ip-10-194-33-122/dmtcpPidMap.13a71f78f7523f34-40000-514f32c7.514f57098
> > [40000] ERROR at pid.cpp:80 in openSharedFile; REASON='JASSERT(false) failed'
> >      name = /tmp/dmtcp-root@ip-10-194-33-122/dmtcpPidMap.13a71f78f7523f34-40000-514f32c7.514f57098
> >      strerror((*__errno_location ())) = No such file or directory
> > Message: Cannot open file
> > python2.7 (40000): Terminating...
> >
> > It is true that when I look at the designated folder I can't find the
> > file.
> > But my first restart works and the pid.cpp:92 result is:
> >
> > [40000] TRACE at pid.cpp:92 in openOriginalToCurrentMappingFiles; REASON='Open dmtcpPidMapFile'
> >      pidMapFile.str() = /tmp/dmtcp-root@ip-10-194-33-122/dmtcpPidMap.13a71f78f7523f34-40000-514f32c7.514f56bb5
> > [40000] TRACE at virtualidtable.h:241 in writeMapsToFile; REASON='Write Maps to file'
> >      mapFile = /tmp/dmtcp-root@ip-10-194-33-122/dmtcpPidMap.13a71f78f7523f34-40000-514f32c7.514f56bb5
> >
> > Here, too, I can't find the file on my filesystem.
> >
> > I hope you can enlighten me.
> >
> > Thanks in advance,
> >
> > Robin Staes
> >
> > PS: Sorry for the mail bombardment to Kapil, I did not have the nerve to
> > send another one :)
> >
> >
> >
> > _______________________________________________
> > Dmtcp-forum mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
> >
>



-- 

Robin

Attachment: dirname_and_restartscript_fix.diff
Description: Binary data

