Hi Joshua,

> When moving from DMTCP version 1.2.4 to 1.2.6, we were using this option to
> place our checkpoints in different directories. When enabled now, it always
> creates a base 00000 directory at the start of the program. Checkpoints
> still land in sequence thereafter.
>
> Example:
>
> ckpt_4bfe32141b50c203-20272-503d3ae1_00000 <= Created when I launch
> program under dmtcp, no contents
>
> ckpt_4bfe32141b50c203-20272-503d3ae1_00001 <= Actual checkpoint placed
> into this
>
> In the past, the 00000 directory was never created, and the first
> directory would be ckpt_4bfe32141b50c203-20272-503d3ae1_00001 if/when we
> made a checkpoint.
>
> This is a minor inconvenience in situations where we run under dmtcp but
> don’t checkpoint: we end up with an extra directory that, at first glance,
> makes it look like we did take a checkpoint.
>
> I’ve modified 1.2.6 locally to return to the original behavior, so that it
> only creates the directory when a checkpoint is actually about to be
> taken. It seems to work for my own usage; I guess it’s up to you whether you
> want to slide this in or do it a different way. If the current behavior
> (creating the 00000 directory) is intentional, I’d appreciate an
> explanation.
>

This is certainly a bug introduced in 1.2.5. Thanks for catching this.

There are two possible ways to fix it:
1. Do not create the directory until it is needed (in which case it will be
created just before the checkpoint happens).
2. Create the directory, but use the correct starting suffix, i.e. "00001".

I am leaning towards option (2), so that the directory is present if DMTCP
plugins need it even before a checkpoint. I would be happy to discuss (1) as
well.
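
A rough sketch of how the two options differ, under stated assumptions: the helper names (ckptDirName, createCkptDir) are hypothetical and are not part of DMTCP, and the "_NNNNN" suffix format is inferred from the directory names in the example above. The same helper serves both options; only the point at which it is called changes.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not DMTCP code): builds "<base>_NNNNN".
static std::string ckptDirName(const std::string &base, int index) {
  assert(index >= 0);
  char suffix[32];
  std::snprintf(suffix, sizeof(suffix), "_%05d", index);
  return base + suffix;
}

// Hypothetical helper (not DMTCP code): creates the directory for the given
// checkpoint index, tolerating one that already exists, and verifies that it
// is writable and searchable. Returns the directory name, or an empty string
// on failure.
static std::string createCkptDir(const std::string &base, int index) {
  const std::string dir = ckptDirName(base, index);
  if (mkdir(dir.c_str(), S_IRWXU) != 0 && errno != EEXIST) {
    return std::string();
  }
  if (access(dir.c_str(), X_OK | W_OK) != 0) {
    return std::string();
  }
  return dir;
}

// Option 1 (lazy): call nothing at launch; call createCkptDir(base, n) just
// before the n-th checkpoint, with n starting at 1. A run that never
// checkpoints leaves no directory behind.
//
// Option 2 (eager): call createCkptDir(base, 1) at launch so the directory is
// already there for anything that needs it early; the first checkpoint then
// reuses it, because an existing directory (EEXIST) is not treated as an
// error.
```

With option 1, a run that never checkpoints creates nothing. With option 2, such a run still leaves one empty directory behind, but it carries the suffix the first checkpoint would have used rather than a stray 00000.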

Thanks,
Kapil

> For UniquePid, I modified two functions to take a bool indicating whether
> or not to create the directory.
>
> dmtcpplugin.cpp:47:    dmtcp::UniquePid::setCkptDir(dir, false);
> dmtcpworker.cpp:740:  UniquePid::updateCkptDir(true);
> uniquepid.cpp:227:    updateCkptDir(false);
> uniquepid.cpp:250:void dmtcp::UniquePid::updateCkptDir(bool create)
> uniquepid.cpp:265:  setCkptDir(o.str().c_str(), create);
> uniquepid.h:58:    static void setCkptDir(const char*, bool);
> uniquepid.h:59:    static void updateCkptDir(bool);
>
> uniquepid.cpp:233 – uniquepid.cpp:248:
>
> void dmtcp::UniquePid::setCkptDir(const char *dir, bool create)
> {
>   JASSERT(dir != NULL);
>   _ckptDir() = dir;
>   _ckptFileName().clear();
>   _ckptFilesSubDir().clear();
>
>   if (create) {
>     JASSERT(mkdir(_ckptDir().c_str(), S_IRWXU) == 0 || errno == EEXIST)
>       (JASSERT_ERRNO) (_ckptDir())
>       .Text("Error creating checkpoint directory");
>
>     JASSERT(0 == access(_ckptDir().c_str(), X_OK|W_OK)) (_ckptDir())
>       .Text("ERROR: Missing execute- or write-access to checkpoint dir");
>   }
> }
>
> Joshua Louie
_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
