I'm investigating a crash on restore in one of our larger test cases. I have
determined that the process has a file open (as shown in /proc/<pid>/fd) that
has been deleted:

30 -> /tmp/ffiWlKyqm (deleted)

This was on 1.2.4, so I tried the latest version, but I got stuck somewhere
else (and haven't had a chance to figure that out yet).

So I tried to write a small test case that reproduces the exact error message
I was getting. I didn't get the exact same error, but I found one case that
fails at checkpoint time (on both 1.2.4 and trunk). I'm still looking for the
original problem, and when I find it I'll submit another issue to the forum.

Error shown when dmtcpCheckpoint() is called:
[40000] ERROR at fileconnection.cpp:881 in writeFileFromFd; REASON='JASSERT(readBytes != -1) failed'
     (strerror((*__errno_location ()))) = Bad file descriptor
Message: Read Failed
delete_test (40000): Terminating...

delete_test.c:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <dmtcpaware.h>

int main(void)
{
    FILE *fp;
    // Problematic only in "w" or "a" mode. All "+" modes and "r" are fine.
    fp = fopen("/tmp/ff_jdl", "w");
    if (fp == NULL) {
        perror("fopen");
        return EXIT_FAILURE;
    }

    fprintf(stdout, "Opened ff_jdl\n");
    sleep(1);

    // Delete the file while the stream is still open.
    fprintf(stdout, "Deleting ff_jdl\n");
    unlink("/tmp/ff_jdl");
    sleep(2);

    // The checkpoint request is where the error above is reported.
    dmtcpCheckpoint();

    fprintf(stdout, "I have returned\n");
    sleep(2);
    return 0;
}
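
A possible explanation for the mode dependence (an assumption on my part, not verified against the DMTCP source): a stream opened with "w" or "a" has a write-only descriptor, and read() on a descriptor that is not open for reading fails with EBADF, which would match the "Bad file descriptor" in the error. The sketch below shows that behaviour without DMTCP. The helper name read_errno_after_unlink is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of DMTCP): opens `path` with the given
 * fopen mode, unlinks it, then attempts a one-byte read() on the
 * underlying descriptor. Returns 0 if read() did not fail, the errno
 * value if it did, or -1 if the file could not be opened. */
int read_errno_after_unlink(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;

    /* Same situation as in delete_test.c: the name is gone, the fd is not. */
    unlink(path);

    char byte;
    errno = 0;
    ssize_t n = read(fileno(fp), &byte, 1);
    int err = (n == -1) ? errno : 0;

    fclose(fp);
    return err;
}
```

With "w" or "a" the helper should return EBADF, while "w+" and "a+" should return 0, because reading an empty file that is open for reading yields end-of-file rather than an error. If that is what happens inside writeFileFromFd, the deletion matters only because it causes the file contents to be read back through the descriptor.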

Joshua Louie
