On 13/05/16 08:08 AM, Rohan Garg wrote:

> Could you try with `dmtcp_launch --disable-alloc-plugin`? Does that
> help? It's not the final solution, of course, but I just want to
> isolate and do sanity tests.

Sorry, I did not know about this.  Judging by the stack
trace, and how it "worked" when I commented out the
"free" body, the memory alloc routines seem to be
where the problem lies for sure.  And it turns out they
appear to work on all my systems if I use that flag.

Does that just mean that the memory alloc routines
used are the raw ones?  And it can still successfully
checkpoint?  What does one lose then by using this flag?
(I have not yet run a big program for example to see if
the saved image is much larger like it was when I just
didn't free anything).

It looks like the problem lies, on my systems, with
the newest version of glibc2.

So, here are the three systems I tested this time:

A: m3-6Y30, Fedora 23, gcc 5.3.1, glibc 2.22, kernel 4.5.3, 64 bit
B: Atom N550, Fedora 23, gcc 5.3.1, glibc 2.22, kernel 4.4.9, 32 bit
C: i3-540, Fedora 21, gcc 4.8.3, glibc 2.18, kernel 4.3.3, 64 bit

I tried
        2.4.4
        2.5rc1 and
        3.0 git zip of May 12

on all systems.

I tried starting a session with ocaml
(O) and a session with python (P).  In each
I just defined a variable, checkpointed,
attempted to restart, and attempted
to interrupt with ctrl-c.

------------------------------------------

Results System C, with the older glibc2:

All three dmtcp versions WORK with both ocaml and
python, and handle interrupt correctly.  So no
regression there.

------------------------------------------

Results System B: old netbook, 32 bit, with
newest glibc2 (I probably don't need to use it,
but it gives another test result):

NONE of them work.  Not even the ones
that used to work before I recently
updated it to new glibc2.

They all seem to show an infinite loop
involving some alloc routines, and they
all work if I start with --disable-alloc-plugin,
including handling ctrl-C.

------------------------------------------

Results System A: new ultrabook, 64 bit with
newest glibc2:

NONE of them work; all show a similar kind of
infinite loop, and ALL WORK with
--disable-alloc-plugin, including
handling ctrl-C.

(It was apparently only with my lobotomized
version with the "free" code removed that
it didn't handle ctrl-C).

------------------------------------------

> Interesting! So, it seems like we missed out on some corner case.
> Could you please share the stack trace? Also, is it easy to isolate
> it to a simple test case that you could share with us? It'll be
> easy to debug if we can reproduce it locally.

It doesn't seem like a "corner case" in the code as
it fails on absolutely anything I call it with.
Maybe a corner case for environments, if the phrase makes
sense there. For a simple test case, "dmtcp_launch python" fails,
as I mentioned, or even "dmtcp_launch who".

I hope there is *something* you can do to
reproduce it locally.  What could it be on
my systems (both of them with the new glibc2)
that makes it fail that you cannot easily
reproduce?

Here are the tops of the stack traces
on B without --disable-alloc-plugin (enough to
show the loop):

2.4.4:
> #0  0x00007ffff650915a in do_sym () from /lib64/libc.so.6
> #1  0x00007ffff6509543 in _dl_vsym () from /lib64/libc.so.6
> #2  0x00007ffff69aa198 in dlvsym_doit () from /lib64/libdl.so.2
> #3  0x00007ffff7deb5f4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
> #4  0x00007ffff69aa631 in _dlerror_run () from /lib64/libdl.so.2
> #5  0x00007ffff69aa1ed in dlvsym () from /lib64/libdl.so.2
> #6  0x00007ffff7121522 in initialize_libpthread_wrappers () at syscallsreal.c:315
> #7  0x00007ffff70dfcc9 in dmtcp_prepare_wrappers () at dmtcpworker.cpp:152
> #8  0x00007ffff7bd9b8f in malloc (size=118) at alloc/mallocwrappers.cpp:40
> #9  0x00007ffff7deb3c1 in _dl_signal_error () from /lib64/ld-linux-x86-64.so.2
> #10 0x00007ffff7deb573 in _dl_signal_cerror () from /lib64/ld-linux-x86-64.so.2
> #11 0x00007ffff7de6303 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
> #12 0x00007ffff6509161 in do_sym () from /lib64/libc.so.6
> #13 0x00007ffff6509543 in _dl_vsym () from /lib64/libc.so.6
> #14 0x00007ffff69aa198 in dlvsym_doit () from /lib64/libdl.so.2
> #15 0x00007ffff7deb5f4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
> #16 0x00007ffff69aa631 in _dlerror_run () from /lib64/libdl.so.2
> #17 0x00007ffff69aa1ed in dlvsym () from /lib64/libdl.so.2
> #18 0x00007ffff7121522 in initialize_libpthread_wrappers () at syscallsreal.c:315
> #19 0x00007ffff70dfcc9 in dmtcp_prepare_wrappers () at dmtcpworker.cpp:152
> #20 0x00007ffff7bd9b8f in malloc (size=118) at alloc/mallocwrappers.cpp:40

2.5:
> #0  0x00007ffff7deb58d in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
> #1  0x00007ffff69aa631 in _dlerror_run () from /lib64/libdl.so.2
> #2  0x00007ffff69aa148 in dlsym () from /lib64/libdl.so.2
> #3  0x00007ffff7121624 in initialize_libc_wrappers () at syscallsreal.c:256
> #4  dmtcp_prepare_wrappers () at syscallsreal.c:302
> #5  0x00007ffff7bd9b8f in malloc (size=109) at alloc/mallocwrappers.cpp:40
> #6  0x00007ffff7deb3c1 in _dl_signal_error () from /lib64/ld-linux-x86-64.so.2
> #7  0x00007ffff7deb573 in _dl_signal_cerror () from /lib64/ld-linux-x86-64.so.2
> #8  0x00007ffff7de6303 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
> #9  0x00007ffff6509161 in do_sym () from /lib64/libc.so.6
> #10 0x00007ffff6509543 in _dl_vsym () from /lib64/libc.so.6
> #11 0x00007ffff69aa198 in dlvsym_doit () from /lib64/libdl.so.2
> #12 0x00007ffff7deb5f4 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
> #13 0x00007ffff69aa631 in _dlerror_run () from /lib64/libdl.so.2
> #14 0x00007ffff69aa1ed in dlvsym () from /lib64/libdl.so.2
> #15 0x00007ffff71223ad in initialize_libc_wrappers () at syscallsreal.c:267
> #16 dmtcp_prepare_wrappers () at syscallsreal.c:302

3.0:
> #0  free (ptr=0x63d010) at alloc/mallocwrappers.cpp:72
> #1  0x00007ffff699f715 in _dlerror_run () from /lib64/libdl.so.2
> #2  0x00007ffff699f148 in dlsym () from /lib64/libdl.so.2
> #3  0x00007ffff7bd9da4 in free (ptr=0x63d010) at alloc/mallocwrappers.cpp:74
> #4  0x00007ffff699f715 in _dlerror_run () from /lib64/libdl.so.2
> #5  0x00007ffff699f148 in dlsym () from /lib64/libdl.so.2
> #6  0x00007ffff7bd9da4 in free (ptr=0x63d010) at alloc/mallocwrappers.cpp:74
> #7  0x00007ffff699f715 in _dlerror_run () from /lib64/libdl.so.2
> #8  0x00007ffff699f148 in dlsym () from /lib64/libdl.so.2
> #9  0x00007ffff7bd9da4 in free (ptr=0x63d010) at alloc/mallocwrappers.cpp:74

_______________________________________________
Dmtcp-forum mailing list
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
