Hi Calvin,

Thanks for this feedback. Concerning rc3, there are still
some issues with a 32-bit O/S. You wrote that you were testing
Fedora 19 on the Atom N550. Is that running in 32-bit mode?

Also, just to confirm, I assume that you are interrupting a function
prior to checkpointing, and then checkpointing and restarting.
It's in this case that you see a segfault on restart.
Is this correct?

rc2 and rc3 have known bugs in terms of supporting 32-bit mode.
After rc1, we changed our restart algorithm a little. In the
next few updates this week, we're hoping to fix the 32-bit mode.

Thank you for the further details. We'll especially
look into why ocaml should be more sensitive than R/python.

Best wishes,
- Gene

On Sun, Apr 26, 2015 at 05:07:40AM -0400, Calvin Ostrum wrote:
> Latest results on my experimentation here. Short form: rc1 seems to
> work on Atom N550, but still not rc2 or rc3.
>
> For some reason, finally seem to have gotten a saved checkpoint that
> works all the time on the Fedora 20 (kernel 3.17.7) i3 540, with rc2.
> Ran it about 50 times in a row with no segfault.
>
> Also have tried building and running each of 2.3.1 and 2.4.0 rc1,
> rc2, rc3 on the Fedora 19 (kernel 3.14.23) Atom N550.
>
> Results: 2.3.1 works like the packaged version, but that means
> control-c ends all checkpointed shells tested with (R,python,ocaml).
>
> rc2 and rc3 (just put up for download hours ago) still crash on every
> checkpointed shell at the same place shown with strace.
>
> However, *rc1* *works* each time I try on each of these shells, and
> handles control-C correctly with R and Python. However, with ocaml,
> when one hits control-C, ocaml prints its "interrupted" message (so
> the signal does get to it fine) but then the whole thing quits.
>
> Now, I noticed that ocaml is actually bytecode for the ocaml
> interpreter, run by the shell using a bang line. That could easily
> mess things up, I suppose. So I tried a checkpoint running the ocaml
> bytecode interpreter directly, passing ocaml into it. And... that
> seems to work. Tried it many times in a row. Runs okay (after
> loading in a huge ocaml program) and control-c works as it should.
>
> So hopefully this is some kind of clue. I assume rc1 works on this
> system. But, rc2 and rc3 don't. Without any understanding of what
> these programs truly do I don't know what other information I can
> provide but will try to provide more if told what I could gather.
> Maybe it is just something in the configure/make that differs?
_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum