Hi Nick,
Have you looked to see if there is an old coordinator still running?
If there is, try killing it, and then try the restart again.
We recently improved the error message about "funny state" to make
this possibility clearer.
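
As a rough sketch of one way to check for a leftover coordinator: the
Python snippet below only tests whether anything accepts TCP connections
on the coordinator's port. It assumes the coordinator is on the local host
and on port 7779 (the usual DMTCP default, as far as I recall); the
function name and its parameters are made up for illustration, so adjust
the host and port to your setup. Note that it cannot tell whether the
listener is really a dmtcp_coordinator.

```python
import socket


def coordinator_listening(host="localhost", port=7779, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    The default port 7779 is an assumption; use whatever port your
    coordinator was started with.
    """
    try:
        # A successful connect means some process is listening there.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if coordinator_listening():
        print("Something is listening on the coordinator port; "
              "an old coordinator may still be running.")
    else:
        print("Nothing is listening on the coordinator port.")
```

If this reports a listener when you expected none, that is a hint that an
old coordinator is still around and should be killed before the restart.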
And as always, if you have a DMTCP bug that you can reproduce, we'd
be very eager to get a copy of the code (or a small test case) that
demonstrates the bug.
Best,
- Gene
On Mon, Mar 12, 2012 at 07:52:38PM +0000, Harezga, Nick wrote:
> Hi all,
>
> For the purposes of getting a demonstration running, we have decided to
> attach both the IRC server and IRC client to the dmtcp_coordinator. We are
> able to successfully checkpoint all programs, with the client and server
> running on different virtual machines. We can restart everything when
> starting the IRC server first, but if we kill the IRC server while the
> clients are running, the server gives us the following error when attempting
> to restart.
>
> Message: Coordinator in a funny state. Peers exist, not restarting,
> but not in a running state. Checkpointing?
> Or maybe restarting and running with peers existing?
>
> It then advised to use the utils/dmtcp_backtrace.py utility to dump the error
> output, which I have attached below.
>
> Examing stack for call frames from:
> /usr/local/bin/dmtcp_restart
> FORMAT: FNC: ..., followed by file:line_number (most recent first).
>
> ** FNC: writeBacktrace
> dmtcp-1.2.4/dmtcp/src/../jalib/jassert.cpp:193
> ** FNC: jassert_internal::JAssert::jbacktrace()
> dmtcp-1.2.4/dmtcp/src/../jalib/jassert.cpp:228
> ** FNC: ~JAssert
> dmtcp-1.2.4/dmtcp/src/../jalib/jassert.cpp:116
> ** FNC: dmtcp::DmtcpCoordinatorAPI::recvCoordinatorHandshake(int*)
> dmtcp-1.2.4/dmtcp/src/dmtcpcoordinatorapi.cpp:224
> ** FNC: restoreSockets
> dmtcp-1.2.4/dmtcp/src/dmtcp_restart.cpp:704
> ** FNC: main
> dmtcp-1.2.4/dmtcp/src/dmtcp_restart.cpp:925
> /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x469113]
> ** FNC: _start
> ??:0
>
> Any ideas? Is there a reason that we wouldn't be able to restart the server
> while the clients are still running?
>
> Thanks,
> Nick
> _______________________________________________
> Dmtcp-forum mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum