Hi all:

I am using DMTCP on Debian Linux with the NAS benchmark suite. I have
2 PCs running Debian GNU/Linux wheezy/sid (with OpenMPI); each PC has an
Intel Core(TM) i5 CPU 750 at 2.67GHz with 4 cores.

When I run the benchmarks across the 2 computers, one process keeps
running on the coordinator after the program finishes, and the program
does not return to the shell prompt.
That is, I start the program on 8 cores, DMTCP has the 8 processes and
takes the checkpoints, but when the program finishes one process keeps
running, and DMTCP keeps checkpointing that single process at
the specified interval.

What I do at this point is press "k" at the coordinator to kill
the remaining process. The remaining process is orterun, as can be seen
in the coordinator output pasted below.

Any idea why I have this issue? I would really like to get rid of this behavior.


[4423] NOTE at dmtcp_coordinator.cpp:643 in onData; REASON='locking all nodes'
[4423] NOTE at dmtcp_coordinator.cpp:678 in onData; REASON='draining all nodes'
[4423] NOTE at dmtcp_coordinator.cpp:684 in onData; REASON='checkpointing all nodes'
[4423] NOTE at dmtcp_coordinator.cpp:694 in onData; REASON='building name service database'
[4423] NOTE at dmtcp_coordinator.cpp:713 in onData; REASON='entertaining queries now'
[4423] NOTE at dmtcp_coordinator.cpp:718 in onData; REASON='refilling all nodes'
[4423] NOTE at dmtcp_coordinator.cpp:747 in onData; REASON='restarting all nodes'
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-4434-5378c252
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-4439-5378c252
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-4436-5378c252
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-3230-5378c255
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-3234-5378c255
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-4442-5378c252
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-3228-5378c254
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-3236-5378c255
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-3232-5378c255
l
Client List:
#, PROG[PID]@HOST, DMTCP-UNIQUEPID, STATE
1, orterun[4427]@debian-testing-marina, 111ee9e3-4427-5378c252, RUNNING
k
[4423] NOTE at dmtcp_coordinator.cpp:571 in handleUserCommand; REASON='Killing all connected Peers...'
[4423] NOTE at dmtcp_coordinator.cpp:919 in onDisconnect; REASON='client disconnected'
     client.identity() = 111ee9e3-4427-5378c252
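
To make the symptom easier to check without reading the coordinator output by eye, here is a rough sketch of a script that parses the client list printed by the "l" command above and reports whether only the launcher (orterun) is still connected. It is not part of DMTCP: the function names parse_client_list and only_launcher_left are made up for illustration, and the line format is assumed to be exactly the "#, PROG[PID]@HOST, DMTCP-UNIQUEPID, STATE" layout shown in the paste.

```python
import re

# One entry of the coordinator's client list, in the layout seen in the paste:
#   1, orterun[4427]@debian-testing-marina, 111ee9e3-4427-5378c252, RUNNING
# (Assumption: other DMTCP versions may print a different layout.)
CLIENT_RE = re.compile(
    r"^\s*\d+,\s*(?P<prog>[^\[\s]+)\[(?P<pid>\d+)\]@(?P<host>\S+),"
    r"\s*(?P<uid>\S+),\s*(?P<state>\S+)\s*$"
)


def parse_client_list(text):
    """Return one dict per client entry; header and log lines are skipped."""
    clients = []
    for line in text.splitlines():
        match = CLIENT_RE.match(line)
        if match:
            entry = match.groupdict()
            entry["pid"] = int(entry["pid"])
            clients.append(entry)
    return clients


def only_launcher_left(clients, launcher="orterun"):
    """True when at least one client remains and all of them are the launcher."""
    return bool(clients) and all(c["prog"] == launcher for c in clients)


if __name__ == "__main__":
    sample = (
        "Client List:\n"
        "#, PROG[PID]@HOST, DMTCP-UNIQUEPID, STATE\n"
        "1, orterun[4427]@debian-testing-marina, 111ee9e3-4427-5378c252, RUNNING\n"
    )
    remaining = parse_client_list(sample)
    for client in remaining:
        print(client["prog"], client["pid"], client["host"], client["state"])
    print("only launcher left:", only_launcher_left(remaining))
```

The text passed to parse_client_list would be the coordinator output, however it is captured; the script only reads it and does not send any command to the coordinator.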


Thanks in advance!!
Regards
Marina

_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
