I'm a beginner with DMTCP and am trying to run an MPI application. My
environment, running under the KVM hypervisor:
Master/coordinator:
OS: RHEL Server 7.6
CPUs: 2
RAM: 2048

Node1 and Node 2 (each):
OS: RHEL Server 7.6
CPUs: 1
RAM: 1024

Running DMTCP without Open MPI works fine, and so does running Open MPI
(mpirun) without DMTCP. Only when I run DMTCP together with Open MPI do I get
the error below:

Command used:
dmtcp_launch mpirun -host 192.168.122.225,192.168.122.163 gmx mdrun -s 
ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile

"[48000] WARNING at signalwrappers.cpp:141 in sigaction; 
REASON='JWARNING(false) failed'
     "Application trying to use DMTCP's signal for it's own use.\n" "  You 
should employ a different signal by setting the\n" "  environment variable 
DMTCP_SIGCKPT to the number\n" "  of the signal that DMTCP should use for 
checkpointing." = Application trying to use DMTCP's signal for it's own use.
  You should employ a different signal by setting the
  environment variable DMTCP_SIGCKPT to the number
  of the signal that DMTCP should use for checkpointing.
     stopSignal = 12
[48000] WARNING at socketconnection.cpp:219 in TcpConnection; 
REASON='JWARNING(false) failed'
     type = 2
Message: Datagram Sockets not supported. Hopefully, this is a short lived 
connection!
[49000] NOTE at ssh.cpp:423 in prepareForExec; REASON='New ssh command'
     newCommand = /usr/local/bin/dmtcp_ssh --ssh-slave 
/usr/local/bin/dmtcp_nocheckpoint /usr/bin/ssh -x 192.168.122.225 
/usr/local/bin/dmtcp_launch --coord-host 0.0.0.0 --coord-port 7779 --ckptdir 
/home/roribeir /usr/local/bin/dmtcp_sshd  --ssh-slave  orted --hnp-topo-sig 
0N:2S:2L3:2L2:2L1:2C:2H:x86_64 -mca ess "env" -mca orte_ess_jobid "1796145152" 
-mca orte_ess_vpid 1 -mca orte_ess_num_procs "2" -mca orte_hnp_uri 
"1796145152.0;tcp://192.168.122.158:50099" --tree-spawn -mca plm "rsh" 
--tree-spawn
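
The warning above names the DMTCP_SIGCKPT environment variable as the way to
move DMTCP off signal 12. A minimal sketch of that form, reusing the same
mpirun command line. The value 16 is only an illustrative assumption (whether
the application leaves that signal free has not been checked), and the guard
only keeps the snippet harmless on a machine where DMTCP is not installed:

```shell
# Assumption: 16 is only an illustrative signal number; pick one that the
# application does not install its own handler for.
DMTCP_SIGCKPT=16
export DMTCP_SIGCKPT

# Launch only if DMTCP is actually installed on this machine.
if command -v dmtcp_launch >/dev/null 2>&1; then
    dmtcp_launch mpirun -host 192.168.122.225,192.168.122.163 \
        gmx mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout \
        -nsteps 10000 -g logfile
else
    echo "dmtcp_launch not found in PATH; DMTCP_SIGCKPT=$DMTCP_SIGCKPT" >&2
fi
```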

I tried changing the signal to 16 and to 21 with the --ckpt-signal option,
but it still doesn't work; only the error output changed:

Command:
dmtcp_launch --ckpt-signal 21 mpirun -host 192.168.122.225 gmx mdrun -s 
ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile

Error:
[51000] WARNING at socketconnection.cpp:219 in TcpConnection; 
REASON='JWARNING(false) failed'
     type = 2
Message: Datagram Sockets not supported. Hopefully, this is a short lived 
connection!
[52000] NOTE at ssh.cpp:423 in prepareForExec; REASON='New ssh command'
     newCommand = /usr/local/bin/dmtcp_ssh --ssh-slave 
/usr/local/bin/dmtcp_nocheckpoint /usr/bin/ssh -x 192.168.122.225 
/usr/local/bin/dmtcp_launch --coord-host 0.0.0.0 --coord-port 7779 
--ckpt-signal 21 --ckptdir /home/roribeir /usr/local/bin/dmtcp_sshd  
--ssh-slave  orted --hnp-topo-sig 0N:2S:2L3:2L2:2L1:2C:2H:x86_64 -mca ess "env" 
-mca orte_ess_jobid "397869056" -mca orte_ess_vpid 1 -mca orte_ess_num_procs 
"2" -mca orte_hnp_uri "397869056.0;tcp://192.168.122.158:43551" --tree-spawn 
-mca plm "rsh" --tree-spawn
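
To see which signals the numbers in this post correspond to, the shell's
built-in `kill -l` can translate a number into a name. A small sketch, where
sig_name is a hypothetical helper introduced only for illustration. The mapping
is platform dependent; on Linux x86_64, 12 is SIGUSR2, which is the stopSignal
value reported in the first warning:

```shell
# sig_name is a hypothetical helper: print the name of a signal number
# using the shell's built-in `kill -l`.
sig_name() {
    kill -l "$1"
}

# The three signal numbers that appear in this post.
for sig in 12 16 21; do
    printf '%s -> SIG%s\n' "$sig" "$(sig_name "$sig")"
done
```

Checking the names this way helps when choosing a replacement for
--ckpt-signal, since some numbers belong to signals the system or the MPI
runtime may already use.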



--
Best regards,

Rodrigo Vitor Ribeiro

Intern

Red Hat <https://www.redhat.com>

