Hi all, I'm new to DMTCP.

*
I was trying to checkpoint HDDM 0.7.1 with Slurm (18.08.7) + DMTCP (2.6.0), but at some point it would hang forever.
HDDM documentation: http://ski.clps.brown.edu/hddm_docs/abstract.html

**
So I started testing whether plain Python would work, and it hung again on the SciPy test suite:

    import scipy
    scipy.test()

For scipy.test(), the only output I've got so far is:

======
============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-5.3.1, py-1.8.0, pluggy-0.13.1
rootdir: /teahome02/wgao/dmtcp-py-test
plugins: openfiles-0.4.0, doctestplus-0.5.0, astropy-header-0.1.1, arraydiff-0.3, remotedata-0.3.2
======

scipy.test() worked well without DMTCP.

***
A print("Hello World!") test with Slurm and DMTCP worked.

****
My questions are:
1) Any suggestions on checkpointing HDDM with DMTCP?
2) Any tips on why scipy.test() might hang?
3) (I know TensorFlow can do its own checkpointing.) My test with TensorFlow + DMTCP hung at "import tensorflow". Any tips?

*****
I also tried DMTCP 3 but got the same results. Our system is Ubuntu 18.04.

Any suggestions would be greatly appreciated.

Best,
Weijun

_______________________________________________
Dmtcp-forum mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
