meanmee removed a comment on issue #12363: distributed training notebook tests
URL: https://github.com/apache/incubator-mxnet/issues/12363#issuecomment-416146366

I updated the version to 1.2.1 with pip, but a new error shows up :(

xiaomin.wu@iva0605:/autofs/data56/public/xiaomin.wu/code/dist_mxnet_test$ python /home/xiaomin.wu/anaconda2/lib/python2.7/site-packages/mxnet/tools/launch.py -n 2 -s 2 -H hosts --sync-dst-dir /home/xiaomin.wu/cifar10_dist --launcher ssh "python /autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/cifar10_dist.py"
2018-08-27 15:54:12,883 INFO rsync /autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/ -> 10.14.6.5:/home/xiaomin.wu/cifar10_dist
xiaomin.wu@10.14.6.5's password:
/home/xiaomin.wu/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
2018-08-27 15:54:18,462 INFO rsync /autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/ -> 10.14.6.8:/home/xiaomin.wu/cifar10_dist
xiaomin.wu@10.14.6.5's password: xiaomin.wu@10.14.6.5's password:
Traceback (most recent call last):
  File "/autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/cifar10_dist.py", line 23, in <module>
    import mxnet as mx
ImportError: No module named mxnet
Traceback (most recent call last):
  File "/autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/cifar10_dist.py", line 23, in <module>
    import mxnet as mx
ImportError: No module named mxnet
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/site-packages/dmlc_tracker/ssh.py", line 61, in run
    subprocess.check_call(prog, shell = True)
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command 'ssh -o StrictHostKeyChecking=no 10.14.6.8 -p 22 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64::/usr/local/cuda-8.0/lib64:~/TensorRT-4.0.0.3/lib; export DMLC_ROLE=worker; export DMLC_PS_ROOT_PORT=9093; export DMLC_PS_ROOT_URI=10.14.6.5; export DMLC_NUM_SERVER=2; export DMLC_NUM_WORKER=2; cd /home/xiaomin.wu/cifar10_dist; python /autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/cifar10_dist.py'' returned non-zero exit status 1

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/site-packages/dmlc_tracker/ssh.py", line 61, in run
    subprocess.check_call(prog, shell = True)
  File "/home/xiaomin.wu/anaconda2/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command 'ssh -o StrictHostKeyChecking=no 10.14.6.8 -p 22 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64::/usr/local/cuda-8.0/lib64:~/TensorRT-4.0.0.3/lib; export DMLC_ROLE=server; export DMLC_PS_ROOT_PORT=9093; export DMLC_PS_ROOT_URI=10.14.6.5; export DMLC_NUM_SERVER=2; export DMLC_NUM_WORKER=2; cd /home/xiaomin.wu/cifar10_dist; python /autofs/data56/public/xiaomin.wu/code/dist_mxnet_test/cifar10_dist.py'' returned non-zero exit status 1