indhub opened a new issue #11442: No module named 'dmlc_tracker'
URL: https://github.com/apache/incubator-mxnet/issues/11442

## Description

`pip install mxnet-cu91` (or `mxnet-cu90`) does not install dmlc-tracker. Running distributed training after pip install gives this error:

```
ubuntu@ip-172-31-33-187:~$ mxnet/tools/launch.py -n 2 -s 2 --launcher local python cifar10_dist.py
Can't load dmlc_tracker package. Perhaps you need to run
  git submodule update --init --recursive
Traceback (most recent call last):
  File "mxnet/tools/launch.py", line 128, in <module>
    main()
  File "mxnet/tools/launch.py", line 96, in main
    args = dmlc_opts(args)
  File "mxnet/tools/launch.py", line 48, in dmlc_opts
    from dmlc_tracker import opts
ImportError: No module named dmlc_tracker
```

The error goes away if dmlc-core is built separately and PYTHONPATH is set correctly. After `pip install mxnet-cu`, shouldn't users be able to run distributed training without any more installations?

## Environment info:

Ubuntu p2 instance

Package used (Python/R/Scala/Julia): 1.2

## Steps to reproduce

(Paste the commands you ran that produced the error.)

- Create an instance with 4 or more GPUs.
- wget https://raw.githubusercontent.com/indhub/mxnet/e5b89cf9d7c35ac749ed14b54c0faa6dfffa15ef/example/distributed_training/cifar10_dist.py
- mxnet/tools/launch.py -n 2 -s 2 --launcher local python cifar10_dist.py
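
The workaround mentioned in the description (building dmlc-core separately and setting PYTHONPATH) amounts to making the `dmlc_tracker` package visible on Python's import path. Below is a minimal sketch of a pre-flight check for that, assuming Python 3.4+ (the traceback above looks like Python 2, where `importlib.util.find_spec` is not available). The helper name `ensure_importable` and the directory `~/dmlc-core/tracker` are hypothetical and not taken from the issue; substitute wherever your dmlc-core checkout keeps the `dmlc_tracker` package.

```python
import importlib
import importlib.util
import os
import sys


def ensure_importable(module_name, candidate_dirs):
    """Return True if module_name can be imported.

    If it is not importable yet, look for it in candidate_dirs and put the
    first directory that contains it at the front of sys.path.
    """
    if importlib.util.find_spec(module_name) is not None:
        return True
    for directory in candidate_dirs:
        directory = os.path.abspath(os.path.expanduser(directory))
        as_package = os.path.isdir(os.path.join(directory, module_name))
        as_module = os.path.isfile(os.path.join(directory, module_name + ".py"))
        if as_package or as_module:
            sys.path.insert(0, directory)
            importlib.invalidate_caches()  # make the import system see the new path
            return importlib.util.find_spec(module_name) is not None
    return False


if __name__ == "__main__":
    # Hypothetical location of a separately built dmlc-core checkout.
    found = ensure_importable("dmlc_tracker", ["~/dmlc-core/tracker"])
    print("dmlc_tracker importable:", found)
```

Changing `sys.path` this way only affects the current Python process. Exporting PYTHONPATH in the shell, as the description suggests, is what makes the package visible to `launch.py` and the processes it starts.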