Hi @liu6381810, it seems there's an overhead to using multiple GPUs, and one likely source is the transfer of gradients between GPUs. Are you using an AWS EC2 p3.16xlarge instance for this, or do you have your own server? Check `nvidia-smi topo --matrix` to confirm that you have fast GPU-to-GPU communication. You could also look at gradient compression to reduce the amount of data being transferred: see [this tutorial](https://mxnet.incubator.apache.org/faq/gradient_compression.html?highlight=compression) for more information.
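To illustrate the idea behind the 2-bit compression described in that tutorial, here is a toy NumPy sketch (not MXNet's actual implementation): gradients are quantized to one of three values, and the quantization error is kept as a residual that gets folded back in on the next step so no gradient signal is permanently lost. The function name and threshold value are illustrative.

```python
import numpy as np

def two_bit_compress(grad, residual, threshold=0.5):
    """Toy threshold-based gradient quantization with error feedback."""
    # Fold in the residual (error) left over from previous rounds.
    acc = grad + residual
    # Quantize: values beyond +/- threshold are sent as +/- threshold,
    # everything else is sent as 0 (so each element needs only 2 bits).
    compressed = np.where(acc >= threshold, threshold,
                 np.where(acc <= -threshold, -threshold, 0.0))
    # Whatever was not transmitted is carried over to the next round.
    new_residual = acc - compressed
    return compressed, new_residual

grad = np.array([0.9, -0.7, 0.1, -0.05])
residual = np.zeros_like(grad)
compressed, residual = two_bit_compress(grad, residual)
# compressed -> [ 0.5, -0.5, 0.0, 0.0 ]
# residual   -> [ 0.4, -0.2, 0.1, -0.05 ]
```

In Gluon you would not write this yourself; if I remember the API correctly, you enable it by passing `compression_params={'type': '2bit', 'threshold': 0.5}` when creating the `Trainer` (or via `kvstore.set_gradient_compression`), but check the tutorial linked above for the exact parameters.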
[ Full content available at: https://github.com/apache/incubator-mxnet/issues/12577 ]
