[jira] [Updated] (MXNET-22) Support for distributed training based on NCCL Kvstore

2018-02-26 Thread Rahul Huilgol (JIRA)

 [ https://issues.apache.org/jira/browse/MXNET-22?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Huilgol updated MXNET-22:
---
Component/s: MXNet Engine

> Support for distributed training based on NCCL Kvstore
> --
>
> Key: MXNET-22
> URL: https://issues.apache.org/jira/browse/MXNET-22
> Project: Apache MXNet
> Issue Type: New Feature
> Components: MXNet Engine
> Reporter: Rahul Huilgol
> Priority: Major
>
> The NCCL kvstore is currently supported on a single machine only. When 
> training is distributed across machines, there is no way to use NCCL to 
> aggregate gradients from the different GPUs within each machine.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@mxnet.apache.org
For additional commands, e-mail: issues-h...@mxnet.apache.org



[jira] [Created] (MXNET-22) Support for distributed training based on NCCL Kvstore

2018-02-26 Thread Rahul Huilgol (JIRA)
Rahul Huilgol created MXNET-22:
--

     Summary: Support for distributed training based on NCCL Kvstore
         Key: MXNET-22
         URL: https://issues.apache.org/jira/browse/MXNET-22
     Project: Apache MXNet
  Issue Type: New Feature
    Reporter: Rahul Huilgol


The NCCL kvstore is currently supported on a single machine only. When training 
is distributed across machines, there is no way to use NCCL to aggregate 
gradients from the different GPUs within each machine.
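
The gap described above can be pictured as a two-stage reduction: gradients are first summed across the GPUs inside each machine (the step NCCL would handle), and the per-machine results are then summed across machines (the step a distributed kvstore would handle). The following is only a minimal sketch of that idea in plain Python. It does not use MXNet or NCCL, and the names `sum_vectors` and `hierarchical_aggregate` are hypothetical, chosen for illustration.

```python
from typing import List

Vector = List[float]


def sum_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally sized gradient vectors."""
    return [sum(values) for values in zip(*vectors)]


def hierarchical_aggregate(cluster: List[List[Vector]]) -> Vector:
    """Aggregate gradients in two stages.

    cluster[m][g] is the gradient held by GPU g on machine m.
    """
    # Stage 1 (assumed to be the NCCL step): reduce across the GPUs
    # inside each machine.
    per_machine = [sum_vectors(gpu_grads) for gpu_grads in cluster]
    # Stage 2 (assumed to be the distributed kvstore step): reduce the
    # per-machine results across machines.
    return sum_vectors(per_machine)


# Hypothetical example: two machines with two GPUs each.
cluster = [
    [[1.0, 2.0], [3.0, 4.0]],  # machine 0
    [[5.0, 6.0], [7.0, 8.0]],  # machine 1
]
total = hierarchical_aggregate(cluster)
print(total)
```

The two-stage result equals a flat sum over all GPUs. The point of the hierarchy is that only one gradient per machine has to cross the network.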


