[ 
https://issues.apache.org/jira/browse/SINGA-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwei closed SINGA-226.
-------------------------
    Resolution: Fixed

> Add parallel training on a single machine for singa v1.0
> --------------------------------------------------------
>
>                 Key: SINGA-226
>                 URL: https://issues.apache.org/jira/browse/SINGA-226
>             Project: Singa
>          Issue Type: New Feature
>            Reporter: Wang Ji
>            Assignee: Wang Ji
>
> In this ticket, we implement parallel training using multiple devices on a 
> single machine. 
> To support parallel training, an Updater class needs to be implemented to 
> aggregate the partial gradients from parallel workers and use an Optimizer to 
> update the Parameters. The Updater can be designed for different kinds of 
> topology, i.e., *local-cpu*, *local-gpu*, *local-allreduce*. 
> *local-cpu:* Aggregate and update parameters on the host CPU. In this mode, 
> the host CPU needs to copy the gradient and parameter tensors from the GPU 
> workers, perform the update, and copy the results back.
> *local-gpu:* Aggregate and update parameters on a chosen GPU. In this mode, 
> the updater GPU needs to copy the gradient and parameter tensors from the 
> other GPU workers, perform the update, and copy the results back.
> *local-allreduce:* In this mode, each parameter is sliced across all GPU 
> workers. In each iteration, gradients are aggregated and parameters updated 
> in an MPI AllReduce style.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
