Hi Disha,

Which use case do you have in mind that would require model parallelism? It would need to have a very large number of weights, so that the model could not fit into the memory of a single machine. For example, multilayer perceptron topologies used for speech recognition have up to 100M weights. Present hardware is capable of accommodating this in main memory. That might be a problem for GPUs, but that is a different topic.
The straightforward way to do model parallelism for fully connected neural networks is to distribute horizontal (or vertical) blocks of the weight matrices across several nodes. That means the input data has to be replicated on all of these nodes. The forward and backward passes then require re-assembling the outputs and the errors on each of the nodes after every layer, because each node holds only a part of the weights and can therefore produce only partial results. By my estimates, this is inefficient due to the large intermediate traffic between the nodes, and it should be used only if the model does not fit in the memory of a single machine.

Another way to do model parallelism would be to represent the network as a graph and use GraphX to write the forward and backward propagation. However, this option does not seem very practical to me.

Best regards, Alexander

From: Disha Shrivastava [mailto:dishu....@gmail.com]
Sent: Tuesday, December 08, 2015 11:19 AM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Data and Model Parallelism in MLPC

Hi Alexander,

Thanks for your response. Can you suggest ways to incorporate model parallelism in MLPC? I am trying to do the same in Spark. I got hold of your post http://apache-spark-developers-list.1001551.n3.nabble.com/Model-parallelism-with-RDD-td13141.html where you have divided the weight matrix across different worker machines. I have two basic questions in this regard:

1. How can one visualize/analyze and control how the nodes of the neural network / the weights are divided across different workers?
2. Is there any alternate way to achieve model parallelism for MLPC in Spark? I believe we need some kind of synchronization and control for updating the weights shared across different workers during backpropagation.

Looking forward to your views on this.
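[The block-row partitioning scheme Alexander describes above can be sketched in a single process. This is only an illustrative toy, not Spark code: `num_nodes`, `forward_partitioned`, and the sigmoid activation are assumptions for the sketch, and the all-gather is simulated with a simple concatenation.]

```python
# Hypothetical single-process sketch of block-row model parallelism:
# each "node" holds a horizontal block of a layer's weight matrix,
# the input activations are replicated on every node, and the partial
# outputs must be re-assembled (an all-gather) after every layer.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_partitioned(x, weights, num_nodes):
    """Forward pass where each layer's weight matrix is split into
    num_nodes row blocks, as if each block lived on a separate worker."""
    a = x
    for W in weights:
        # Each node multiplies its block by the (replicated) activations...
        blocks = np.array_split(W, num_nodes, axis=0)
        partials = [Wb @ a for Wb in blocks]      # local partial results
        # ...then the partial outputs are re-assembled on every node.
        a = sigmoid(np.concatenate(partials))     # simulated all-gather
    return a

# A small two-layer MLP: 8 inputs -> 6 hidden units -> 4 outputs.
weights = [rng.standard_normal((6, 8)), rng.standard_normal((4, 6))]
x = rng.standard_normal(8)

y_parallel = forward_partitioned(x, weights, num_nodes=2)

# Reference: the same network run without any partitioning.
y_serial = x
for W in weights:
    y_serial = sigmoid(W @ y_serial)

assert np.allclose(y_parallel, y_serial)
```

[Note that the concatenation step happens once per layer on every node, which is exactly the per-layer intermediate traffic the message above flags as the main inefficiency of this scheme.]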
Thanks and Regards, Disha

On Wed, Dec 9, 2015 at 12:36 AM, Ulanov, Alexander <alexander.ula...@hpe.com> wrote:

Hi Disha,

The multilayer perceptron classifier in Spark implements data parallelism.

Best regards, Alexander

From: Disha Shrivastava [mailto:dishu....@gmail.com]
Sent: Tuesday, December 08, 2015 12:43 AM
To: dev@spark.apache.org; Ulanov, Alexander
Subject: Data and Model Parallelism in MLPC

Hi,

I would like to know whether the implementation of MLPC in the latest released version of Spark (1.5.2) implements model parallelism and data parallelism as done in the DistBelief model implemented by Google: http://static.googleusercontent.com/media/research.google.com/hi//archive/large_deep_networks_nips2012.pdf

Thanks and Regards, Disha