I fixed the typo. :-) Thank you very much! See also https://twitter.com/eddieyoon/status/616515242701357056

Below the quoted thread I have appended two rough, illustrative sketches: one of the current two-phase training loop and one of the proposed parameter-server approach.
> Are you planning other modes of training? (aka Backpropagation/online/etc.)

I haven't thought about it yet.

--
Best Regards, Edward J. Yoon

-----Original Message-----
From: Julio Pires [mailto:[email protected]]
Sent: Friday, July 03, 2015 10:10 AM
To: [email protected]
Subject: Re: Future plan for large scale DNN.

Hi Edward, very nice!

1) Are you planning other modes of training? (aka Backpropagation/online/etc.)
2) I suggest changing setMomemtum to setMome*n*tum.

Thanks!

2015-07-02 4:50 GMT-03:00 Edward J. Yoon <[email protected]>:
> Here's a new user interface design idea I propose. Any advice is welcome!
>
> https://wiki.apache.org/hama/Neuron
>
> On Mon, Jun 29, 2015 at 4:38 PM, Edward J. Yoon <[email protected]> wrote:
> > Hey all,
> >
> > As you know, the latest Apache Hama provides distributed training of an
> > Artificial Neural Network using its BSP computing engine. In general,
> > the training data is stored in HDFS and is distributed across multiple
> > machines. In Hama, two kinds of components are involved in the training
> > procedure: the master task and the groom tasks. The master task is in
> > charge of merging the model-update information and sending it to all
> > the groom tasks. The groom tasks are in charge of computing the weight
> > updates from the training data.
> >
> > The training procedure is iterative, and each iteration consists of two
> > phases: update weights and merge updates. In the update-weights phase,
> > each groom task first updates its local model according to the message
> > received from the master task. It then computes the weight updates
> > locally on its assigned data partition (mini-batch SGD) and finally
> > sends the updated weights to the master task. In the merge-update
> > phase, the master task updates the model according to the messages
> > received from the groom tasks, then distributes the updated model to
> > all groom tasks. The two phases repeat alternately until the
> > termination condition is met (a specified number of iterations is
> > reached).
> >
> > The model is designed in a hierarchical way. The base class is more
> > abstract than the derived classes, so that the structure of the ANN
> > model can be freely set by the user, as long as it is a layered model.
> > Therefore, the Perceptron, Auto-encoder, Linear and Logistic regressor
> > can all be uniformly represented by an ANN.
> >
> > However, as described above, currently only data parallelism is used.
> > Each node holds a copy of the model. In each iteration, the computation
> > is conducted on each node and a final aggregation is conducted on one
> > node; the updated model is then synchronized to every node. So,
> > performance aside, the parameters must fit into the memory of a single
> > machine.
> >
> > Here is a tentative near-term plan I propose for applications that need
> > a large model with huge memory consumption, moderate computational
> > power per mini-batch, and lots of training data. The main idea is to
> > use a Parameter Server to parallelize model creation and distribute
> > training across machines. The Apache Hama framework assigns each split
> > of the training data stored in HDFS to a BSP task. The BSP task then
> > assigns each of its N threads a small portion of work, much smaller
> > than 1/Nth of the total size of a mini-batch, and assigns new portions
> > whenever they are free. With this approach, faster threads do more work
> > than slower threads.
> > Each thread asynchronously asks the Parameter Server, which stores the
> > parameters on distributed machines, for an updated copy of the model,
> > computes the gradients on its assigned data, and sends the updated
> > gradients back to the Parameter Server. This architecture is inspired
> > by Google's DistBelief (Jeff Dean et al., 2012). Finally, I have no
> > concrete idea regarding the programming interface at the moment, but
> > I'll try to provide a neuron-centric programming model like Google's
> > Pregel if possible.
> >
> > --
> > Best Regards, Edward J. Yoon
>
> --
> Best Regards, Edward J. Yoon
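To make the above a bit more concrete, here is a rough, self-contained Java sketch of the current two-phase (update-weights / merge-update) loop: the grooms compute mini-batch gradients on their own data partitions, and the master merges them and applies the result. All class and method names are made up for illustration; this is not the actual Hama API, and the real work of course happens across machines via BSP supersteps rather than in a single process.

import java.util.Arrays;

public class TwoPhaseTrainingSketch {

  static final int GROOMS = 4;            // number of groom tasks
  static final int DIM = 3;               // model size (number of weights)
  static final double LEARNING_RATE = 0.1;

  public static void main(String[] args) {
    double[] model = new double[DIM];     // global model kept by the master task

    for (int iteration = 0; iteration < 10; iteration++) {
      // Phase 1 (update weights): every groom receives the current model and
      // computes a gradient on a mini-batch from its own data partition.
      double[][] gradients = new double[GROOMS][];
      for (int groom = 0; groom < GROOMS; groom++) {
        gradients[groom] = localGradient(model, groom);
      }

      // Phase 2 (merge update): the master averages the grooms' gradients,
      // applies them to the model, and re-broadcasts the updated model.
      double[] merged = new double[DIM];
      for (double[] g : gradients) {
        for (int i = 0; i < DIM; i++) {
          merged[i] += g[i] / GROOMS;
        }
      }
      for (int i = 0; i < DIM; i++) {
        model[i] -= LEARNING_RATE * merged[i];
      }
    }
    System.out.println("final model: " + Arrays.toString(model));
  }

  // Stand-in for the per-groom mini-batch gradient computation.
  static double[] localGradient(double[] model, int groom) {
    double[] gradient = new double[DIM];
    for (int i = 0; i < DIM; i++) {
      gradient[i] = model[i] - (groom + 1);  // toy objective, illustration only
    }
    return gradient;
  }
}

The sketch only shows the data flow of the two phases; the messaging and synchronization between tasks are left out.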

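And here is a similarly rough sketch of the proposed parameter-server direction: worker threads asynchronously pull the current parameters, compute gradients on small portions of a mini-batch, and push the updates back, so faster threads naturally pick up more portions. Again, every name here (ParameterServer, pull, push) is hypothetical; the single synchronized object stands in for a store that would really be sharded across machines.

import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParameterServerSketch {

  // Toy in-memory stand-in for a parameter server; a real one would shard
  // the parameters across machines and serve them over the network.
  static class ParameterServer {
    private final double[] params = new double[3];

    synchronized double[] pull() {          // hand out an updated copy of the model
      return params.clone();
    }

    synchronized void push(double[] gradient, double learningRate) {
      for (int i = 0; i < params.length; i++) {
        params[i] -= learningRate * gradient[i];
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ParameterServer server = new ParameterServer();
    ExecutorService workers = Executors.newFixedThreadPool(4);

    // Each submitted task is one small portion of a mini-batch, much smaller
    // than 1/Nth of it, so faster threads pick up more portions over time.
    for (int portion = 0; portion < 32; portion++) {
      final int id = portion;
      workers.submit(() -> {
        double[] model = server.pull();                // ask for the current parameters
        double[] gradient = new double[model.length];
        for (int i = 0; i < model.length; i++) {
          gradient[i] = model[i] - (id % 3);           // toy gradient, illustration only
        }
        server.push(gradient, 0.01);                   // send the update back asynchronously
      });
    }

    workers.shutdown();
    workers.awaitTermination(1, TimeUnit.MINUTES);
    System.out.println("params: " + Arrays.toString(server.pull()));
  }
}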