You are not the first :) and probably not the fifth to ask this question. A parameter server is not included in the Spark framework, and I've seen all kinds of hacks to improvise one: a REST API, HDFS, Tachyon, etc. I'm not sure whether an 'official' benchmark and implementation will be released soon.
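To make the idea concrete, here is a toy sketch of what a parameter server does, written in plain single-process Python. It is not Spark code and not any official API: the names ParameterServer, worker_gradient and train are made up for illustration, the model is a tiny linear regression, and the "workers" run one after another instead of in parallel. In the hacks I mentioned, the pull/push calls would go over something like a REST endpoint or shared storage.

```python
# Toy, single-process sketch of the parameter-server pattern (no Spark involved).
# All names here are hypothetical and chosen only for illustration.

class ParameterServer:
    """Holds the shared model weights; workers pull them and push gradients."""

    def __init__(self, dim, lr):
        self.weights = [0.0] * dim
        self.lr = lr

    def pull(self):
        # Return a copy so a worker cannot mutate the shared state directly.
        return list(self.weights)

    def push(self, grad):
        # Apply one gradient-descent step with the pushed gradient.
        self.weights = [w - self.lr * g for w, g in zip(self.weights, grad)]


def worker_gradient(weights, partition):
    """Mean squared-error gradient of a linear model on one data partition."""
    grad = [0.0] * len(weights)
    for x, y in partition:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(partition)
    return grad


def train(partitions, dim, lr=0.05, epochs=500):
    server = ParameterServer(dim, lr)
    for _ in range(epochs):
        for part in partitions:  # on a real cluster these would run in parallel
            server.push(worker_gradient(server.pull(), part))
    return server.weights


# Data for y = 2*x0 + 1, where the second feature is a constant bias term.
data = [([float(i), 1.0], 2.0 * i + 1.0) for i in range(-3, 4)]
partitions = [data[:4], data[4:]]  # stand-in for RDD partitions
weights = train(partitions, dim=2)
```

Each partition plays the role of the data-parallel side (what an RDD already gives you); the server object is the piece Spark does not provide out of the box.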
On 9 January 2015 at 10:59, Marco Shaw <marco.s...@gmail.com> wrote:

> Pretty vague on details:
>
> http://www.datasciencecentral.com/m/blogpost?id=6448529%3ABlogPost%3A227199
>
>
> On Jan 9, 2015, at 11:39 AM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:
>
> Hi all,
>
> Deep learning algorithms are popular and achieve state-of-the-art
> performance on several real-world machine learning problems. Currently
> there is no DL implementation in Spark, and I wonder if there is any
> ongoing work on this topic.
>
> We can do DL in Spark with Sparkling Water and H2O, but this adds an
> additional software stack.
>
> Deeplearning4j seems to implement a distributed version of many popular
> DL algorithms. Porting DL4j to Spark could be interesting.
>
> Google describes an implementation of large-scale DL in this paper:
> http://research.google.com/archive/large_deep_networks_nips2012.html.
> It is based on model parallelism and data parallelism.
>
> So, I'm trying to imagine what a good design for DL algorithms in Spark
> would be. Spark already has RDDs (for data parallelism). Can GraphX be
> used for model parallelism (as DNNs are generally designed as DAGs)? And
> what about using GPUs for local parallelism (a mechanism to push
> partitions into GPU memory)?
>
>
> What do you think about this?
>
>
> Cheers,
>
> Jao