I will let Xiangrui comment on the PR process for adding the code to MLlib, but I would love to look into your initial version if you push it to GitHub...
As far as I remember, Quoc got his best ANN results using the back-propagation algorithm solved with CG... do you have those features, or are you using an SGD-style update?

On Mon, Jun 30, 2014 at 8:13 PM, Bert Greevenbosch <bert.greevenbo...@huawei.com> wrote:

> Hi Debasish, Alexander, all,
>
> Indeed I found the OpenDL project through the Powered by Spark page. I'll
> need some time to look into the code, but at first sight it looks quite
> well developed. I'll contact the author about this too.
>
> My own implementation (in Scala) works for multiple inputs and multiple
> outputs. It implements a single hidden layer; the number of nodes in it
> can be specified.
>
> The implementation is a general ANN implementation. As such, it should be
> usable for an autoencoder too, since that is just an ANN with some
> special input/output constraints.
>
> As said before, the implementation is built upon the linear regression
> model and gradient descent implementation. However, it did require some
> tweaks:
>
> - The linear regression model only supports a single output "label" (as a
> Double). Since the ANN can have multiple outputs, it ignores the "label"
> attribute, but for training divides the input vector into two parts: the
> first part being the genuine input vector, the second the target output
> vector.
>
> - The concatenation of input and target output vectors is only internal;
> the training function takes as input an RDD with tuples of two Vectors,
> one for the input and one for the output.
>
> - The GradientDescent optimizer is re-used without modification.
>
> - I have made an even simpler updater than the SimpleUpdater, leaving out
> the division by the square root of the number of iterations. The
> SimpleUpdater can also be used, but I created this simpler one because I
> like to plot the result every now and then, and then continue the
> calculations. For this, I also wrote a training function that takes as
> input the weights from the previous training session.
>
> - I created a ParallelANNModel similar to the LinearRegressionModel.
>
> - I created a new GeneralizedSteepestDescendAlgorithm class similar to
> the GeneralizedLinearAlgorithm class.
>
> - Created some example code to test with 2D (1 input, 1 output), 3D (2
> inputs, 1 output) and 4D (1 input, 3 outputs) functions.
>
> If there is interest, I would be happy to release the code. What would be
> the best way to do this? Is there some kind of review process?
>
> Best regards,
> Bert
>
>
> > -----Original Message-----
> > From: Debasish Das [mailto:debasish.da...@gmail.com]
> > Sent: 27 June 2014 14:02
> > To: dev@spark.apache.org
> > Subject: Re: Artificial Neural Network in Spark?
> >
> > Look into the Powered by Spark page... I found a project there which
> > used autoencoder functions... It hasn't been updated for a long time
> > now!
> >
> > On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander
> > <alexander.ula...@hp.com> wrote:
> >
> > > Hi Bert,
> > >
> > > It would be extremely interesting. Do you plan to implement an
> > > autoencoder as well? It would be great to have deep learning in
> > > Spark.
> > >
> > > Best regards, Alexander
> > >
> > > 27.06.2014, 4:47, "Bert Greevenbosch" <bert.greevenbo...@huawei.com>
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I was wondering whether Spark/MLlib supports Artificial Neural
> > > > Networks (ANNs)?
> > > >
> > > > If not, I am currently working on an implementation of it. I re-use
> > > > the code for linear regression and gradient descent as much as
> > > > possible.
> > > >
> > > > Would the community be interested in such an implementation? Or
> > > > maybe somebody is already working on it?
> > > >
> > > > Best regards,
> > > > Bert
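For readers following along: the input/target concatenation trick Bert describes (packing the genuine input vector and the target output vector into one "feature" vector so an optimizer that expects a single vector per example can be reused) can be sketched roughly as below. This is a hedged illustration with plain arrays standing in for MLlib Vectors; the names `ConcatSketch`, `pack`, and `split` are hypothetical and not from the actual patch.

```scala
// Sketch of the concatenation scheme: each training example packs the
// input and the target output into one array; the gradient computation
// later splits them apart using the known input dimension.
// These helper names are illustrative, not Bert's actual API.
object ConcatSketch {
  // Concatenate input and target into one combined "feature" vector.
  def pack(input: Array[Double], target: Array[Double]): Array[Double] =
    input ++ target

  // Recover the two parts inside the gradient computation.
  def split(packed: Array[Double], inputDim: Int): (Array[Double], Array[Double]) =
    (packed.take(inputDim), packed.drop(inputDim))
}
```

Under this scheme the optimizer never needs to know that the tail of each vector is really a target, which is presumably why `GradientDescent` could be reused unmodified.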
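The "even simpler" updater mentioned above amounts to dropping the 1/sqrt(iteration) step-size decay that MLlib's SimpleUpdater applies, so the step size stays constant and training can be paused, plotted, and resumed without the schedule restarting. A minimal sketch of the two schedules, again with plain arrays in place of MLlib Vectors (the object and method names are hypothetical):

```scala
import scala.math.sqrt

// Sketch comparing the two step-size schedules discussed in the thread.
// decayingStep mirrors SimpleUpdater's stepSize / sqrt(iter) shrinkage;
// constantStep is the "even simpler" variant with no decay.
object UpdaterSketch {
  // SimpleUpdater-style step: effective step size shrinks as 1/sqrt(iter).
  def decayingStep(w: Array[Double], g: Array[Double],
                   stepSize: Double, iter: Int): Array[Double] = {
    val s = stepSize / sqrt(iter.toDouble)
    w.zip(g).map { case (wi, gi) => wi - s * gi }
  }

  // Constant step: the same update on every iteration, so resuming
  // training from saved weights behaves identically to never stopping.
  def constantStep(w: Array[Double], g: Array[Double],
                   stepSize: Double): Array[Double] =
    w.zip(g).map { case (wi, gi) => wi - stepSize * gi }
}
```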