Hi Bert,

If you wish to share the code, there is a specific pull request process:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

I would be glad to benchmark your ANN implementation by running some of the 
experiments that we run with other ANN toolkits. I am also interested in 
autoencoders and plan to implement one for MLlib in the near future. 

Best regards, Alexander

-----Original Message-----
From: Bert Greevenbosch [mailto:bert.greevenbo...@huawei.com] 
Sent: Tuesday, July 01, 2014 7:14 AM
To: dev@spark.apache.org
Subject: RE: Artificial Neural Network in Spark?

Hi Debasish, Alexander, all,

Indeed, I found the OpenDL project through the Powered by Spark page. I'll need 
some time to look into the code, but at first sight it looks quite 
well-developed. I'll contact the author about this too.

My own implementation (in Scala) works for multiple inputs and multiple 
outputs. It implements a single hidden layer, whose number of nodes can be 
specified.

The implementation is a general ANN implementation. As such, it should be 
usable for an autoencoder too, since that is just an ANN with some special 
input/output constraints.
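
As an illustration of that remark: if training takes (input, target output) 
pairs, an autoencoder data set could be built by pairing each input with 
itself. The sketch below is plain Scala, not code from the implementation 
described here. Array[Double] stands in for an MLlib Vector, and the name 
toAutoencoderPairs is hypothetical.

```scala
// Hypothetical helper: builds autoencoder training pairs from unlabeled inputs.
// Array[Double] stands in for an MLlib Vector to keep the sketch self-contained.
object AutoencoderPairs {
  def toAutoencoderPairs(
      inputs: Seq[Array[Double]]): Seq[(Array[Double], Array[Double])] =
    inputs.map(x => (x, x.clone())) // the target output is a copy of the input

  def main(args: Array[String]): Unit = {
    val data = Seq(Array(0.0, 1.0), Array(0.5, 0.25))
    toAutoencoderPairs(data).foreach { case (in, out) =>
      println(in.mkString("[", ", ", "]") + " -> " + out.mkString("[", ", ", "]"))
    }
  }
}
```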

As said before, the implementation is built upon the linear regression model 
and gradient descent implementation. However, it did require some tweaks:

- The linear regression model only supports a single output "label" (a 
Double). Since the ANN can have multiple outputs, it ignores the "label" 
attribute and, for training, divides the input vector into two parts: the 
first part is the genuine input vector, the second the target output vector.

- The concatenation of input and target output vectors is internal only. The 
training function takes as input an RDD of tuples of two Vectors, one for 
the input and one for the target output.

- The GradientDescent optimizer is re-used without modification.

- I have made an even simpler updater than the SimpleUpdater, leaving out the 
division by the square root of the number of iterations. The SimpleUpdater can 
also be used, but I created this simpler one because I like to plot the result 
every now and then, and then continue the calculations. For this, I also wrote 
a training function that takes the weights from the previous training session 
as input.

- I created a ParallelANNModel similar to the LinearRegressionModel.

- I created a new GeneralizedSteepestDescendAlgorithm class similar to the 
GeneralizedLinearAlgorithm class.

- I created some example code to test with 2D (1 input, 1 output), 3D (2 
inputs, 1 output) and 4D (1 input, 3 outputs) functions.
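
The points above about the concatenated vectors and the simpler updater can be 
sketched in plain Scala as follows. This is an illustration under assumptions, 
not the code offered for release: Array[Double] stands in for MLlib Vectors, 
and the names pack, unpack, simpleStep and constantStep are hypothetical. It 
assumes the SimpleUpdater step divides the step size by the square root of 
the iteration number, as described above.

```scala
object AnnTweaksSketch {
  // Concatenate input and target output into one vector (internal to training).
  def pack(input: Array[Double], target: Array[Double]): Array[Double] =
    input ++ target

  // Split a packed vector back into (input, target), given the input dimension.
  def unpack(packed: Array[Double], numInputs: Int): (Array[Double], Array[Double]) =
    packed.splitAt(numInputs)

  // SimpleUpdater-style step: the step size shrinks with sqrt(iteration).
  def simpleStep(weights: Array[Double], gradient: Array[Double],
                 stepSize: Double, iter: Int): Array[Double] = {
    val thisStep = stepSize / math.sqrt(iter.toDouble)
    weights.zip(gradient).map { case (w, g) => w - thisStep * g }
  }

  // Simpler variant: the step size stays constant across iterations.
  def constantStep(weights: Array[Double], gradient: Array[Double],
                   stepSize: Double): Array[Double] =
    weights.zip(gradient).map { case (w, g) => w - stepSize * g }
}
```

With a constant step, a run that is stopped for plotting and restarted from 
the saved weights takes the same step size as before. With the square root 
schedule, a restart would begin again at iteration 1 unless the iteration 
count is carried over.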

If there is interest, I would be happy to release the code. What would be the 
best way to do this? Is there some kind of review process?

Best regards,
Bert


> -----Original Message-----
> From: Debasish Das [mailto:debasish.da...@gmail.com]
> Sent: 27 June 2014 14:02
> To: dev@spark.apache.org
> Subject: Re: Artificial Neural Network in Spark?
> 
> Look into the Powered by Spark page... I found a project there which used 
> autoencoder functions... It has not been updated for a long time now!
> 
> On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander 
> <alexander.ula...@hp.com> wrote:
> 
> > Hi Bert,
> >
> > It would be extremely interesting. Do you plan to implement an
> > autoencoder as well? It would be great to have deep learning in Spark.
> >
> > Best regards, Alexander
> >
> > On 27.06.2014, at 4:47, "Bert Greevenbosch" 
> > <bert.greevenbo...@huawei.com> wrote:
> >
> > > Hello all,
> > >
> > > I was wondering whether Spark/mllib supports Artificial Neural
> > > Networks (ANNs)?
> > >
> > > If not, I am currently working on an implementation of it. I re-use
> > > the code for linear regression and gradient descent as much as
> > > possible.
> > >
> > > Would the community be interested in such an implementation? Or maybe
> > > somebody is already working on it?
> > >
> > > Best regards,
> > > Bert
> >
