My 2 cents:

* There is frequency-domain processing available already (e.g. the spark.ml DCT transformer), but no FFT transformer yet, because complex numbers are not currently a Spark SQL datatype.
* We shouldn't assume signals are even, so we need complex numbers to implement the FFT.
* I have not closely studied the relative performance tradeoffs, so please do let me know if there's a significant difference in practice.
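To illustrate the even-signal point above: the DFT of a general real signal is complex-valued, while an even-symmetric real signal has a purely real DFT, which is why DCT-style transforms can stay inside real-valued columns but a general FFT cannot. A minimal NumPy sketch (illustrative only, not spark.ml code):

```python
import numpy as np

# A general real signal: its DFT has non-zero imaginary parts,
# so a Spark FFT transformer would need a complex datatype.
x = np.array([1.0, 3.0, 2.0, 5.0])
X = np.fft.fft(x)
assert np.abs(X.imag).max() > 0

# An even-symmetric signal (e[n] == e[(N - n) % N]): its DFT is
# purely real, which is the symmetry DCT-style transforms exploit.
e = np.array([4.0, 2.0, 7.0, 2.0])  # e[1] == e[3]
E = np.fft.fft(e)
assert np.allclose(E.imag, 0.0)
```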
On Tue, Sep 8, 2015 at 5:46 PM, Ulanov, Alexander <alexander.ula...@hpe.com> wrote:

> That is an option too. Implementing convolutions with FFTs should be
> considered as well: http://arxiv.org/pdf/1312.5851.pdf
>
> *From:* Feynman Liang [mailto:fli...@databricks.com]
> *Sent:* Tuesday, September 08, 2015 12:07 PM
> *To:* Ulanov, Alexander
> *Cc:* Ruslan Dautkhanov; Nick Pentreath; user; na...@yandex.ru
> *Subject:* Re: Spark ANN
>
> Just wondering, why do we need tensors? Is the implementation of convnets
> using im2col (see here <http://cs231n.github.io/convolutional-networks/>)
> insufficient?
>
> On Tue, Sep 8, 2015 at 11:55 AM, Ulanov, Alexander <alexander.ula...@hpe.com> wrote:
>
> Ruslan, thanks for including me in the discussion!
>
> Dropout and other features such as the Autoencoder were implemented, but
> not merged yet, in order to leave room for improving the internal Layer
> API. For example, there is ongoing work on a convolutional layer that
> consumes/outputs 2D arrays. We’ll probably need to change the Layer’s
> input/output type to tensors. This will affect dropout, which will need
> some refactoring to handle tensors too. Also, all new components should
> have an ML pipeline public interface. There is an umbrella issue for deep
> learning in Spark, https://issues.apache.org/jira/browse/SPARK-5575, which
> includes various Autoencoder features, in particular
> https://issues.apache.org/jira/browse/SPARK-10408. You are very welcome
> to join and contribute, since there is a lot of work to be done.
>
> Best regards, Alexander
>
> *From:* Ruslan Dautkhanov [mailto:dautkha...@gmail.com]
> *Sent:* Monday, September 07, 2015 10:09 PM
> *To:* Feynman Liang
> *Cc:* Nick Pentreath; user; na...@yandex.ru
> *Subject:* Re: Spark ANN
>
> Found a dropout commit from avulanov:
> https://github.com/avulanov/spark/commit/3f25e26d10ef8617e46e35953fe0ad1a178be69d
>
> It probably hasn't made its way into MLlib (yet?).
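For readers unfamiliar with the im2col trick mentioned above: it unrolls every kernel-sized patch of the input into a column, so convolution becomes a single matrix multiply. A minimal NumPy sketch (single channel, stride 1, no padding; it computes cross-correlation, i.e. no kernel flip, as convnets typically do — not the CS231n or Spark code itself):

```python
import numpy as np

def im2col(img, k):
    """Unroll every k x k patch of a 2-D image into a column."""
    h, w = img.shape
    cols = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            cols.append(img[i:i + k, j:j + k].ravel())
    return np.array(cols).T  # shape: (k*k, num_patches)

def conv2d_im2col(img, kernel):
    """2-D cross-correlation via im2col: one matrix product."""
    k = kernel.shape[0]
    out = kernel.ravel() @ im2col(img, k)
    h, w = img.shape
    return out.reshape(h - k + 1, w - k + 1)
```

With multiple filters, the row vector `kernel.ravel()` becomes a matrix with one row per filter, so the whole layer is still one matrix multiply — which is what makes the approach BLAS-friendly.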
> --
> Ruslan Dautkhanov
>
> On Mon, Sep 7, 2015 at 8:34 PM, Feynman Liang <fli...@databricks.com> wrote:
>
> Unfortunately, not yet... Deep learning support (autoencoders, RBMs) is on
> the roadmap for 1.6 <https://issues.apache.org/jira/browse/SPARK-10324>
> though, and there is a Spark package
> <http://spark-packages.org/package/rakeshchalasani/MLlib-dropout> for
> dropout-regularized logistic regression.
>
> On Mon, Sep 7, 2015 at 3:15 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>
> Thanks!
>
> It does not look like Spark ANN supports dropout/DropConnect or any other
> techniques that help avoid overfitting yet:
>
> http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf
> https://cs.nyu.edu/~wanli/dropc/dropc.pdf
>
> P.S. There is a small copy-paste typo in
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/BreezeUtil.scala#L43
> (it should read B&C) :)
>
> --
> Ruslan Dautkhanov
>
> On Mon, Sep 7, 2015 at 12:47 PM, Feynman Liang <fli...@databricks.com> wrote:
>
> Backprop is used to compute the gradient here
> <https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala#L579-L584>,
> which is then optimized by SGD or LBFGS here
> <https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala#L878>.
>
> On Mon, Sep 7, 2015 at 11:24 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
> Haven't checked the actual code, but that doc says "MLPC employes
> backpropagation for learning the model..."?
>
> Sent from Mailbox <https://www.dropbox.com/mailbox>
>
> On Mon, Sep 7, 2015 at 8:18 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>
> http://people.apache.org/~pwendell/spark-releases/latest/ml-ann.html
>
> The implementation seems to be missing backpropagation?
> Was there a good reason to omit BP?
> What are the drawbacks of a pure feedforward-only ANN?
>
> Thanks!
>
> --
> Ruslan Dautkhanov
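For readers following the dropout discussion in this thread: the technique itself is tiny. A minimal NumPy sketch of "inverted" dropout (illustrative only; this is not the implementation in the avulanov commit, and `dropout_forward` is a hypothetical helper name):

```python
import numpy as np

def dropout_forward(x, p_drop, rng, train=True):
    """Inverted dropout: zero each activation with probability p_drop at
    train time and scale survivors by 1/(1 - p_drop), so the expected
    activation is unchanged and inference needs no rescaling."""
    if not train or p_drop == 0.0:
        return x
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask
```

At evaluation time (`train=False`) the layer is the identity, which is exactly why the inverted variant is preferred over scaling at test time.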