Github user sethah commented on the issue: https://github.com/apache/spark/pull/13621

@avulanov I used this implementation to run a simple single-layer autoencoder on the MNIST dataset. I also implemented the same autoencoder in Keras/Theano and ran it on the MNIST data. With Spark, I got very poor results.

First, here are the encode/decode results from Keras with a cross-entropy loss function on the output and sigmoid activations:

![image](https://cloud.githubusercontent.com/assets/7275795/17375073/59b14faa-5964-11e6-943f-d2e1db06089d.png)

The implementation in this patch yielded very similar results:

![image](https://cloud.githubusercontent.com/assets/7275795/17374543/fb923c42-5961-11e6-8c97-dfa7626c4cc3.png)

Finally, here is the Keras implementation using ReLU activations:

![image](https://cloud.githubusercontent.com/assets/7275795/17375464/ebe1b8d2-5965-11e6-964f-fa8cc1c2a4f5.png)

It appears the sigmoid activations are saturating during training and preventing the algorithm from learning. If you have any thoughts or suggestions for improving these results, I'd really appreciate them.

Does it make sense to add another algorithm based on MLP/NN when the current functionality is so limited? If the autoencoder is not useful with only sigmoid activations, I'd vote for focusing on adding new activations before adding another algorithm. I'm not an expert here, so I would really appreciate your thoughts. Thanks!
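For reference, here is a minimal sketch of the kind of Keras autoencoder used in the comparison above. The layer sizes, optimizer, and training settings are assumptions on my part, not the exact script I ran; it only illustrates the sigmoid-vs-ReLU setup being discussed.

```python
# Minimal single-layer MNIST autoencoder sketch (Keras Sequential API).
# encoding_dim, optimizer, and epochs are illustrative assumptions.
from keras.models import Sequential
from keras.layers import Dense

def build_autoencoder(activation='sigmoid', encoding_dim=32, input_dim=784):
    model = Sequential()
    # Encoder: compress 784-pixel MNIST images to a small code.
    model.add(Dense(encoding_dim, activation=activation, input_dim=input_dim))
    # Decoder: reconstruct the image; sigmoid keeps outputs in [0, 1].
    model.add(Dense(input_dim, activation='sigmoid'))
    # Cross-entropy loss on the output, as in the experiment above.
    model.compile(optimizer='adadelta', loss='binary_crossentropy')
    return model

# Sigmoid hidden units saturated; switching to ReLU gave the better
# reconstructions shown in the last image:
# model = build_autoencoder(activation='relu')
# model.fit(x_train, x_train, epochs=50, batch_size=256)
```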