If you need ConvNets and RNNs and want to stay in Scala/Java, then Deeplearning4j (DL4J) might be the most mature option.

If you want ConvNets and RNNs as implemented in TensorFlow, along with all the bells and whistles, then you might want to switch to PySpark + TensorFlow and write the entire pipeline in Python. You'd do the data preparation/ingestion in PySpark and pass the data to TensorFlow for the ML part. There are 2 supported modes here:

1) Simultaneous multi-model training (a.k.a. embarrassingly parallel: each node has the entire data and model):
https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html

2) Data parallelism (data is distributed, each node has the entire model):
There are some prototypes out there, and TensorSpark seems to be the most mature:
https://github.com/adatao/tensorspark
It implements Downpour/Asynchronous SGD for the distributed training; it remains to be stress-tested with large datasets, however. More info:
https://arimo.com/machine-learning/deep-learning/2016/arimo-distributed-tensorflow-on-spark/

TensorFrames does not allow distributed training, and I did not see any performance benchmarks last time I checked.

Alexander Ulanov of HP gave a presentation on the options a few months ago:
https://www.oreilly.com/learning/distributed-deep-learning-on-spark

Masood

------------------------------
Masood Krohy, Ph.D.
Data Scientist, Intact Lab-R&D
Intact Financial Corporation

From: Benjamin Kim <bbuil...@gmail.com>
To: janardhan shetty <janardhan...@gmail.com>
Cc: Gourav Sengupta <gourav.sengu...@gmail.com>, user <user@spark.apache.org>
Date: 2016-11-01 13:14
Subject: Re: Deep learning libraries for scala

To add, I see that Databricks has been busy integrating deep learning more into their product and put out a new article about this.
https://databricks.com/blog/2016/10/27/gpu-acceleration-in-databricks.html

An interesting tidbit at the bottom of the article mentions TensorFrames.
https://github.com/databricks/tensorframes

Seems like an interesting direction…

Cheers,
Ben

On Oct 19, 2016, at 9:05 AM, janardhan shetty <janardhan...@gmail.com> wrote:

Agreed. But as it states, deeper integration with Scala is yet to be developed. Any thoughts on how to use TensorFlow with Scala? We would need to write wrappers, I think.

On Oct 19, 2016 7:56 AM, "Benjamin Kim" <bbuil...@gmail.com> wrote:

On that note, here is an article from Databricks on using TensorFlow in conjunction with Spark.
https://databricks.com/blog/2016/01/25/deep-learning-with-apache-spark-and-tensorflow.html

Cheers,
Ben

On Oct 19, 2016, at 3:09 AM, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:

When using deep learning, you might want to stay as close to TensorFlow as possible. There is very little translation loss, and you get access to stable, scalable, and tested libraries from the best brains in the industry. As far as Scala goes, it helps a lot to think of the language as a tool for accessing algorithms in this instance, unless you want to start developing algorithms from the ground up (in which case you might not require any libraries at all).

On Sat, Oct 1, 2016 at 3:30 AM, janardhan shetty <janardhan...@gmail.com> wrote:

Hi,

Are there any good libraries that can be used for deep learning models in Scala? How can we integrate TensorFlow with Scala ML?
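
Addendum on the two modes listed in the top reply of this thread (simultaneous multi-model training vs. data parallelism): the difference may be easier to see in a toy sketch. The code below is plain Python only. It does not use Spark, TensorFlow, or TensorSpark, and every function name in it is hypothetical. The "workers" are simulated sequentially in one process, and the model is a one-parameter linear regression, so this only illustrates the shape of the two approaches, not how any of the linked projects actually implement them.

```python
import random


def make_data(n=200, seed=0):
    """Toy 1-D regression data, y = 3x + small noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0.0, 0.01)) for x in xs]


def gradient(w, batch):
    """Gradient of the mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def loss(w, data):
    """Mean squared error of the model y_hat = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train_full(data, lr, steps=100):
    """Plain gradient descent: one worker holding all the data and the model."""
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, data)
    return w


def multi_model_search(data, learning_rates):
    """Mode 1: every 'node' trains its own model on the ENTIRE dataset.

    Only the hyperparameter differs per node. On a cluster, this dict
    comprehension would be a parallel map over the learning rates.
    """
    results = {lr: train_full(data, lr) for lr in learning_rates}
    best_lr = min(results, key=lambda lr: loss(results[lr], data))
    return best_lr, results[best_lr]


def data_parallel_train(data, n_workers=4, lr=0.1, rounds=100):
    """Mode 2: the data is split into shards; every worker has the whole model.

    Loosely inspired by parameter-server training: in each round all workers
    compute a gradient from the same (by then stale) snapshot of w, and the
    server applies the gradients one at a time. Real asynchronous SGD has no
    rounds at all; this is a simplified, sequential simulation.
    """
    shards = [data[i::n_workers] for i in range(n_workers)]
    w = 0.0  # the parameter held by the simulated parameter server
    for _ in range(rounds):
        snapshot = w
        grads = [gradient(snapshot, shard) for shard in shards]
        for g in grads:
            w -= lr * g
    return w


if __name__ == "__main__":
    data = make_data()
    print(multi_model_search(data, [0.01, 0.1, 0.5]))
    print(data_parallel_train(data))
```

In mode 1 nothing is exchanged between nodes during training, which is why it parallelizes trivially but requires the full dataset to fit on each node. In mode 2 only gradients and parameters move, so the dataset can be larger than any single node, at the cost of coordinating updates.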