Thank you for your answers! It is clear that each DL framework can handle distributed model training on its own (some better than others). Still, I see a lot of value in having Spark for the ETL/pre-processing part, hence my question. I am trying to avoid managing multiple stacks/workflows and am hoping to unify my system. Projects like TensorFlowOnSpark or Analytics-Zoo (to name a couple) feel like they could help. I would really appreciate your comments, and those of anyone who could add to this discussion. Does anyone have experience with them?
Thanks

On Sat, May 4, 2019 at 8:01 PM Pat Ferrel <p...@occamsmachete.com> wrote:

> @Riccardo
>
> Spark does not do the DL learning part of the pipeline (afaik) so it is
> limited to data ingestion and transforms (ETL). It therefore is optional
> and other ETL options might be better for you.
>
> Most of the technologies @Gourav mentions have their own scaling based on
> their own compute engines specialized for their DL implementations, so be
> aware that Spark scaling has nothing to do with scaling most of the DL
> engines; they have their own solutions.
>
> From: Gourav Sengupta <gourav.sengu...@gmail.com>
> Reply: Gourav Sengupta <gourav.sengu...@gmail.com>
> Date: May 4, 2019 at 10:24:29 AM
> To: Riccardo Ferrari <ferra...@gmail.com>
> Cc: User <user@spark.apache.org>
> Subject: Re: Deep Learning with Spark, what is your experience?
>
> Try using MXNet and Horovod directly as well (I think that MXNet is worth
> a try as well):
> 1. https://medium.com/apache-mxnet/distributed-training-using-apache-mxnet-with-horovod-44f98bf0e7b7
> 2. https://docs.nvidia.com/deeplearning/dgx/mxnet-release-notes/rel_19-01.html
> 3. https://aws.amazon.com/mxnet/
> 4. https://aws.amazon.com/blogs/machine-learning/aws-deep-learning-amis-now-include-horovod-for-faster-multi-gpu-tensorflow-training-on-amazon-ec2-p3-instances/
>
> Of course TensorFlow is backed by Google's advertisement team as well
> https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-training-with-tensorflow/
>
> Regards,
>
> On Sat, May 4, 2019 at 10:59 AM Riccardo Ferrari <ferra...@gmail.com>
> wrote:
>
>> Hi list,
>>
>> I am trying to understand if it makes sense to leverage Spark as an
>> enabling platform for Deep Learning.
>>
>> My open questions to you are:
>>
>> - Do you use Apache Spark in your DL pipelines?
>> - How do you use Spark for DL? Is it just a stand-alone stage in the
>> workflow (i.e. a data preparation script) or is it more integrated?
>>
>> I see a major advantage in leveraging Spark as a unified entrypoint,
>> for example you can easily abstract data sources and leverage existing
>> team skills for data pre-processing and training. On the flip side you may
>> hit some limitations including supported versions and so on.
>> What is your experience?
>>
>> Thanks!