Hi,

I am sure those who have actually built a data processing pipeline whose
output then has to be delivered to TensorFlow or PyTorch (not for a POC,
a blog post written for clicks, or a fix for symptomatic bugs, but for a
real-life end-to-end application) will understand some of the issues,
because Spark DataFrames do not natively integrate with TensorFlow or
PyTorch.

But perhaps I am wrong.
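
To make the integration gap above a little more concrete, here is a minimal
sketch of the kind of glue code such a pipeline tends to need: turning the
rows of one partition into fixed-size (features, labels) batches that a
training loop could consume. It uses only the Python standard library; the
`Row` alias, the `batch_partition` helper, and the column names `t`, `load`
and `y` are hypothetical and exist only for illustration.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

# Stand-in for a Spark Row converted to a dict; the schema is hypothetical.
Row = Dict[str, float]


def batch_partition(
    rows: Iterable[Row],
    feature_cols: List[str],
    label_col: str,
    batch_size: int,
) -> Iterator[Tuple[List[List[float]], List[float]]]:
    """Group the rows of one partition into (features, labels) batches."""
    it = iter(rows)
    while True:
        # Pull at most batch_size rows; the last batch may be smaller.
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        features = [[row[c] for c in feature_cols] for row in chunk]
        labels = [row[label_col] for row in chunk]
        yield features, labels


# Toy partition of five rows with made-up column names.
partition = [{"t": float(i), "load": 2.0 * i, "y": 3.0 * i} for i in range(5)]
batches = list(batch_partition(partition, ["t", "load"], "y", batch_size=2))
```

In PySpark, a function of this shape would typically be applied per
partition, for example through `rdd.mapPartitions` with each row converted
via `Row.asDict()`. That wiring is an assumption here and is left out so the
sketch runs without a cluster.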

My point in mentioning Ray is simple: if Spark were able to natively
scale out and distribute data to TensorFlow or PyTorch, then Ray and Spark
would be in direct competition.
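
As a rough illustration of what "scaling out" training means in this
context, the sketch below simulates data-parallel training in plain Python:
the data is split into shards, each hypothetical worker computes a gradient
on its own shard, and the gradients are averaged before every update. This
is a toy model of the allreduce-style approach that tools like Horovod take,
not the API of Horovod, Ray, or Spark; all function names are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def local_gradient(w: float, shard: List[Sample]) -> float:
    """Gradient of the mean squared error for y ~ w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values: List[float]) -> float:
    """Stand-in for an allreduce: every worker would receive this average."""
    return sum(values) / len(values)


def train(shards: List[List[Sample]], steps: int, lr: float) -> float:
    """Run synchronous data-parallel gradient descent over the shards."""
    w = 0.0
    for _ in range(steps):
        # In a real system each shard's gradient is computed on its own worker.
        grads = [local_gradient(w, shard) for shard in shards]
        w -= lr * allreduce_mean(grads)
    return w


# Synthetic data following y = 3x, split across four hypothetical workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w = train(shards, steps=200, lr=0.01)
```

Because the shards are of equal size, the averaged gradient matches the
gradient over the full data set, so `w` converges to the slope of the
synthetic data.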

Regards,
Gourav Sengupta

On Wed, Feb 23, 2022 at 12:35 PM Sean Owen <sro...@gmail.com> wrote:

> Spark does do distributed ML, but not TensorFlow. Barrier execution mode
> is an element that things like Horovod use. Not sure what you are getting
> at?
> Ray is not Spark.
> As I say -- Horovod does this already. Its upside over TF distributed is
> that Spark sets up and manages the daemon processes, rather than you doing
> it by hand.
>
>
> On Wed, Feb 23, 2022 at 2:43 AM Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>
>> Hi,
>>
>> The Spark community should have been able to build distributed ML
>> capabilities, and as far as I remember that was the original idea behind
>> the Spark 3.x roadmap (barrier execution mode,
>> https://issues.apache.org/jira/browse/SPARK-24579).
>>
>> Ray, another project out of UC Berkeley like Spark, is trying to capture
>> that market space.
>>
>> I am not sure whether there is any steer from the Spark community leaders
>> to seriously prioritise building those capabilities at all. But I am sure
>> that if the brilliant and fantastic minds behind Spark did actually want
>> to build those capabilities, they could easily do so :)
>>
>> I would sincerely request the open source Spark community to prioritise
>> building the capabilities needed to scale ML applications.
>>
>>
>>
>> Thanks and Regards,
>> Gourav Sengupta
>>
>> On Wed, Feb 23, 2022 at 3:53 AM Bitfox <bit...@bitfox.top> wrote:
>>
>>> TensorFlow itself can do distributed computing via a
>>> parameter server. Why do you want Spark here?
>>>
>>> regards.
>>>
>>> On Wed, Feb 23, 2022 at 11:27 AM Vijayant Kumar
>>> <vijayant.ku...@mavenir.com.invalid> wrote:
>>>
>>>> Thanks, Sean, for your response!
>>>>
>>>>
>>>>
>>>> Want to add some more background here.
>>>>
>>>>
>>>>
>>>> I am using Spark 3.0+ with TensorFlow 2.0+.
>>>>
>>>> My use case is not image data but time-series data,
>>>> where I am using LSTMs and transformers for forecasting.
>>>>
>>>>
>>>>
>>>> I evaluated the *SparkFlow* and *spark_tensorflow_distributor* libraries,
>>>> but neither has seen any major development recently. I ran into version
>>>> dependency issues with both and had a hard time fixing the library
>>>> compatibility problems. Hence the questions below:
>>>>
>>>>
>>>>
>>>>    - Does *Horovod* have any dependencies?
>>>>    - Is there any other library suitable for my use case?
>>>>    - Any example code would be of great help.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Vijayant
>>>>
>>>>
>>>>
>>>> *From:* Sean Owen <sro...@gmail.com>
>>>> *Sent:* Wednesday, February 23, 2022 8:40 AM
>>>> *To:* Vijayant Kumar <vijayant.ku...@mavenir.com.invalid>
>>>> *Cc:* user @spark <user@spark.apache.org>
>>>> *Subject:* [E] COMMERCIAL BULK: Re: TensorFlow on Spark
>>>>
>>>>
>>>>
>>>> Sure, Horovod is commonly used on Spark for this:
>>>>
>>>> https://horovod.readthedocs.io/en/stable/spark_include.html
>>>>
>>>>
>>>>
>>>> On Tue, Feb 22, 2022 at 8:51 PM Vijayant Kumar <
>>>> vijayant.ku...@mavenir.com.invalid> wrote:
>>>>
>>>> Hi All,
>>>>
>>>>
>>>>
>>>> Is anyone using Apache Spark with TensorFlow to build models? My
>>>> requirement is to run TensorFlow distributed model training across the
>>>> Spark executors.
>>>>
>>>> Please help me with some resources or some sample code.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Vijayant
>>>> ------------------------------
>>>>
>>>> This e-mail message may contain confidential or proprietary information
>>>> of Mavenir Systems, Inc. or its affiliates and is intended solely for the
>>>> use of the intended recipient(s). If you are not the intended recipient of
>>>> this message, you are hereby notified that any review, use or distribution
>>>> of this information is absolutely prohibited and we request that you delete
>>>> all copies in your control and contact us by e-mailing to
>>>> secur...@mavenir.com. This message contains the views of its author
>>>> and may not necessarily reflect the views of Mavenir Systems, Inc. or its
>>>> affiliates, who employ systems to monitor email messages, but make no
>>>> representation that such messages are authorized, secure, uncompromised, or
>>>> free from computer viruses, malware, or other defects. Thank You
>>>>
>>>
