On the contrary, distributed deep learning is not data parallel. It's
dominated by the need to share parameters across workers.
Gourav, I don't understand what you're looking for. Have you looked at
Petastorm and Horovod? They _use Spark_, not another platform like Ray. Why
recreate something that has worked for years? What would it matter if it
were in the Spark project? I think you're out on a limb there.
One goal of Spark is very much _not_ to build in everything that could exist
as a library, and distributed deep learning remains an important but niche
use case. Instead, Spark provides the infrastructure these things build on,
such as barrier execution mode.

On Thu, Feb 24, 2022 at 7:21 AM Bitfox <bit...@bitfox.top> wrote:

> I have been using TensorFlow for a long time, and it's not hard to
> implement a distributed training job at all, either by model parallelism
> or data parallelism. I don't think there is much need to extend Spark to
> support TensorFlow jobs. Just my thoughts...
>
>
> On Thu, Feb 24, 2022 at 4:36 PM Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I do not think that there is any reason for using over-engineered
>> platforms like Petastorm and Ray, except for certain use cases.
>>
>> What Ray is doing, except for certain use cases, could have been done
>> easily in SPARK, I think, had the open source community received that
>> steer. But maybe I am wrong, and someone should be able to explain why
>> the SPARK open source community cannot develop capabilities that are so
>> natural to almost all data processing use cases in SPARK where the data
>> gets consumed by deep learning frameworks, instead of asking us to use
>> Ray or Petastorm.
>>
>> For those of us who are asking what native integration means, please
>> compare the delta between releases 2.x and 3.x, and Koalas before 3.2
>> and after 3.2.
>>
>> I am sure that the SPARK community can push for extending DataFrames
>> from SPARK to deep learning and other frameworks by natively integrating
>> them.
>>
>>
>> Regards,
>> Gourav Sengupta
>>
>>