Sorry, Germain! And thanks again :)

On Tue, 16 Jul 2019 at 15:00, Massy Bourennani <massybourenn...@gmail.com>
wrote:

> Hi David!
> This helps a lot.
> Many thanks :)
> Massy
>
> On Tue, 16 Jul 2019 at 11:12, Germain Tanguy <
> germain.tan...@dailymotion.com> wrote:
>
>> Hello Massy,
>>
>> I just answered on Reddit; I am copying my answer here in case someone
>> else is interested.
>>
>>
>>
>> Dataflow supports Python 3.5
>> <https://beam.apache.org/roadmap/python-sdk/#python-3-support>.
>>
>>
>>
>> At my company we use apache-beam/Dataflow in production, with a setup.py
>> to install dependencies, including non-Python ones
>> <https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#nonpython>
>>  such as polyglot
>> <https://polyglot.readthedocs.io/en/latest/Installation.html>. The
>> juliaset example is a helpful starting point.
>>
>> We have the same constraint as you regarding data scientists (DS), but on
>> our side it is mainly TensorFlow.
>>
>>
>>
>> Feel free to take a look at this article
>> <https://medium.com/dailymotion/collaboration-between-data-engineers-data-analysts-and-data-scientists-97c00ab1211f>,
>> which gives an overview of how we work with DS.
>>
>>
>>
>>
>>
>> You should be able to wrap apache-beam/Dataflow code so that it exposes
>> the same syntax as sklearn. DS will then be able to work autonomously at
>> scale, without needing to know the complexity of the cluster-computing
>> framework.
>>
>>
>>
>> Hope this helps.
>>
>>
>>
>> Germain.
>>
>>
>>
>> *From: *Massy Bourennani <massybourenn...@gmail.com>
>> *Reply-To: *"user@beam.apache.org" <user@beam.apache.org>
>> *Date: *Tuesday 16 July 2019 at 10:49
>> *To: *"user@beam.apache.org" <user@beam.apache.org>
>> *Subject: *Industrializing batch ML algorithm using Apache Beam/Dataflow
>> (on Google Cloud Platform)
>>
>>
>>
>> Hi all,
>>
>> Here is the link to the Reddit post[1]
>>
>> Many thanks for your help.
>>
>> Massy
>>
>>
>>
>> [1]
>> https://www.reddit.com/r/dataengineering/comments/cdp5i3/industrializing_batch_ml_algorithm_using_apache/
>>
>
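
For anyone finding this thread later: the sklearn-style wrapping suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions, not Dailymotion's actual code. The class name BeamBatchPredictor, its predict_fn argument, and the one-record-per-line text input are all hypothetical; only the Beam calls themselves (ReadFromText, Map, WriteToText, PipelineOptions) are the regular SDK API.

```python
class BeamBatchPredictor:
    """Hypothetical sklearn-like facade over a Beam batch pipeline.

    predict_fn: a picklable callable applied to each input line
                (for example, a function that parses the line and
                calls a trained model).
    pipeline_args: list of Beam command-line style flags; left empty,
                   Beam falls back to its default local runner.
    """

    def __init__(self, predict_fn, pipeline_args=None):
        self.predict_fn = predict_fn
        self.pipeline_args = list(pipeline_args or [])

    def predict(self, input_pattern, output_prefix):
        # Beam is imported lazily so the class can be defined and
        # unit-tested on a machine without the SDK installed.
        import apache_beam as beam
        from apache_beam.options.pipeline_options import PipelineOptions

        options = PipelineOptions(self.pipeline_args)
        # The context manager runs the pipeline and waits for it to finish.
        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                | "Read" >> beam.io.ReadFromText(input_pattern)
                | "Predict" >> beam.Map(self.predict_fn)
                | "Format" >> beam.Map(str)
                | "Write" >> beam.io.WriteToText(output_prefix)
            )
        # Returning self mirrors the chaining style of sklearn estimators.
        return self
```

A data scientist would then only write something like BeamBatchPredictor(my_predict).predict("input/*.txt", "output/preds"), while the data engineering side decides which flags go into pipeline_args (for example --runner=DataflowRunner plus project and staging settings).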
