Sorry Germain! And thanks again :)

On Tue, 16 Jul 2019 at 15:00, Massy Bourennani <massybourenn...@gmail.com> wrote:
> Hi David!
> This helps a lot.
> Many thanks :)
> Massy
>
> On Tue, 16 Jul 2019 at 11:12, Germain Tanguy <germain.tan...@dailymotion.com> wrote:
>
>> Hello Massy,
>>
>> I just answered on Reddit; I am copying my answer here in case someone
>> else is interested.
>>
>> Dataflow supports Python 3.5
>> <https://beam.apache.org/roadmap/python-sdk/#python-3-support>.
>>
>> At my company we use apache-beam/dataflow in prod with a setup.py to
>> initialize dependencies, even non-Python ones
>> <https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#nonpython>
>> like polyglot
>> <https://polyglot.readthedocs.io/en/latest/Installation.html>. The
>> juliaset example is a helpful starting point.
>>
>> We have the same constraint as you regarding DS, but on our side it is
>> mainly tensorflow.
>>
>> Don't hesitate to take a look at this article
>> <https://medium.com/dailymotion/collaboration-between-data-engineers-data-analysts-and-data-scientists-97c00ab1211f>,
>> which gives an overview of how we work with DS.
>>
>> You should be able to wrap apache-beam/dataflow code so that it has the
>> same syntax as sklearn. Then DS will be able to work autonomously and
>> get the scalability without needing to know the complexity of the
>> cluster-computing framework.
>>
>> Hope this helps.
>>
>> Germain.
>>
>> *From: *Massy Bourennani <massybourenn...@gmail.com>
>> *Reply-To: *"user@beam.apache.org" <user@beam.apache.org>
>> *Date: *Tuesday 16 July 2019 at 10:49
>> *To: *"user@beam.apache.org" <user@beam.apache.org>
>> *Subject: *Industrializing batch ML algorithm using Apache Beam/Dataflow
>> (on Google Cloud Platform)
>>
>> Hi all,
>>
>> Here is the link to the Reddit post [1].
>>
>> Many thanks for your help.
>> Massy
>>
>> [1]
>> https://www.reddit.com/r/dataengineering/comments/cdp5i3/industrializing_batch_ml_algorithm_using_apache/
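
For anyone reading this thread later, here is a minimal sketch of what "wrapping apache-beam/dataflow code to have the same syntax as sklearn" could look like. It is not the wrapper used at Dailymotion (the thread does not show it); the class name `BeamBatchPredictor`, the `predict_fn` and `pipeline_args` parameters, and the text-in/text-out layout are all assumptions made for illustration. Beam is imported inside `predict`, so the class can be constructed and inspected on a machine without Beam installed.

```python
from typing import Any, Callable, List, Optional


class BeamBatchPredictor:
    """Hypothetical sklearn-style facade over a Beam batch pipeline."""

    def __init__(self, predict_fn: Callable[[Any], Any],
                 pipeline_args: Optional[List[str]] = None):
        # predict_fn: plain per-record function (assumed picklable so Beam
        # can ship it to the workers).
        self.predict_fn = predict_fn
        # pipeline_args: ordinary Beam command-line flags, e.g.
        # ["--runner=DirectRunner"].
        self.pipeline_args = list(pipeline_args or [])

    def get_params(self) -> dict:
        # Mirrors the sklearn convention of exposing constructor arguments.
        return {"predict_fn": self.predict_fn,
                "pipeline_args": self.pipeline_args}

    def predict(self, input_pattern: str, output_prefix: str) -> None:
        # Imported lazily: only this method needs Beam.
        import apache_beam as beam
        from apache_beam.options.pipeline_options import PipelineOptions

        options = PipelineOptions(self.pipeline_args)
        # Leaving the "with" block runs the pipeline and waits for it.
        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                | "Read" >> beam.io.ReadFromText(input_pattern)
                | "Predict" >> beam.Map(self.predict_fn)
                | "Format" >> beam.Map(str)
                | "Write" >> beam.io.WriteToText(output_prefix)
            )
```

A data scientist would then only write the per-record function and call `BeamBatchPredictor(score_line, ["--runner=DirectRunner"]).predict("input.txt", "predictions")`, where `score_line` and the file names are placeholders. Switching to Dataflow would be a matter of changing the flags (for example `--runner=DataflowRunner` and `--setup_file=./setup.py`, among the other options Dataflow requires) rather than the calling code.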