Re: Can I use PyFlink together with PyTorch/Tensorflow/PyTorch

2021-03-16 Thread Xingbo Huang
Hi Yik San,

Thanks for investigating how PyFlink works together with these ML libraries.
IMO, you could refer to the flink-ai-extended project, which supports
TensorFlow on Flink, PyTorch on Flink, etc. Its repository URL is
https://github.com/alibaba/flink-ai-extended. Flink AI Extended is a
project that extends Flink to various machine learning scenarios, and it can
be used together with PyFlink. You can also join the group by scanning the
QR code in the README file.
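
For plain inference (as opposed to distributed training), a Python UDF runs
ordinary Python code in each parallel task, so it can call a library directly.
Below is a minimal sketch of that pattern, not a definitive implementation.
TinyLogisticModel and Predict are hypothetical names, and the model is a
hand-written stand-in for one you would load from disk with your ML library.
The pyflink import is optional so that the sketch also runs without Flink.

```python
# Sketch only: the open()/eval() shape of a PyFlink scalar UDF that scores rows
# with a pre-trained model. The model below is a hypothetical stand-in, not a
# real scikit-learn / PyTorch / TensorFlow object.
import math

try:
    # Base class for Python scalar UDFs in PyFlink.
    from pyflink.table.udf import ScalarFunction
except ImportError:
    # Assumption: fall back to a plain class so the sketch runs without Flink.
    ScalarFunction = object


class TinyLogisticModel:
    """Hypothetical stand-in for a model loaded from disk."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict_proba(self, features):
        # Logistic regression: sigmoid of the weighted sum of the features.
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))


class Predict(ScalarFunction):
    def open(self, function_context):
        # Runs once per parallel task: load the model here, not once per row.
        self.model = TinyLogisticModel(weights=[0.5, -0.25], bias=0.0)

    def eval(self, x1, x2):
        # Runs per row; each task scores its own rows independently.
        return self.model.predict_proba([x1, x2])


if __name__ == "__main__":
    fn = Predict()
    fn.open(None)  # Flink would pass a FunctionContext here
    print(fn.eval(0.0, 0.0))
```

In a real job you would wrap the class with udf(Predict(),
result_type=DataTypes.DOUBLE()) from pyflink.table.udf before using it in a
Table API or SQL query, and the ML library would need to be installed on the
workers. Please check the exact signature against the docs for your Flink
version.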

Best,
Xingbo

Yik San Chan wrote on Mon, Mar 15, 2021 at 11:06 AM:

> Hi community,
>
> I am exploring PyFlink and I wonder whether it is possible to use PyFlink
> together with the ML libraries that ML engineers normally use: PyTorch,
> TensorFlow, scikit-learn, XGBoost, LightGBM, etc.
>
> According to this SO thread
> <https://stackoverflow.com/questions/38187637/integrating-scikit-learn-with-pyspark>,
> PySpark cannot use scikit-learn directly inside a UDF because scikit-learn
> algorithms are not implemented to be distributed, while Spark runs in a
> distributed fashion.
>
> Given that PyFlink is similar to PySpark, I guess the answer may be "no". But I
> would love to double-check, and to see what I need to do to be able to define
> PyFlink UDFs that use these ML libraries.
>
>
> (This question is cross-posted on StackOverflow
> https://stackoverflow.com/questions/66631859/can-i-use-pyflink-together-with-pytorch-tensorflow-scikitlearn-xgboost-lightgbm
> )
>
>
> Thanks.
>
>
> Best,
>
> Yik San
>


Can I use PyFlink together with PyTorch/Tensorflow/PyTorch

2021-03-14 Thread Yik San Chan
Hi community,

I am exploring PyFlink and I wonder whether it is possible to use PyFlink
together with the ML libraries that ML engineers normally use: PyTorch,
TensorFlow, scikit-learn, XGBoost, LightGBM, etc.

According to this SO thread
<https://stackoverflow.com/questions/38187637/integrating-scikit-learn-with-pyspark>,
PySpark cannot use scikit-learn directly inside a UDF because scikit-learn
algorithms are not implemented to be distributed, while Spark runs in a
distributed fashion.

Given that PyFlink is similar to PySpark, I guess the answer may be "no". But I
would love to double-check, and to see what I need to do to be able to define
PyFlink UDFs that use these ML libraries.


(This question is cross-posted on StackOverflow
https://stackoverflow.com/questions/66631859/can-i-use-pyflink-together-with-pytorch-tensorflow-scikitlearn-xgboost-lightgbm
)


Thanks.


Best,

Yik San