Re: State of Machine Learning with Flink and especially FLIP-39

2020-11-20 Thread Niklas Wilcke
Hi Arvid and Jiangjie,

Thanks to both of you for the quick and valuable responses.
I will take a look at the linked projects.

Kind Regards,
Niklas








Re: State of Machine Learning with Flink and especially FLIP-39

2020-11-19 Thread Becket Qin
Hi Niklas,

We dropped the Flink ML lib in 1.9 and plan to replace it with a new
machine learning library for traditional machine learning algorithms. That
library will be based on FLIP-39. The plan was pushed back a little
because we plan to deprecate the DataSet API but do not yet have batch
iteration support in the DataStream API. So at this point we don't have an
ML lib implementation in Flink.

That being said, we are working with the community to add some ML-related
features to Flink. At this point, the following two projects are
available from Alibaba and will likely be contributed to Flink. You may
also want to take a look at them.

Alink - A machine learning library.
https://github.com/alibaba/alink

Flink-AI-Extended - A project that helps run TF / PyTorch on top of Flink.
https://github.com/alibaba/flink-ai-extended

Thanks,

Jiangjie (Becket) Qin



Re: State of Machine Learning with Flink and especially FLIP-39

2020-11-19 Thread Arvid Heise
Hi Niklas,

Indeed, some efforts on the machine learning libraries have been pushed back in
favor of getting proper PyTorch and TensorFlow support through PyFlink.

Native implementations in Flink have so far been done in the DataSet API,
which is going to be deprecated in the next few releases in favor of the
unified DataStream API with bounded streams. I expect efforts on native
implementations to be picked up once DataSet is fully replaced, to avoid
doubling the work. One of the most important features still lacking is
proper iteration support in DataStream.






State of Machine Learning with Flink and especially FLIP-39

2020-11-19 Thread Niklas Wilcke
Hi Flink-Community,

I'm digging through the history of FlinkML and FLIP-39 [0]. What I have understood
so far is that FlinkML was removed in 1.9 because it was no longer maintained.
I have not been able to find out whether FLIP-39, and with it a replacement for
FlinkML, is currently being worked on. The umbrella Jira ticket FLINK-12470 [1] looks
stale to me.
Was there maybe a change of strategy in the meantime? Is the focus currently on
PyFlink to provide ML solutions (FLIP-96 [2])?
It would be really interesting to get some insights into the future and
roadmap of ML in the Flink ecosystem. Thank you very much!

Kind Regards,
Niklas

[0] https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
[1] https://issues.apache.org/jira/browse/FLINK-12470
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-96%3A+Support+Python+ML+Pipeline+API
