Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-10 Thread Hequn Cheng
Hi Rong, That's great! Looking forward to your feedback. Thanks, Hequn On Tue, Feb 11, 2020 at 1:06 AM Rong Rong wrote: > Yes. I think the argument is fairly valid: we can always adjust the API > in the future, and in fact most of the APIs are labeled @PublicEvolving at this > moment. > I was

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-10 Thread Rong Rong
Yes. I think the argument is fairly valid: we can always adjust the API in the future, and in fact most of the APIs are labeled @PublicEvolving at this moment. I was only trying to let others know, when voting, that the interfaces in flink-ml-api might change in the near future. In fact,

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-09 Thread Hequn Cheng
Hi Rong, Thanks a lot for joining the discussion! It would be great if we could have a long-term plan. My intention is to provide a way for users to add the Flink ML dependencies, either through opt or the download page. This will become more and more critical along with the improvement of the Flink

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-07 Thread Rong Rong
CC @Xu Yang Thanks for starting the discussion @Hequn Cheng and sorry for joining the discussion late. I've mainly helped merge the code in flink-ml-api and flink-ml-lib over the past several months. IMO flink-ml-api is an extension on top of the Table API, and I agree that it should be

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-07 Thread Hequn Cheng
Hi, @Till Rohrmann Thanks for the great input. I agree with you that we should have a long-term plan for this. It definitely deserves another discussion. @Jeff Zhang Thanks for your reports and ideas. It's a good idea to improve the error messages. Do we have any JIRAs for it or maybe we can

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-06 Thread Jeff Zhang
I have another concern which may not be closely related to this thread. Since Flink doesn't include all the necessary jars, I think it is critical for Flink to display a meaningful error message when a class is missing. E.g., here's the error message I get when I use Kafka but forget to include flink-json.

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-06 Thread Till Rohrmann
I would not object given that it is rather small at the moment. However, I also think that we should have a plan for how to handle the ever-growing Flink ecosystem and how to make it easily accessible to our users. E.g. one far-fetched idea could be something like a configuration script which

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-06 Thread Hequn Cheng
Hi everyone, Thank you all for the great input! I think what we probably all agree on is that we should try to make flink-dist leaner. However, we may also need to make some compromises for the sake of user experience, so that users don't need to download the dependencies from different places.

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-04 Thread Chesnay Schepler
Around a year ago I started a discussion on reducing the number of jars we ship with the distribution. While there was no definitive conclusion, there was a shared sentiment

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-04 Thread Till Rohrmann
I think there is no rule that APIs automatically go into opt/ and "libraries" do not. The contents of opt/ have mainly grown over time without following a strict rule. I think the decisive factor for what goes into Flink's binary distribution should be how core it is to Flink. Of course another

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-04 Thread Becket Qin
Thanks for the suggestion, Till. I am curious how we usually decide when to put jars into the opt folder. Technically speaking, it seems that `flink-ml-api` should be put into the opt directory because it is actually an API rather than a library, just like CEP and Table.

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-03 Thread Hequn Cheng
Hi Till, Thanks a lot for your suggestion. It's a good idea to offer the flink-ml libraries as optional dependencies on the download page, which would make the dist smaller. But I also have some concerns about it, e.g., the download page now only includes the latest 3 releases. We may need to find

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-03 Thread Till Rohrmann
An alternative solution would be to offer the flink-ml libraries as optional dependencies on the download page, similar to how we offer the different SQL formats and Hadoop releases [1]. [1] https://flink.apache.org/downloads.html Cheers, Till On Mon, Feb 3, 2020 at 10:19 AM Hequn Cheng wrote:

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-03 Thread Hequn Cheng
Thank you all for your feedback and suggestions! Best, Hequn On Mon, Feb 3, 2020 at 5:07 PM Becket Qin wrote: > Thanks for bringing up the discussion, Hequn. > > +1 on adding `flink-ml-api` and `flink-ml-lib` into opt. This would make > it much easier for users to try out some simple ML

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-03 Thread Becket Qin
Thanks for bringing up the discussion, Hequn. +1 on adding `flink-ml-api` and `flink-ml-lib` into opt. This would make it much easier for users to try out some simple ML tasks. Thanks, Jiangjie (Becket) Qin On Mon, Feb 3, 2020 at 4:34 PM jincheng sun wrote: > Thank you for pushing

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-03 Thread jincheng sun
Thank you for pushing this forward, @Hequn Cheng! Hi @Becket Qin, do you have any concerns about this? Best, Jincheng On Mon, Feb 3, 2020 at 2:09 PM Hequn Cheng wrote: > Hi everyone, > > Thanks for the feedback. As there are no objections, I've opened a JIRA > issue (FLINK-15847 [1]) to address this. > The

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-02-02 Thread Hequn Cheng
Hi everyone, Thanks for the feedback. As there are no objections, I've opened a JIRA issue (FLINK-15847 [1]) to address this. The implementation details can be discussed in the issue or in the subsequent PR. Best, Hequn [1] https://issues.apache.org/jira/browse/FLINK-15847 On Wed, Jan 8,

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-01-08 Thread Hequn Cheng
Hi Jincheng, Thanks a lot for your feedback! Yes, I agree with you. There are cases where multiple jars need to be uploaded. I will prepare another discussion later, maybe with a simple design doc. Best, Hequn On Wed, Jan 8, 2020 at 3:06 PM jincheng sun wrote: > Thanks for bringing up this

Re: [DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-01-07 Thread jincheng sun
Thanks for bringing up this discussion, Hequn! +1 for including `flink-ml-api` and `flink-ml-lib` in opt. BTW: I think it would be great to start a discussion on uploading multiple jars at the same time, as PyFlink jobs could also benefit from that improvement. Best, Jincheng Hequn Cheng

[DISCUSS] Include flink-ml-api and flink-ml-lib in opt

2020-01-07 Thread Hequn Cheng
Hi everyone, FLIP-39 [1] rebuilds the Flink ML pipeline on top of the Table API, which moves Flink ML a step further. Based on it, users can develop their ML jobs, and more and more machine learning platforms are providing ML services. However, the problem now is that the jars of flink-ml-api and flink-ml-lib are