Re: [DISCUSS] FLIP-39: Flink ML pipeline and ML libs

Stavros Kontopoulos Mon, 03 Jun 2019 03:08:40 -0700

Hi,

Some portion of the code could be migrated to the new Table API, no?
I say that because the new API design is based on scikit-learn, and
the old one was also inspired by it.


Best,
Stavros
On Wed, May 22, 2019 at 1:24 PM Shaoxuan Wang <[email protected]> wrote:

> Another consensus (from the offline discussion) is that we will delete or
> deprecate flink-libraries/flink-ml. I have started a survey and discussion
> [1] on the dev and user mailing lists to collect feedback. Depending on the
> replies, we will decide whether to delete it in Flink 1.9, or to deprecate
> it in 1.9 and delete it in the next release after 1.9.
>
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Usage-of-flink-ml-and-DISCUSS-Delete-flink-ml-td29057.html
>
> Regards,
> Shaoxuan
>
>
> On Tue, May 21, 2019 at 9:22 PM Gen Luo <[email protected]> wrote:
>
> > Yes, this is our conclusion. I'd like to add just one point: registering
> > a user-defined aggregate function is also needed. This is currently
> > provided by the 'bridge' module and will eventually be merged into the
> > Table API. The same goes for collect().
> >
> > I will add a TableEnvironment argument in Estimator.fit() and
> > Transformer.transform() to get rid of the dependency on
> > flink-table-planner. This will be committed soon.
> >
> > On Tue, May 21, 2019 at 7:31 PM Aljoscha Krettek <[email protected]> wrote:
> >
> > > We discussed this in private and came to the conclusion that we should
> > > (for now) have the dependency on flink-table-api-xxx-bridge because we
> > > need access to the collect() method, which is not yet available in the
> > > Table API. Once that is available the code can be refactored, but for
> > > now we want to unblock work on this new module.
> > >
> > > We also agreed that we don’t need a direct dependency on
> > > flink-table-planner.
> > >
> > > I hope I summarised our discussion correctly.
> > >
> > > > On 17. May 2019, at 12:20, Gen Luo <[email protected]> wrote:
> > > >
> > > > Thanks for your reply.
> > > >
> > > > > For the first question, it's not strictly necessary. But I prefer
> > > > > not to have a TableEnvironment argument in Estimator.fit() or
> > > > > Transformer.transform(): it is not a machine learning concept, and
> > > > > it may make our API less clean than those of other systems. I would
> > > > > like to find a way to do this without introducing
> > > > > flink-table-planner. If that is impossible or strongly opposed, I
> > > > > can concede and add the argument.
> > > >
> > > > > Other than that, the "flink-table-api-xxx-bridge" modules are still
> > > > > needed. A very common case is that an algorithm needs to guarantee
> > > > > that it is running under a BatchTableEnvironment, which makes it
> > > > > possible to collect the result of each iteration. A typical
> > > > > algorithm like this is ALS. As of Flink 1.8, this can only be
> > > > > achieved by converting the Table to a DataSet and then calling
> > > > > DataSet.collect(), which is available in
> > > > > flink-table-api-xxx-bridge. Besides, registering a UDAGG also
> > > > > depends on it.
> > > >
> > > > > In conclusion, "planner" can be removed from the dependencies, but
> > > > > introducing the "bridge" modules is inevitable. Whether and how to
> > > > > acquire a TableEnvironment from a Table can be discussed.
> > >
> > >
> >
>
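
For reference, the fit()/transform() shape agreed on in the quoted messages
(a TableEnvironment passed in explicitly) could look roughly like the sketch
below. This is only an illustration and not the FLIP-39 code: Table and
TableEnvironment here are hypothetical stand-in classes so the snippet
compiles without Flink, and MeanCenterer is a made-up example algorithm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Flink's Table and TableEnvironment so that the
// sketch compiles without Flink on the classpath. NOT the real Flink classes.
final class Table {
    final List<Double> values;
    Table(List<Double> values) { this.values = values; }
}

final class TableEnvironment {
    final String name;
    TableEnvironment(String name) { this.name = name; }
}

// The environment is an explicit argument, so the ML API itself does not
// have to look it up from the Table (assumed shape, based on the thread).
interface Transformer {
    Table transform(TableEnvironment tEnv, Table input);
}

interface Estimator<M extends Transformer> {
    M fit(TableEnvironment tEnv, Table input);
}

// Made-up example: learns the mean of the input and returns a model that
// subtracts it. tEnv is unused here; a real algorithm might use it, for
// example, to register an aggregate function.
final class MeanCenterer implements Estimator<MeanCenterer.Model> {
    static final class Model implements Transformer {
        final double mean;
        Model(double mean) { this.mean = mean; }

        @Override
        public Table transform(TableEnvironment tEnv, Table input) {
            List<Double> out = new ArrayList<>();
            for (double v : input.values) {
                out.add(v - mean);
            }
            return new Table(out);
        }
    }

    @Override
    public Model fit(TableEnvironment tEnv, Table input) {
        double sum = 0.0;
        for (double v : input.values) {
            sum += v;
        }
        double mean = input.values.isEmpty() ? 0.0 : sum / input.values.size();
        return new Model(mean);
    }
}
```

A caller would create the environment once and hand it to both calls, e.g.
new MeanCenterer().fit(tEnv, input).transform(tEnv, input).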
