Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Xingbo Huang
Hi everyone, Thanks to all of you for the discussion. If there are no objections, I would like to start a vote thread tomorrow. Best, Xingbo Dian Fu wrote on Thu, Sep 3, 2020 at 5:45 PM: > Thanks for preparing the FLIP, Xingbo! > > LGTM overall and looking forward to the voting! > > Regards, > Dian > > > On

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Dian Fu
Thanks for preparing the FLIP, Xingbo! LGTM overall and looking forward to the voting! Regards, Dian > On Sep 3, 2020, at 5:22 PM, jincheng sun wrote: > > Thank you! Looking forward to the voting :) > > Best, > Jincheng > > > Xingbo Huang wrote on Thu, Sep 3, 2020 at 2:39 PM: > >> Hi Jincheng, >> >> Yes, I agree

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread jincheng sun
Thank you! Looking forward to the voting :) Best, Jincheng Xingbo Huang wrote on Thu, Sep 3, 2020 at 2:39 PM: > Hi Jincheng, > > Yes, I agree that users can extend the class `AggregateFunction` if they > want to define a Pandas UDAF by way of a custom class. I have updated > that part of the FLIP. > >

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-03 Thread Xingbo Huang
Hi Jincheng, Yes, I agree that users can extend the class `AggregateFunction` if they want to define a Pandas UDAF by way of a custom class. I have updated that part of the FLIP. Best, Xingbo jincheng sun wrote on Thu, Sep 3, 2020 at 1:48 PM: > Thanks for the update, Xingbo! > > Pandas UDAF can reuse the

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-09-02 Thread jincheng sun
Thanks for the update, Xingbo! Pandas UDAF can reuse the `class AggregateFunction(UserDefinedFunction)` interface in FLIP-139, with users writing the core logic of a Pandas UDAF in the `accumulate` method. In this way, we can unify the interface semantics of all UDAFs. What do you think?
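
As a rough illustration of the proposal above, the sketch below shows what a class-based Pandas UDAF might look like when its core logic lives in `accumulate`. It does not import PyFlink: the `AggregateFunction` base class here is a hypothetical stand-in, and the method names `create_accumulator` and `get_value`, the list-based accumulator, and the assumption that `accumulate` receives a `pandas.Series` per column are illustrative choices, not the interface the FLIP finally specifies.

```python
from abc import ABC, abstractmethod

import pandas as pd


class AggregateFunction(ABC):
    """Hypothetical stand-in for the UDAF base class discussed in the thread."""

    @abstractmethod
    def create_accumulator(self):
        """Return a fresh, empty accumulator."""

    @abstractmethod
    def accumulate(self, accumulator, *args):
        """Fold one batch of input columns into the accumulator."""

    @abstractmethod
    def get_value(self, accumulator):
        """Extract the final aggregate from the accumulator."""


class PandasMean(AggregateFunction):
    """Mean of one column, computed with pandas inside `accumulate`."""

    def create_accumulator(self):
        return []

    def accumulate(self, accumulator, *args):
        # Assumption: args[0] is a pandas.Series holding the whole input column.
        accumulator.append(args[0].mean())

    def get_value(self, accumulator):
        return accumulator[0]


# Drive the three methods by hand, the way a runtime might call them.
agg = PandasMean()
acc = agg.create_accumulator()
agg.accumulate(acc, pd.Series([1.0, 2.0, 3.0]))
result = agg.get_value(acc)
```

Keeping the same three-method shape as an ordinary UDAF is what would let both kinds of aggregate share one interface; only the argument type seen by `accumulate` differs.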

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-08-31 Thread Xingbo Huang
Hi Jincheng, Thanks a lot for joining the discussion and for suggesting that FLIP-137 and FLIP-139 be discussed together. >> 1. We also need to consider how pandas UDAF supports metrics, and whether we need a custom interface for pandas UDAF? Yes. We need to add an interface so that users can add

Re: [DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-08-30 Thread jincheng sun
Hi Xingbo, Thanks for the discussion! Overall, +1 for this FLIP. I have two points to add: - We also need to consider how pandas UDAF supports metrics, and whether we need a custom interface for pandas UDAF. - We have added @udaf(), so will it also be used for ordinary Python UDAFs? If not, the addition

[DISCUSS] FLIP-137: Support Pandas UDAF in PyFlink

2020-08-24 Thread Xingbo Huang
Hi everyone, I would like to start a discussion thread on "Support Pandas UDAF in PyFlink". Pandas UDF has been supported since Flink 1.11 (FLIP-97[1]). It solves the high serialization/deserialization overhead in Python UDF and makes it convenient to leverage the popular Python libraries such as