Hi Jincheng, Fudian, and Aljoscha, I am assuming the proposed python UDX can also be applied to Flink SQL. Is this correct? If yes, I would suggest to title the FLIP as "Flink Python User-Defined Function" or "Flink Python User-Defined Function for Table".
Regards, Shaoxuan On Wed, Aug 28, 2019 at 12:22 PM jincheng sun <sunjincheng...@gmail.com> wrote: > Thanks for the feedback Bowen! > > Great thanks for create the FLIP and bring up the VOTE Dian! > > Best, Jincheng > > Dian Fu <dian0511...@gmail.com> 于2019年8月28日周三 上午11:32写道: > > > Hi all, > > > > I have started a voting thread [1]. Thanks a lot for your help during > > creating the FLIP @Jincheng. > > > > > > Hi Bowen, > > > > Very appreciated for your comments. I have replied you in the design doc. > > As it seems that the comments doesn't affect the overall design, I'll not > > cancel the vote for now and we can continue the discussion in the design > > doc. > > > > [1] > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-58-Flink-Python-User-Defined-Function-for-Table-API-td32295.html > > < > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-FLIP-58-Flink-Python-User-Defined-Function-for-Table-API-td32295.html > > > > > > > Regards, > > Dian > > > > > 在 2019年8月28日,上午11:05,Bowen Li <bowenl...@gmail.com> 写道: > > > > > > Hi Jincheng and Dian, > > > > > > Sorry for being late to the party. I took a glance at the proposal, > LGTM > > in > > > general, and I left only a couple comments. > > > > > > Thanks, > > > Bowen > > > > > > > > > On Mon, Aug 26, 2019 at 8:05 PM Dian Fu <dian0511...@gmail.com> wrote: > > > > > >> Hi Jincheng, > > >> > > >> Thanks! It works. > > >> > > >> Thanks, > > >> Dian > > >> > > >>> 在 2019年8月27日,上午10:55,jincheng sun <sunjincheng...@gmail.com> 写道: > > >>> > > >>> Hi Dian, can you check if you have edit access? :) > > >>> > > >>> > > >>> Dian Fu <dian0511...@gmail.com> 于2019年8月26日周一 上午10:52写道: > > >>> > > >>>> Hi Jincheng, > > >>>> > > >>>> Appreciated for the kind tips and offering of help. Definitely need > > it! > > >>>> Could you grant me write permission for confluence? My Id: Dian Fu > > >>>> > > >>>> Thanks, > > >>>> Dian > > >>>> > > >>>>> 在 2019年8月26日,上午9:53,jincheng sun <sunjincheng...@gmail.com> 写道: > > >>>>> > > >>>>> Thanks for your feedback Hequn & Dian. > > >>>>> > > >>>>> Dian, I am glad to see that you want help to create the FLIP! > > >>>>> Everyone will have first time, and I am very willing to help you > > >> complete > > >>>>> your first FLIP creation. Here some tips: > > >>>>> > > >>>>> - First I'll give your account write permission for confluence. > > >>>>> - Before create the FLIP, please have look at the FLIP Template > [1], > > >>>> (It's > > >>>>> better to know more about FLIP by reading [2]) > > >>>>> - Create Flink Python UDFs related JIRAs after completing the VOTE > of > > >>>>> FLIP.(I think you also can bring up the VOTE thread, if you want! ) > > >>>>> > > >>>>> Any problems you encounter during this period,feel free to tell me > > that > > >>>> we > > >>>>> can solve them together. :) > > >>>>> > > >>>>> Best, > > >>>>> Jincheng > > >>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template > > >>>>> [2] > > >>>>> > > >>>> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals > > >>>>> > > >>>>> > > >>>>> Hequn Cheng <chenghe...@gmail.com> 于2019年8月23日周五 上午11:54写道: > > >>>>> > > >>>>>> +1 for starting the vote. > > >>>>>> > > >>>>>> Thanks Jincheng a lot for the discussion. > > >>>>>> > > >>>>>> Best, Hequn > > >>>>>> > > >>>>>> On Fri, Aug 23, 2019 at 10:06 AM Dian Fu <dian0511...@gmail.com> > > >> wrote: > > >>>>>> > > >>>>>>> Hi Jincheng, > > >>>>>>> > > >>>>>>> +1 to start the FLIP create and VOTE on this feature. I'm willing > > to > > >>>> help > > >>>>>>> on the FLIP create if you don't mind. As I haven't created a FLIP > > >>>> before, > > >>>>>>> it will be great if you could help on this. :) > > >>>>>>> > > >>>>>>> Regards, > > >>>>>>> Dian > > >>>>>>> > > >>>>>>>> 在 2019年8月22日,下午11:41,jincheng sun <sunjincheng...@gmail.com> > 写道: > > >>>>>>>> > > >>>>>>>> Hi all, > > >>>>>>>> > > >>>>>>>> Thanks a lot for your feedback. If there are no more suggestions > > and > > >>>>>>>> comments, I think it's better to initiate a vote to create a > FLIP > > >> for > > >>>>>>>> Apache Flink Python UDFs. > > >>>>>>>> What do you think? > > >>>>>>>> > > >>>>>>>> Best, Jincheng > > >>>>>>>> > > >>>>>>>> jincheng sun <sunjincheng...@gmail.com> 于2019年8月15日周四 > 上午12:54写道: > > >>>>>>>> > > >>>>>>>>> Hi Thomas, > > >>>>>>>>> > > >>>>>>>>> Thanks for your confirmation and the very important reminder > > about > > >>>>>>> bundle > > >>>>>>>>> processing. > > >>>>>>>>> > > >>>>>>>>> I have had add the description about how to perform bundle > > >> processing > > >>>>>>> from > > >>>>>>>>> the perspective of checkpoint and watermark. Feel free to leave > > >>>>>>> comments if > > >>>>>>>>> there are anything not describe clearly. > > >>>>>>>>> > > >>>>>>>>> Best, > > >>>>>>>>> Jincheng > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> Dian Fu <dian0511...@gmail.com> 于2019年8月14日周三 上午10:08写道: > > >>>>>>>>> > > >>>>>>>>>> Hi Thomas, > > >>>>>>>>>> > > >>>>>>>>>> Thanks a lot the suggestions. > > >>>>>>>>>> > > >>>>>>>>>> Regarding to bundle processing, there is a section > > "Checkpoint"[1] > > >>>> in > > >>>>>>> the > > >>>>>>>>>> design doc which talks about how to handle the checkpoint. > > >>>>>>>>>> However, I think you are right that we should talk more about > > it, > > >>>>>> such > > >>>>>>> as > > >>>>>>>>>> what's bundle processing, how it affects the checkpoint and > > >>>>>> watermark, > > >>>>>>> how > > >>>>>>>>>> to handle the checkpoint and watermark, etc. > > >>>>>>>>>> > > >>>>>>>>>> [1] > > >>>>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >> > > > https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3 > > >>>>>>>>>> < > > >>>>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >> > > > https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3 > > >>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> Regards, > > >>>>>>>>>> Dian > > >>>>>>>>>> > > >>>>>>>>>>> 在 2019年8月14日,上午1:01,Thomas Weise <t...@apache.org> 写道: > > >>>>>>>>>>> > > >>>>>>>>>>> Hi Jincheng, > > >>>>>>>>>>> > > >>>>>>>>>>> Thanks for putting this together. The proposal is very > > detailed, > > >>>>>>>>>> thorough > > >>>>>>>>>>> and for me as a Beam Flink runner contributor easy to > > understand > > >> :) > > >>>>>>>>>>> > > >>>>>>>>>>> One thing that you should probably detail more is the bundle > > >>>>>>>>>> processing. It > > >>>>>>>>>>> is critically important for performance that multiple > elements > > >> are > > >>>>>>>>>>> processed in a bundle. The default bundle size in the Flink > > >> runner > > >>>>>> is > > >>>>>>>>>> 1s or > > >>>>>>>>>>> 1000 elements, whichever comes first. And for streaming, you > > can > > >>>>>> find > > >>>>>>>>>> the > > >>>>>>>>>>> logic necessary to align the bundle processing with > watermarks > > >> and > > >>>>>>>>>>> checkpointing here: > > >>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >> > > > https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java > > >>>>>>>>>>> > > >>>>>>>>>>> Thomas > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> > > >>>>>>>>>>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun < > > >>>>>>> sunjincheng...@gmail.com> > > >>>>>>>>>>> wrote: > > >>>>>>>>>>> > > >>>>>>>>>>>> Hi all, > > >>>>>>>>>>>> > > >>>>>>>>>>>> The Python Table API(without Python UDF support) has already > > >> been > > >>>>>>>>>> supported > > >>>>>>>>>>>> and will be available in the coming release 1.9. > > >>>>>>>>>>>> As Python UDF is very important for Python users, we'd like > to > > >>>>>> start > > >>>>>>>>>> the > > >>>>>>>>>>>> discussion about the Python UDF support in the Python Table > > API. > > >>>>>>>>>>>> Aljoscha Krettek, Dian Fu and I have discussed offline and > > have > > >>>>>>>>>> drafted a > > >>>>>>>>>>>> design doc[1]. It includes the following items: > > >>>>>>>>>>>> > > >>>>>>>>>>>> - The user-defined function interfaces. > > >>>>>>>>>>>> - The user-defined function execution architecture. > > >>>>>>>>>>>> > > >>>>>>>>>>>> As mentioned by many guys in the previous discussion > > thread[2], > > >> a > > >>>>>>>>>>>> portability framework was introduced in Apache Beam in > latest > > >>>>>>>>>> releases. It > > >>>>>>>>>>>> provides well-defined, language-neutral data structures and > > >>>>>> protocols > > >>>>>>>>>> for > > >>>>>>>>>>>> language-neutral user-defined function execution. This > design > > is > > >>>>>>> based > > >>>>>>>>>> on > > >>>>>>>>>>>> Beam's portability framework. We will introduce how to make > > use > > >> of > > >>>>>>>>>> Beam's > > >>>>>>>>>>>> portability framework for user-defined function execution: > > data > > >>>>>>>>>>>> transmission, state access, checkpoint, metrics, logging, > etc. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Considering that the design relies on Beam's portability > > >> framework > > >>>>>>> for > > >>>>>>>>>>>> Python user-defined function execution and not all the > > >>>> contributors > > >>>>>>> in > > >>>>>>>>>>>> Flink community are familiar with Beam's portability > > framework, > > >> we > > >>>>>>> have > > >>>>>>>>>>>> done a prototype[3] for proof of concept and also ease of > > >>>>>>>>>> understanding of > > >>>>>>>>>>>> the design. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Welcome any feedback. > > >>>>>>>>>>>> > > >>>>>>>>>>>> Best, > > >>>>>>>>>>>> Jincheng > > >>>>>>>>>>>> > > >>>>>>>>>>>> [1] > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >> > > > https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing > > >>>>>>>>>>>> [2] > > >>>>>>>>>>>> > > >>>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html > > >>>>>>>>>>>> [3] https://github.com/dianfu/flink/commits/udf_poc > > >>>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >>>> > > >> > > >> > > > > >