Hi all,

Thanks a lot for your feedback. If there are no more suggestions or
comments, I think it's time to initiate a vote to create a FLIP for
Apache Flink Python UDFs.
What do you think?

Best, Jincheng

On Thu, Aug 15, 2019 at 12:54 AM jincheng sun <sunjincheng...@gmail.com> wrote:

> Hi Thomas,
>
> Thanks for your confirmation and the very important reminder about bundle
> processing.
>
> I have added a description of how to perform bundle processing from
> the perspective of checkpoints and watermarks. Feel free to leave comments if
> anything is not described clearly.
>
> Best,
> Jincheng
>
>
> On Wed, Aug 14, 2019 at 10:08 AM Dian Fu <dian0511...@gmail.com> wrote:
>
>> Hi Thomas,
>>
>> Thanks a lot for the suggestions.
>>
>> Regarding bundle processing, there is a "Checkpoint" section[1] in the
>> design doc which describes how checkpoints are handled.
>> However, I think you are right that we should say more about it, such as
>> what bundle processing is, how it affects checkpoints and watermarks, how
>> to handle checkpoints and watermarks, etc.
>>
>> [1]
>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
>>
>> Regards,
>> Dian
>>
>> > On Aug 14, 2019, at 1:01 AM, Thomas Weise <t...@apache.org> wrote:
>> >
>> > Hi Jincheng,
>> >
>> > Thanks for putting this together. The proposal is very detailed,
>> > thorough and, for me as a Beam Flink runner contributor, easy to
>> > understand :)
>> >
>> > One thing that you should probably detail more is the bundle
>> > processing. It is critically important for performance that multiple
>> > elements are processed in a bundle. The default bundle size in the
>> > Flink runner is 1s or 1000 elements, whichever comes first. And for
>> > streaming, you can find the logic necessary to align the bundle
>> > processing with watermarks and checkpointing here:
>> >
>> > https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
>> >
>> > Thomas
>> >
>> > On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <sunjincheng...@gmail.com>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> The Python Table API (without Python UDF support) is already supported
>> >> and will be available in the upcoming 1.9 release.
>> >> As Python UDFs are very important for Python users, we'd like to start
>> >> a discussion about Python UDF support in the Python Table API.
>> >> Aljoscha Krettek, Dian Fu and I have discussed this offline and have
>> >> drafted a design doc[1]. It includes the following items:
>> >>
>> >> - The user-defined function interfaces.
>> >> - The user-defined function execution architecture.
>> >>
>> >> As many people mentioned in the previous discussion thread[2], a
>> >> portability framework was introduced in recent Apache Beam releases. It
>> >> provides well-defined, language-neutral data structures and protocols
>> >> for language-neutral user-defined function execution. This design is
>> >> based on Beam's portability framework. We will describe how to use
>> >> Beam's portability framework for user-defined function execution: data
>> >> transmission, state access, checkpoint, metrics, logging, etc.
>> >>
>> >> Considering that the design relies on Beam's portability framework for
>> >> Python user-defined function execution, and that not all contributors in
>> >> the Flink community are familiar with Beam's portability framework, we
>> >> have built a prototype[3] as a proof of concept and to make the design
>> >> easier to understand.
>> >>
>> >> Any feedback is welcome.
>> >>
>> >> Best,
>> >> Jincheng
>> >>
>> >> [1] https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing
>> >> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html
>> >> [3] https://github.com/dianfu/flink/commits/udf_poc
>> >>
>>
>>
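
For readers following the bundle-processing point in the thread above, here is a minimal sketch of the idea in Python. It is not the Beam or Flink code linked by Thomas, and it is not taken from the design doc; the class `BundleBuffer` and all of its method names are hypothetical. It only assumes what the thread states (a bundle closes after 1000 elements or 1s, whichever comes first) plus one plausible way of aligning bundles with watermarks and checkpoints: hold a watermark back while a bundle is open, and finish the open bundle before a checkpoint is taken.

```python
import time


class BundleBuffer:
    """Hypothetical helper that groups elements into bundles for a UDF."""

    def __init__(self, process_bundle, max_size=1000, max_time_s=1.0,
                 clock=time.monotonic):
        self._process_bundle = process_bundle  # callable taking a list of elements
        self._max_size = max_size              # close the bundle after this many elements
        self._max_time_s = max_time_s          # ... or after this many seconds
        self._clock = clock                    # injectable so the timing can be tested
        self._elements = []
        self._started_at = None
        self._held_watermark = None
        self.emitted_watermarks = []           # watermarks forwarded downstream

    def add(self, element):
        if not self._elements:
            # The first element opens a new bundle and starts its timer.
            self._started_at = self._clock()
        self._elements.append(element)
        if len(self._elements) >= self._max_size:
            self.finish_bundle()

    def on_timer(self):
        """Called periodically; closes a bundle that has been open too long."""
        if self._elements and self._clock() - self._started_at >= self._max_time_s:
            self.finish_bundle()

    def on_watermark(self, watermark):
        if self._elements:
            # Results of the open bundle are not emitted yet, so hold the
            # watermark back (keeping the largest one seen so far).
            if self._held_watermark is None or watermark > self._held_watermark:
                self._held_watermark = watermark
        else:
            self.emitted_watermarks.append(watermark)

    def on_checkpoint(self):
        # Finish the open bundle so no buffered element is in flight
        # when the checkpoint is taken.
        self.finish_bundle()

    def finish_bundle(self):
        if self._elements:
            bundle, self._elements = self._elements, []
            self._started_at = None
            self._process_bundle(bundle)
        if self._held_watermark is not None:
            self.emitted_watermarks.append(self._held_watermark)
            self._held_watermark = None
```

In this sketch the caller is responsible for invoking `on_timer()` regularly, for example from a processing-time timer, and for calling `on_checkpoint()` before state is snapshotted. Larger bundles amortize the per-call overhead of crossing into the Python process, at the cost of delaying watermarks and checkpoints by up to one bundle.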
