Hi all,

Thanks for all of your responses.
If there are no more comments, I would like to bring up the VOTE.

Best,
Wei

> On Mar 13, 2020, at 20:50, Xingbo Huang <hxbks...@gmail.com> wrote:
>
> Hi Wei,
> Thanks a lot for drafting the FLIP and kicking off the discussion.
> Big +1 for this feature.
> This feature will make it much easier for PyFlink users to use Python UDFs in SQL scenarios.
>
> Best,
> Xingbo
>
> On Fri, Mar 13, 2020 at 5:10 PM, Hequn Cheng <he...@apache.org> wrote:
>
>> Big +1 on this feature! It would be great to extend the usage of Python UDFs in SQL scenarios.
>> The design doc looks good from my side now. Thank you for the update.
>>
>> Best,
>> Hequn
>>
>> On Tue, Mar 10, 2020 at 3:50 PM Wei Zhong <weizhong0...@gmail.com> wrote:
>>
>>> Hi Timo,
>>>
>>> Thanks for your reply.
>>>
>>> If we aim for option 1, it makes sense to me to include the change in this FLIP, as option 1 does not change any public API. I'll update the FLIP page to illustrate this.
>>>
>>> Best,
>>> Wei
>>>
>>>> On Mar 9, 2020, at 17:58, Timo Walther <twal...@apache.org> wrote:
>>>>
>>>> Hi Wei,
>>>>
>>>> I agree with Dawid that we should defer the instantiation of temporary functions to compile time. In the long term, we would like to integrate FunctionCatalog as a component of CatalogManager and unify the handling of catalog objects as much as possible.
>>>>
>>>> We should aim for your proposed option 1. For fluent definition of functions in Table API, we would still like to offer passing instances like `t.select(call(new ScalarFunction() { ... }))` that would be registered as temporary system functions.
>>>>
>>>> Regards,
>>>> Timo
>>>>
>>>>
>>>> On 09.03.20 09:24, Wei Zhong wrote:
>>>>> Hi Dawid,
>>>>> I think deferring the instantiation of temporary functions to compile time is quite a good idea, but it needs further discussion. As it is orthogonal to this FLIP, we could continue the discussion in a new thread later. What do you think?
>>>>> Best,
>>>>> Wei
>>>>>> On Mar 5, 2020, at 21:11, Wei Zhong <weizhong0...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Dawid,
>>>>>>
>>>>>> Thanks for your suggestion.
>>>>>>
>>>>>> After some investigation, I have two designs in mind for deferring the instantiation of temporary system functions and temporary catalog functions to compile time.
>>>>>>
>>>>>> 1. FunctionCatalog accepts both FunctionDefinitions and uninstantiated temporary functions. The uninstantiated temporary functions will be instantiated when compiling. There is no public API change in this design, but the FunctionCatalog needs to store and process both FunctionDefinitions and uninstantiated temporary functions.
>>>>>>
>>>>>> 2. FunctionCatalog accepts only uninstantiated temporary functions. In this design we need to remove those APIs that accept FunctionDefinitions from TableEnvironment, i.e. `void createTemporaryFunction(String path, UserDefinedFunction functionInstance)` and `void createTemporarySystemFunction(String name, UserDefinedFunction functionInstance)`. But the FunctionCatalog only needs to store and process uninstantiated temporary functions.
>>>>>>
>>>>>> As I don't know the details of the plan to store temporary functions as catalog functions instead of FunctionDefinitions, I'm not sure which solution fits better. It would be great if you could share more details or some thoughts on these two solutions.
>>>>>>
>>>>>> Best,
>>>>>> Wei
>>>>>>
>>>>>>> On Mar 4, 2020, at 16:17, Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>> I had a really quick look and from my perspective the proposal looks fine.
>>>>>>> I share Jark's opinion that the instantiation could be done at a later stage.
>>>>>>> I agree with Wei that it requires some changes in the internal implementation of the FunctionCatalog, to store temporary functions as catalog functions instead of FunctionDefinitions, but we have that on our agenda anyway. I would suggest investigating whether we could do that as part of this FLIP already. Nevertheless, in theory this can also be done later.
>>>>>>>
>>>>>>> Best,
>>>>>>> Dawid
>>>>>>>
>>>>>>> On Mon, 2 Mar 2020, 14:58 Jark Wu, <imj...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks for the explanation, Wei!
>>>>>>>>
>>>>>>>> On Mon, 2 Mar 2020 at 20:59, Wei Zhong <weizhong0...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Jark,
>>>>>>>>>
>>>>>>>>> Thanks for your suggestion.
>>>>>>>>>
>>>>>>>>> Actually, the timing of starting a Python process depends on the UDF type, because the Python process is used to provide the information needed to instantiate the FunctionDefinition object of the Python UDF. For a catalog function, the FunctionDefinition will be instantiated when compiling the job, which means the Python process is required during compilation rather than registration. For temporary system functions and temporary catalog functions, the FunctionDefinition will be instantiated during UDF registration, so the Python process needs to be started at that time.
>>>>>>>>>
>>>>>>>>> But this FLIP will only support registering temporary system functions and temporary catalog functions in SQL DDL, because registering a Python UDF to a catalog is not supported yet. We plan to support the registration of Python catalog functions (via Table API and SQL DDL) in a separate FLIP.
>>>>>>>>> I'll add a non-goal section to the FLIP page to illustrate this.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Wei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Mar 2, 2020, at 15:11, Jark Wu <imj...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Weizhong,
>>>>>>>>>>
>>>>>>>>>> Thanks for proposing this feature. In general, I'm +1 from the table's view.
>>>>>>>>>>
>>>>>>>>>> I have one suggestion: I think registering a Python function into the catalog doesn't need to start up a Python process (the "High Level Sequence Diagram" in your FLIP).
>>>>>>>>>> Because only meta-information is persisted into the catalog, we don't need to store "return type" and "input types" in the catalog.
>>>>>>>>>> I guess the Python process is required when compiling a SQL job.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jark
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, 28 Feb 2020 at 19:04, Benchao Li <libenc...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Big +1 for this feature.
>>>>>>>>>>>
>>>>>>>>>>> We built our SQL platform on the Java Table API, and most common UDFs are implemented in Java. However, some Python developers are not familiar with Java/Scala, and it's very inconvenient for these users to use UDFs in SQL.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Feb 28, 2020 at 6:58 PM, Wei Zhong <weizhong0...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your reply, Dan!
>>>>>>>>>>>>
>>>>>>>>>>>> By the way, this FLIP is closely related to the SQL API. @Jark Wu <imj...@gmail.com> @Timo <twal...@apache.org> could you please take a look?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Wei
>>>>>>>>>>>>
>>>>>>>>>>>>> On Feb 25, 2020, at 16:25, zoudan <zoud...@163.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1 for supporting Python UDF in Java/Scala Table API.
>>>>>>>>>>>>> This is a great feature and would be helpful for Python users!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Dan Zou
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Benchao Li
>>>>>>>>>>> School of Electronics Engineering and Computer Science, Peking University
>>>>>>>>>>> Tel:+86-15650713730
>>>>>>>>>>> Email: libenc...@gmail.com; libenc...@pku.edu.cn