Thanks for your reply. Nice feature! Best regards, Jing
On Wed, Jun 14, 2023 at 3:11 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote: > Yes, you're right. > > Best regards, > Yuxia > > ----- 原始邮件 ----- > 发件人: "Jing Ge" <j...@ververica.com.INVALID> > 收件人: "dev" <dev@flink.apache.org> > 发送时间: 星期三, 2023年 6 月 14日 上午 4:46:58 > 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure > > Hi yuxia, > > Thanks for your proposal and sorry for the late reply. The FLIP is in good > shape. If I am not mistaken, Everything, that a stored procedure could do, > could also be done by a Flink job. The current stored procedure design is > to empower Catalogs to provide users some commonly used logics/functions > centrally and out-of-the-box, i.e. DRY. Is that correct? > > Best regards, > Jing > > On Thu, Jun 8, 2023 at 10:32 AM Jark Wu <imj...@gmail.com> wrote: > > > Thank you for the proposal, yuxia! The FLIP looks good to me. > > > > Best, > > Jark > > > > > 2023年6月8日 11:39,yuxia <luoyu...@alumni.sjtu.edu.cn> 写道: > > > > > > Hi, all. > > > Thanks everyone for the valuable input. If there are are no further > > concerns about this FLIP[1], I would like to start voting next monday > > (6/12). > > > > > > [1] > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure > > > > > > > > > Best regards, > > > Yuxia > > > > > > ----- 原始邮件 ----- > > > 发件人: "Martijn Visser" <martijnvis...@apache.org> > > > 收件人: "dev" <dev@flink.apache.org> > > > 发送时间: 星期二, 2023年 6 月 06日 下午 3:57:56 > > > 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure > > > > > > Hi Yuxia, > > > > > > Thanks for the clarification. I would be +0 overall, because I think > > > without actually allowing creation/customization of stored procedures, > > the > > > value for the majority of Flink users will be minimal. > > > > > > Best regards, > > > > > > Martijn > > > > > > On Tue, Jun 6, 2023 at 3:52 AM yuxia <luoyu...@alumni.sjtu.edu.cn> > > wrote: > > > > > >> Hi, Martijn. > > >> Thanks for you feedback. > > >> 1: In this FLIP we don't intend to allow users to customize their own > > >> stored procedure for we don't want to expose too much to users too > > early as > > >> the FLIP said. > > >> The procedures are supposed to be provided only by Catalog. Catalog > devs > > >> can write their build-in procedures, and return the procedure in > method > > >> Catalog.getProcedure(ObjectPath procedurePath); > > >> So, there won't be SQL syntax to create/save a stored procedure in > this > > >> FLIP. If we find we do need it, we can propse the SQL syntax to > create a > > >> stored procedure in another dedicated FLIP. > > >> > > >> 2: The syntax `Call procedure_name(xx)` proposed in this FLIP is the > > >> default syntax in Calcite for call stored procedures. Actaully, we > don't > > >> need to do any modifcation in flink-sql-parser module for syntax of > > calling > > >> a procedure. MySQL[1], Postgres[2], Oracle[3] also use the syntax to > > call a > > >> stored procedure. > > >> > > >> > > >> [1] https://dev.mysql.com/doc/refman/8.0/en/call.html > > >> [2] https://www.postgresql.org/docs/15/sql-call.html > > >> [3] > https://docs.oracle.com/javadb/10.8.3.0/ref/rrefcallprocedure.html > > >> > > >> Best regards, > > >> Yuxia > > >> > > >> ----- 原始邮件 ----- > > >> 发件人: "Martijn Visser" <martijnvis...@apache.org> > > >> 收件人: "dev" <dev@flink.apache.org> > > >> 发送时间: 星期一, 2023年 6 月 05日 下午 8:35:44 > > >> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure > > >> > > >> Hi Yuxia, > > >> > > >> Thanks for the FLIP. I have a couple of questions: > > >> > > >> 1. The syntax talks about how to CALL or SHOW the available stored > > >> procedures, but not on how to create one. Will there not be a SQL > > syntax to > > >> create/save a stored procedure? > > >> 2. Is there a default syntax in Calcite for stored procedures? What do > > >> other databases do, do they use CALL/SHOW or something like EXEC, USE? > > >> > > >> Best regards, > > >> > > >> Martijn > > >> > > >> On Mon, Jun 5, 2023 at 3:23 AM yuxia <luoyu...@alumni.sjtu.edu.cn> > > wrote: > > >> > > >>> Hi, Jane. > > >>> Thanks for you input. I think we can add the auxiliary command show > > >>> procedures in this FLIP. > > >>> Following the syntax for show functions proposed in FLIP-297. > > >>> The syntax will be > > >>> SHOW PROCEDURES [ ( FROM | IN ) [catalog_name.]database_name ] [ > [NOT] > > >>> (LIKE | ILIKE) <sql_like_pattern> ]. > > >>> I have updated to this FLIP. > > >>> > > >>> The other auxiliary commands maybe not suitable currently or need a > > >>> further/dedicated dicussion. Let's keep this FLIP focus. > > >>> > > >>> [1] > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements > > >>> > > >>> Best regards, > > >>> Yuxia > > >>> > > >>> ----- 原始邮件 ----- > > >>> 发件人: "Jane Chan" <qingyue....@gmail.com> > > >>> 收件人: "dev" <dev@flink.apache.org> > > >>> 发送时间: 星期六, 2023年 6 月 03日 下午 7:04:39 > > >>> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure > > >>> > > >>> Hi Yuxia, > > >>> > > >>> Thanks for bringing this to the discussion. The call procedure is a > > >> widely > > >>> used feature and will be very useful for users. > > >>> > > >>> I just have one question regarding the usage. The FLIP mentioned that > > >>> > > >>> Flink will allow connector developers to develop their own built-in > > >> stored > > >>>> procedures, and then enables users to call these predefiend stored > > >>>> procedures. > > >>>> > > >>> In this FLIP, we don't intend to allow users to customize their own > > >> stored > > >>>> procedure for we don't want to expose too much to users too early. > > >>> > > >>> > > >>> If I understand correctly, we might need to provide some auxiliary > > >> commands > > >>> to inform users what built-in procedures are provided and how to use > > >> them. > > >>> For example, Snowflake provides commands like [1] [2], and MySQL > > provides > > >>> commands like [3] [4]. > > >>> > > >>> [1] SHOW PROCEDURES, > > >>> https://docs.snowflake.com/en/sql-reference/sql/show-procedures > > >>> [2] DESCRIBE PROCEDURE <procedure_name>, > > >>> https://docs.snowflake.com/en/sql-reference/sql/desc-procedure > > >>> [3] SHOW PROCEDURE CODE, > > >>> https://dev.mysql.com/doc/refman/5.7/en/show-procedure-code.html > > >>> [4] SHOW PROCEDURE STATUS, > > >>> https://dev.mysql.com/doc/refman/5.7/en/show-procedure-status.html > > >>> > > >>> Best, > > >>> Jane > > >>> > > >>> On Sat, Jun 3, 2023 at 3:20 PM Benchao Li <libenc...@apache.org> > > wrote: > > >>> > > >>>> Thanks Yuxia for the explanation, it makes sense to me. It would be > > >> great > > >>>> if you also add this to the FLIP doc. > > >>>> > > >>>> yuxia <luoyu...@alumni.sjtu.edu.cn> 于2023年6月1日周四 17:11写道: > > >>>> > > >>>>> Hi, Benchao. > > >>>>> Thanks for your attention. > > >>>>> > > >>>>> Initially, I also want to pass `TableEnvironment` to procedure. But > > >>>>> according my investegation and offline discussion with Jingson, the > > >>> real > > >>>>> important thing for procedure devs is the ability to build Flink > > >>>>> datastream. But we can't get the `StreamExecutionEnvironment` which > > >> is > > >>>> the > > >>>>> entrypoint to build datastream. That's to say we will lost the > > >> ability > > >>> to > > >>>>> build a datastream if we just pass `TableEnvironment`. > > >>>>> > > >>>>> Of course, we can also pass `TableEnvironment` along with > > >>>>> `StreamExecutionEnvironment` to Procedure. But I'm intend to be > > >>> cautious > > >>>>> about exposing too much too early to procedure devs. If someday we > > >> find > > >>>> we > > >>>>> will need `TableEnvironment` to custom a procedure, we can then > add a > > >>>>> method like `getTableEnvironment()` in `ProcedureContext`. > > >>>>> > > >>>>> Best regards, > > >>>>> Yuxia > > >>>>> > > >>>>> ----- 原始邮件 ----- > > >>>>> 发件人: "Benchao Li" <libenc...@apache.org> > > >>>>> 收件人: "dev" <dev@flink.apache.org> > > >>>>> 发送时间: 星期四, 2023年 6 月 01日 下午 12:58:08 > > >>>>> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure > > >>>>> > > >>>>> Thanks Yuxia for opening this discussion, > > >>>>> > > >>>>> The general idea looks good to me, I only have one question about > the > > >>>>> `ProcedureContext#getExecutionEnvironment`. Why are you proposing > to > > >>>> return > > >>>>> a `StreamExecutionEnvironment` instead of `TableEnvironment`, could > > >> you > > >>>>> elaborate a little more on this? > > >>>>> > > >>>>> Jingsong Li <jingsongl...@gmail.com> 于2023年5月30日周二 17:58写道: > > >>>>> > > >>>>>> Thanks for your explanation. > > >>>>>> > > >>>>>> We can support Iterable in future. Current design looks good to > me. > > >>>>>> > > >>>>>> Best, > > >>>>>> Jingsong > > >>>>>> > > >>>>>> On Tue, May 30, 2023 at 4:56 PM yuxia < > luoyu...@alumni.sjtu.edu.cn > > >>> > > >>>>> wrote: > > >>>>>>> > > >>>>>>> Hi, Jingsong. > > >>>>>>> Thanks for your feedback. > > >>>>>>> > > >>>>>>>> Does this need to be a function call? Do you have some example? > > >>>>>>> I think it'll be useful to support function call when user call > > >>>>>> procedure. > > >>>>>>> The following example is from iceberg:[1] > > >>>>>>> CALL catalog_name.system.migrate('spark_catalog.db.sample', > > >>>> map('foo', > > >>>>>> 'bar')); > > >>>>>>> > > >>>>>>> It allows user to use `map('foo', 'bar')` to pass a map data to > > >>>>>> procedure. > > >>>>>>> > > >>>>>>> Another case that I can imagine may be rollback a table to the > > >>>> snapshot > > >>>>>> of one week ago. > > >>>>>>> Then, with function call, user may call `rollback(table_name, > > >>> now() - > > >>>>>> INTERVAL '7' DAY)` to acheive such purpose. > > >>>>>>> > > >>>>>>> Although it can be function call, the eventual parameter got by > > >> the > > >>>>>> procedure will always be the literal evaluated. > > >>>>>>> > > >>>>>>> > > >>>>>>>> Procedure looks like a TableFunction, do you consider using > > >>>> Collector > > >>>>>>> something like TableFunction? (Supports large amount of data) > > >>>>>>> > > >>>>>>> Yes, I had considered it. But returns T[] is for simpility, > > >>>>>>> > > >>>>>>> First, regarding how to return the calling result of a procedure, > > >>> it > > >>>>>> looks more intuitive to me to use the return result of the `call` > > >>>> method > > >>>>>> instead of by calling something like collector#collect. > > >>>>>>> Introduce a collector will increase necessary complexity. > > >>>>>>> > > >>>>>>> Second, regarding supporting large amount of data, acoording my > > >>>>>> investagtion, I haven't seen the requirement that supports > > >> returning > > >>>>> large > > >>>>>> amount of data. > > >>>>>>> Iceberg also return an array.[2] If you do think we should > > >> support > > >>>>> large > > >>>>>> amount of data, I think we can change to return type from T[] to > > >>>>> Iterable<T> > > >>>>>>> > > >>>>>>> [1]: > > >>>> https://iceberg.apache.org/docs/latest/spark-procedures/#migrate > > >>>>>>> [2]: > > >>>>>> > > >>>>> > > >>>> > > >>> > > >> > > > https://github.com/apache/iceberg/blob/601c5af9b6abded79dabeba177331310d5487f43/spark/v3.2/spark/src/main/java/org/apache/spark/sql/connector/iceberg/catalog/Procedure.java#L44 > > >>>>>>> > > >>>>>>> Best regards, > > >>>>>>> Yuxia > > >>>>>>> > > >>>>>>> ----- 原始邮件 ----- > > >>>>>>> 发件人: "Jingsong Li" <jingsongl...@gmail.com> > > >>>>>>> 收件人: "dev" <dev@flink.apache.org> > > >>>>>>> 发送时间: 星期一, 2023年 5 月 29日 下午 2:42:04 > > >>>>>>> 主题: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure > > >>>>>>> > > >>>>>>> Thanks Yuxia for the proposal. > > >>>>>>> > > >>>>>>>> CALL [catalog_name.][database_name.]procedure_name ([ > > >> expression > > >>> [, > > >>>>>> expression]* ] ) > > >>>>>>> > > >>>>>>> The expression can be a function call. Does this need to be a > > >>>> function > > >>>>>>> call? Do you have some example? > > >>>>>>> > > >>>>>>>> Procedure returns T[] > > >>>>>>> > > >>>>>>> Procedure looks like a TableFunction, do you consider using > > >>> Collector > > >>>>>>> something like TableFunction? (Supports large amount of data) > > >>>>>>> > > >>>>>>> Best, > > >>>>>>> Jingsong > > >>>>>>> > > >>>>>>> On Mon, May 29, 2023 at 2:33 PM yuxia < > > >> luoyu...@alumni.sjtu.edu.cn > > >>>> > > >>>>>> wrote: > > >>>>>>>> > > >>>>>>>> Hi, everyone. > > >>>>>>>> > > >>>>>>>> I’d like to start a discussion about FLIP-311: Support Call > > >>> Stored > > >>>>>> Procedure [1] > > >>>>>>>> > > >>>>>>>> Stored procedure provides a convenient way to encapsulate > > >> complex > > >>>>>> logic to perform data manipulation or administrative tasks in > > >>> external > > >>>>>> storage systems. It's widely used in traditional databases and > > >>> popular > > >>>>>> compute engines like Trino for it's convenience. Therefore, we > > >>> propose > > >>>>>> adding support for call stored procedure in Flink to enable better > > >>>>>> integration with external storage systems. > > >>>>>>>> > > >>>>>>>> With this FLIP, Flink will allow connector developers to > > >> develop > > >>>>> their > > >>>>>> own built-in stored procedures, and then enables users to call > > >> these > > >>>>>> predefiend stored procedures. > > >>>>>>>> > > >>>>>>>> Looking forward to your feedbacks. > > >>>>>>>> > > >>>>>>>> [1]: > > >>>>>> > > >>>>> > > >>>> > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure > > >>>>>>>> > > >>>>>>>> Best regards, > > >>>>>>>> Yuxia > > >>>>>> > > >>>>> > > >>>>> > > >>>>> -- > > >>>>> > > >>>>> Best, > > >>>>> Benchao Li > > >>>>> > > >>>> > > >>>> > > >>>> -- > > >>>> > > >>>> Best, > > >>>> Benchao Li > > >>>> > > >>> > > >> > > > > >