Thanks for your explanation.

We can support Iterable in the future. The current design looks good to me.
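
For reference, here is a minimal sketch of how the two return styles discussed below might differ. The interface and method names (ArrayProcedure, IterableProcedure, listSnapshotsAsArray, listSnapshotsLazily) are hypothetical and only for illustration; they are not the FLIP-311 API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ProcedureReturnSketch {

    // Hypothetical shape of the array-returning style: the whole result
    // is materialized before it is handed back to the caller.
    interface ArrayProcedure<T> {
        T[] call();
    }

    // Hypothetical alternative: rows are produced one at a time, so the
    // full result never has to sit in memory at once.
    interface IterableProcedure<T> {
        Iterable<T> call();
    }

    // Illustrative procedure that builds all n rows eagerly.
    static ArrayProcedure<String> listSnapshotsAsArray(int n) {
        return () -> {
            String[] rows = new String[n];
            for (int i = 0; i < n; i++) {
                rows[i] = "snapshot-" + i;
            }
            return rows;
        };
    }

    // Illustrative procedure that generates each row only when asked for it.
    static IterableProcedure<String> listSnapshotsLazily(int n) {
        return () -> () -> new Iterator<String>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < n;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return "snapshot-" + index++;
            }
        };
    }

    // Helper that drains an Iterable into a list, as a caller might do.
    static <T> List<T> collect(Iterable<T> rows) {
        List<T> out = new ArrayList<>();
        for (T row : rows) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String row : listSnapshotsAsArray(3).call()) {
            System.out.println(row);
        }
        for (String row : listSnapshotsLazily(3).call()) {
            System.out.println(row);
        }
    }
}
```

Since an array can always be wrapped as an Iterable (for example with Arrays.asList), procedures written against the array style should be straightforward to adapt if the return type changes later.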

Best,
Jingsong

On Tue, May 30, 2023 at 4:56 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>
> Hi, Jingsong.
> Thanks for your feedback.
>
> > Does this need to be a function call? Do you have some example?
> I think it'll be useful to support function calls when a user calls a
> procedure.
> The following example is from Iceberg [1]:
> CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 'bar'));
>
> It allows the user to pass map data to the procedure with `map('foo', 'bar')`.
>
> Another case I can imagine is rolling back a table to its snapshot from one
> week ago.
> With function calls, the user could write `rollback(table_name, now() -
> INTERVAL '7' DAY)` to achieve that.
>
> Although an argument can be a function call, the parameter the procedure
> eventually receives will always be the evaluated literal.
>
>
> > Procedure looks like a TableFunction, do you consider using Collector
> > something like TableFunction? (Supports large amount of data)
>
> Yes, I had considered it, but returning T[] was chosen for simplicity.
>
> First, regarding how to return the result of calling a procedure, it looks
> more intuitive to me to use the return value of the `call` method than to
> call something like collector#collect.
> Introducing a collector would add unnecessary complexity.
>
> Second, regarding support for large amounts of data: according to my
> investigation, I haven't seen a requirement for returning large amounts of
> data.
> Iceberg also returns an array. [2] If you do think we should support large
> amounts of data, I think we can change the return type from T[] to
> Iterable<T>.
> [1]: https://iceberg.apache.org/docs/latest/spark-procedures/#migrate
> [2]: 
> https://github.com/apache/iceberg/blob/601c5af9b6abded79dabeba177331310d5487f43/spark/v3.2/spark/src/main/java/org/apache/spark/sql/connector/iceberg/catalog/Procedure.java#L44
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Jingsong Li" <jingsongl...@gmail.com>
> To: "dev" <dev@flink.apache.org>
> Sent: Monday, May 29, 2023 2:42:04 PM
> Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>
> Thanks Yuxia for the proposal.
>
> > CALL [catalog_name.][database_name.]procedure_name ([ expression [, 
> > expression]* ] )
>
> The expression can be a function call. Does this need to be a function
> call? Do you have some example?
>
> > Procedure returns T[]
>
> Procedure looks like a TableFunction, do you consider using Collector
> something like TableFunction? (Supports large amount of data)
>
> Best,
> Jingsong
>
> On Mon, May 29, 2023 at 2:33 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> >
> > Hi, everyone.
> >
> > I’d like to start a discussion about FLIP-311: Support Call Stored 
> > Procedure [1]
> >
> > Stored procedures provide a convenient way to encapsulate complex logic to 
> > perform data manipulation or administrative tasks in external storage 
> > systems. They are widely used in traditional databases and popular compute 
> > engines like Trino for their convenience. Therefore, we propose adding 
> > support for calling stored procedures in Flink to enable better integration 
> > with external storage systems.
> >
> > With this FLIP, Flink will allow connector developers to develop their own 
> > built-in stored procedures, and then enable users to call these predefined 
> > stored procedures.
> >
> > Looking forward to your feedback.
> >
> > [1]: 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
> >
> > Best regards,
> > Yuxia
