Hi, Jingsong.
Thanks for your feedback.
> Does this need to be a function call? Do you have some example?
I think it would be useful to support function calls as arguments when a user calls a procedure.
The following example is from Iceberg [1]:
CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 'bar'));
It allows the user to pass map data to the procedure with `map('foo', 'bar')`.
Another case I can imagine is rolling back a table to the snapshot from one
week ago.
With function calls, the user could call `rollback(table_name, now() - INTERVAL
'7' DAY)` to achieve that.
Although an argument can be a function call, the parameter that the procedure
eventually receives will always be the evaluated literal.
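To make the last point concrete, here is a minimal sketch of the idea, not the FLIP-311 API. The names `RollbackProcedure`, `CallSketch` and `callRollback` are hypothetical, and a `Supplier` stands in for the planner's expression evaluation. It only shows that the argument expression is reduced to a value before the procedure is invoked.

```java
import java.time.LocalDateTime;
import java.util.function.Supplier;

// Hypothetical procedure; the name and signature are assumptions for illustration.
final class RollbackProcedure {
    // The procedure only ever sees plain values, never an unevaluated expression.
    String[] call(String tableName, LocalDateTime timestamp) {
        return new String[] {"rolled back " + tableName + " to " + timestamp};
    }
}

final class CallSketch {
    // Stand-in for the planner step: evaluate the argument expression to a literal,
    // then invoke the procedure with that literal.
    static String[] callRollback(String tableName, Supplier<LocalDateTime> argumentExpression) {
        LocalDateTime literal = argumentExpression.get(); // evaluated once, before the call
        return new RollbackProcedure().call(tableName, literal);
    }

    public static void main(String[] args) {
        // A fixed "now" keeps the example reproducible.
        LocalDateTime now = LocalDateTime.of(2023, 5, 29, 14, 42, 4);
        // Models `now() - INTERVAL '7' DAY`.
        String[] result = callRollback("db.sample", () -> now.minusDays(7));
        System.out.println(result[0]); // rolled back db.sample to 2023-05-22T14:42:04
    }
}
```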
> Procedure looks like a TableFunction, do you consider using Collector
> something like TableFunction? (Supports large amount of data)
Yes, I had considered it, but returning T[] is for simplicity.
First, regarding how to return the result of calling a procedure, it looks more
intuitive to me to use the return value of the `call` method than to call
something like collector#collect.
Introducing a collector would add unnecessary complexity.
Second, regarding support for large amounts of data, according to my investigation,
I haven't seen a requirement for returning large amounts of data.
Iceberg also returns an array [2]. If you do think we should support large amounts
of data, I think we can change the return type from T[] to Iterable<T>.
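For comparison, a rough sketch of the three return styles discussed above. All interface and class names here (`ArrayProcedure`, `CollectorProcedure`, `IterableProcedure`, `ReturnStyles`) are hypothetical and are not taken from the FLIP or from Flink. A plain `Consumer` stands in for a collector.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

// Hypothetical interfaces for comparison only; none of them is the FLIP-311 API.
interface ArrayProcedure<T> { T[] call(); }
interface CollectorProcedure<T> { void call(Consumer<T> collector); }
interface IterableProcedure<T> { Iterable<T> call(); }

final class ReturnStyles {
    // Array style: the whole result is materialized before it is returned.
    static final ArrayProcedure<String> ARRAY =
            () -> new String[] {"snap-1", "snap-2", "snap-3"};

    // Collector style: rows are pushed one by one; the result is a side effect.
    static final CollectorProcedure<String> COLLECTOR = collector -> {
        for (int i = 1; i <= 3; i++) {
            collector.accept("snap-" + i);
        }
    };

    // Iterable style: rows are produced lazily while the caller iterates,
    // and the result is still the return value of call().
    static final IterableProcedure<String> ITERABLE = () -> () -> new Iterator<String>() {
        private int next = 1;

        @Override
        public boolean hasNext() {
            return next <= 3;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return "snap-" + next++;
        }
    };

    // Helper that consumes an Iterable into a list.
    static List<String> drain(Iterable<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> pushed = new ArrayList<>();
        COLLECTOR.call(pushed::add);
        System.out.println(List.of(ARRAY.call()));
        System.out.println(pushed);
        System.out.println(drain(ITERABLE.call()));
    }
}
```

All three produce the same rows in this sketch. The difference is only in whether the caller gets the result as a return value and whether it has to be fully materialized first.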
[1]: https://iceberg.apache.org/docs/latest/spark-procedures/#migrate
[2]:
https://github.com/apache/iceberg/blob/601c5af9b6abded79dabeba177331310d5487f43/spark/v3.2/spark/src/main/java/org/apache/spark/sql/connector/iceberg/catalog/Procedure.java#L44
Best regards,
Yuxia
----- Original Message -----
From: "Jingsong Li" <[email protected]>
To: "dev" <[email protected]>
Sent: Monday, May 29, 2023 2:42:04 PM
Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
Thanks Yuxia for the proposal.
> CALL [catalog_name.][database_name.]procedure_name ([ expression [,
> expression]* ] )
The expression can be a function call. Does this need to be a function
call? Do you have some example?
> Procedure returns T[]
Procedure looks like a TableFunction, do you consider using Collector
something like TableFunction? (Supports large amount of data)
Best,
Jingsong
On Mon, May 29, 2023 at 2:33 PM yuxia <[email protected]> wrote:
>
> Hi, everyone.
>
> I’d like to start a discussion about FLIP-311: Support Call Stored Procedure
> [1]
>
> Stored procedures provide a convenient way to encapsulate complex logic to
> perform data manipulation or administrative tasks in external storage
> systems. They are widely used in traditional databases and popular compute
> engines like Trino for their convenience. Therefore, we propose adding support
> for calling stored procedures in Flink to enable better integration with external
> storage systems.
>
> With this FLIP, Flink will allow connector developers to develop their own
> built-in stored procedures, and then enable users to call these predefined
> stored procedures.
>
> Looking forward to your feedback.
>
> [1]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
>
> Best regards,
> Yuxia