Hi, xia
It's a nice improvement to support the TRUNCATE TABLE statement, making Flink
more feature-rich.
I think TRUNCATE is a command that will be executed in the client's process,
rather than by launching a Flink job on the cluster. So for the user-facing
interface, I think we should not ask users to implement the SupportsTruncate
interface on DynamicTableSink. This seems a bit strange and would also
confuse users; as Hang said, why wouldn't a source table support truncate?
It would be nice if we could come up with a generic interface that supports
truncate instead of binding it to the DynamicTableSink interface, since in
the future we may support more commands like truncate.
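
To make the idea a bit more concrete, here is a rough sketch of what a
capability decoupled from DynamicTableSink might look like. All names in it
(TruncatableTable, InMemoryTable, TruncateCommandRunner) are hypothetical and
are not part of Flink or of the FLIP; it only illustrates a check done in the
client process and the error raised for tables that cannot be truncated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical capability, deliberately not tied to DynamicTableSink.
interface TruncatableTable {
    void truncate();
}

// Hypothetical in-memory table, used only for illustration.
final class InMemoryTable implements TruncatableTable {
    private final List<String> rows = new ArrayList<>();

    void insert(String row) {
        rows.add(row);
    }

    int rowCount() {
        return rows.size();
    }

    @Override
    public void truncate() {
        rows.clear();
    }
}

// Hypothetical executor: runs the command in the calling process, no job is submitted.
final class TruncateCommandRunner {
    static void execute(String tableName, Object table) {
        if (!(table instanceof TruncatableTable)) {
            throw new UnsupportedOperationException(
                    "TRUNCATE TABLE is not supported for table '" + tableName + "'");
        }
        ((TruncatableTable) table).truncate();
    }
}
```

With something of this shape, whether a table is used as a source or a sink
no longer matters; only the presence of the capability does.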

In addition, after truncating data, we may also need to update the metadata
of the table. For a Hive table, for example, we need to update the
statistics as well as clear the cache in the metastore. I think we should
also consider these capabilities. Spark has considered these; see
https://github.com/apache/spark/blob/69dd20b5e45c7e3533efbfdc1974f59931c1b781/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L573
.
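
As a rough illustration of the ordering I have in mind (delete the data
first, then bring the metadata in line), here is a small sketch. FakeMetastore
and TruncateWithMetadata are made-up names for this email, not Flink, Hive or
Spark APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for catalog/metastore state; not a Flink or Hive API.
final class FakeMetastore {
    final Map<String, Long> rowCountStats = new HashMap<>();
    final Map<String, String> cache = new HashMap<>();
}

final class TruncateWithMetadata {
    // Truncate the data first, then make the metadata describe the now-empty table.
    static void truncate(String table, Runnable deleteAllRows, FakeMetastore metastore) {
        deleteAllRows.run();
        metastore.rowCountStats.put(table, 0L); // statistics now describe an empty table
        metastore.cache.remove(table);          // drop any cached entry for the table
    }
}
```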

Best,

Ron

On Tue, Apr 11, 2023 at 02:15, Jim Hughes <jhug...@confluent.io.invalid> wrote:

> Hi Yuxia,
>
> On Mon, Apr 10, 2023 at 10:35 AM yuxia <luoyu...@alumni.sjtu.edu.cn>
> wrote:
>
> > Hi, Jim.
> >
> > 1: I'm expecting all DynamicTableSinks to support it. But it's hard to
> > support them all in one shot. For the DynamicTableSinks that haven't
> > implemented the SupportsTruncate interface, we'll throw an exception
> > like 'The truncate statement for the table is not supported as it hasn't
> > implemented the interface SupportsTruncate'. Also, a sink that doesn't
> > support deleting data can still implement it but throw a more
> > concrete exception like "xxx doesn't support truncating a table as
> > delete is impossible for xxx". It depends on the external connector's
> > implementation.
> > Thanks for your advice, I've added it to the FLIP.
> >
>
> Makes sense.
>
>
> > 2: What do you mean by saying "truncate an input to a streaming query"?
> > This FLIP aims to support the TRUNCATE TABLE statement, which is for
> > truncating a table. In which case would it interoperate with streaming
> > queries?
> >
>
> Let's take a source like Kafka as an example.  Suppose I have an input
> topic Foo, and a query which uses it as an input.
>
> When Foo is truncated, if the truncation works as a delete and create, then
> the connector may need to be made aware (otherwise it may try to use
> offsets from the previous topic).  On the other hand, one may have to ask
> Kafka to delete records up to a certain point.
>
> Also, savepoints for the query may contain information from the truncated
> table.  Should this FLIP involve invalidating that information in some
> manner?  Or does truncating a source table for a query cause undefined
> behavior on that query?
>
> Basically, I'm trying to think through the implementations of a truncate
> operation to streaming sources and queries.
>
> Cheers,
>
> Jim
>
>
> > Best regards,
> > Yuxia
> >
> > ----- Original Message -----
> > From: "Jim Hughes" <jhug...@confluent.io.INVALID>
> > To: "dev" <dev@flink.apache.org>
> > Sent: Monday, April 10, 2023 9:32:28 PM
> > Subject: Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement
> >
> > Hi Yuxia,
> >
> > Two questions:
> >
> > 1.  Are you expecting all DynamicTableSinks to support Truncate?  The
> > FLIP could use some explanation for what supporting and not supporting
> > the operation means.
> >
> > 2.  How will truncate interoperate with streaming queries?  That is, if I
> > truncate an input to a streaming query, is there any defined behavior?
> >
> > Cheers,
> >
> > Jim
> >
> > On Wed, Mar 22, 2023 at 9:13 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> >
> > > Hi, devs.
> > >
> > > I'd like to start a discussion about FLIP-302: Support TRUNCATE TABLE
> > > statement [1].
> > >
> > > The TRUNCATE TABLE statement is a SQL command that allows users to
> > > quickly and efficiently delete all rows from a table without dropping
> > > the table itself. This statement is commonly used in data warehouses,
> > > where large data sets are frequently loaded into and unloaded from
> > > tables.
> > > So, this FLIP is meant to support the TRUNCATE TABLE statement. More
> > > exactly, this FLIP will bring Flink the TRUNCATE TABLE syntax and an
> > > interface with which the corresponding connectors can implement their
> > > own logic for truncating a table.
> > >
> > > Looking forward to your feedback.
> > >
> > > [1]:
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement
> > >
> > >
> > > Best regards,
> > > Yuxia
> > >
> >
>
