Thanks, xia, for your clarification. I agree with your point and have
no other concerns.

Best,
Aitozi.

On Thu, Apr 13, 2023 at 16:17, yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>
> Hi, Aitozi.
> Thanks for your input. I understand your concern. Although the external
> connector can update the metadata in the method `executeTruncation`,
> the Flink catalog can't be aware of the update in some cases. If the Hive
> catalog only stores Hive tables, everything will be fine.
> But if the Hive catalog also stores non-Hive tables, and a non-Hive table
> can't update the underlying Hive metastore, then
> the Hive catalog will still return the old metadata.
>
> This problem is generic: it is not limited to the TRUNCATE TABLE
> statement, but also applies to other statements, like insert, update/delete, or other
> statements on the way.
> I think it deserves a separate, dedicated discussion about what the Flink
> catalog is for and whether we need to introduce some new mechanism for it.
>
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Aitozi" <gjying1...@gmail.com>
> To: "dev" <dev@flink.apache.org>
> Sent: Thursday, April 13, 2023, 2:37:48 PM
> Subject: Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement
>
> Hi, xia
>    > which I think if Flink supports table cache in framework-level,
> we can also recache in framework-level for truncate table statement.
>
> I think the Flink catalog currently already stores some stats for the table,
> e.g. after `ANALYZE TABLE`, the table's statistics will be stored in
> the catalog, but truncating the table will not correct the statistics.
>
> I know it's hard for Flink to do unified follow-up actions after
> truncating a table. But I think we need to define a clear position for the
> Flink catalog.
> IMO, since Flink is a compute engine, it's hard for it to maintain the
> catalog for tables in different storage systems by itself. So as more and more
> `Executable` commands are introduced, the data in the catalog will become
> inconsistent.
> In this case, after a truncate, the following parts of the catalog may be affected:
>
> - the table/column statistics will no longer be correct
> - the partitions of this table should be cleared
>
>
> Best,
> Aitozi.
>
>
> On Thu, Apr 13, 2023 at 11:28, liu ron <ron9....@gmail.com> wrote:
>
> >
> > Hi, xia
> >
> > Thanks for your explanation. For the first question, given the current
> > status, I think we can provide the generic interface in the future if we
> > need it. For the second question, it makes sense to me if we can
> > support the table cache at the framework level.
> >
> > Best,
> > Ron
> >
> > On Tue, Apr 11, 2023 at 16:12, yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> >
> > > Hi, ron.
> > >
> > > 1: Considering that, for deleting rows, Flink also writes delete records to
> > > achieve the purpose of deleting data, it may not be so strange for connector
> > > devs to make DynamicTableSink implement SupportsTruncate to support
> > > truncating the table. Based on the assumption that DynamicTableSink is used for
> > > inserting/updating/deleting, I think it's reasonable for DynamicTableSink
> > > to implement SupportsTruncate. It also sounds reasonable to add a
> > > generic interface like DynamicTable to differentiate DynamicTableSource &
> > > DynamicTableSink. But that will definitely require much design and
> > > discussion, which deserves a dedicated FLIP. I prefer not to do that in
> > > this FLIP to avoid overdesign, and I think it's not a must for this FLIP.
> > > Maybe we can discuss it some day if we do need the new generic table
> > > interface.
> > >
> > > 2: Considering the various catalogs and tables, it's hard for Flink to do
> > > unified follow-up actions after truncating a table. But the external
> > > connector can still do such follow-up actions in the method `executeTruncation`.
> > > Btw, in Spark, for the new truncate table interface [1], Spark only
> > > recaches the table after truncating it [2]. I think if Flink
> > > supports table cache at the framework level,
> > > we can also recache at the framework level for the truncate table statement.
> > >
> > > [1]
> > > https://github.com/apache/spark/blob/1a42aa5bd44e7524bb55463bbd85bea782715834/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TruncatableTable.java
> > > [2]
> > > https://github.com/apache/spark/blob/06c09a79b371c5ac3e4ebad1118ed94b460f48d1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/TruncateTableExec.scala
> > >
> > >
> > > I think the external catalog can implement such logic in the method
> > > `executeTruncation`.
> > >
> > > Best regards,
> > > Yuxia
> > >
> > > ----- Original Message -----
> > > From: "liu ron" <ron9....@gmail.com>
> > > To: "dev" <dev@flink.apache.org>
> > > Sent: Tuesday, April 11, 2023, 10:51:36 AM
> > > Subject: Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement
> > >
> > > Hi, xia
> > > It's a nice improvement to support the TRUNCATE TABLE statement, making Flink
> > > more feature-rich.
> > > I think the truncate syntax is a command that will be executed in the
> > > client's process, rather than launching a Flink job to execute on the
> > > cluster. So on the user-facing interface, I think we should not let
> > > users implement the SupportsTruncate interface on the DynamicTableSink
> > > interface. This seems a bit strange and also confuses users; as Hang said,
> > > why does a source table not support truncate? It would be nice if we could
> > > come up with a generic interface that supports truncate instead of binding
> > > it to the DynamicTableSink interface, and maybe in the future we will
> > > support more commands like the truncate command.
> > >
> > > In addition, after truncating data, we may also need to update the metadata
> > > of the table. For a Hive table, for example, we need to update the statistics
> > > as well as clear the cache in the metastore. I think we should also consider
> > > these capabilities. Spark has considered these; refer to
> > >
> > > https://github.com/apache/spark/blob/69dd20b5e45c7e3533efbfdc1974f59931c1b781/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L573
> > > .
> > >
> > > Best,
> > >
> > > Ron
> > >
> > > On Tue, Apr 11, 2023 at 02:15, Jim Hughes <jhug...@confluent.io.invalid> wrote:
> > >
> > > > Hi Yuxia,
> > > >
> > > > On Mon, Apr 10, 2023 at 10:35 AM yuxia <luoyu...@alumni.sjtu.edu.cn>
> > > > wrote:
> > > >
> > > > > Hi, Jim.
> > > > >
> > > > > 1: I expect all DynamicTableSinks to support it, but it's hard to
> > > > > support them all in one shot. For the DynamicTableSinks that haven't
> > > > > implemented the SupportsTruncate interface, we'll throw an exception
> > > > > like 'The truncate statement for the table is not supported as it
> > > > > hasn't implemented the interface SupportsTruncate'. Also, a sink that
> > > > > doesn't support deleting data can still implement it but throw a more
> > > > > concrete exception like "xxx doesn't support truncating a table as
> > > > > delete is impossible for xxx". It depends on the external connector's
> > > > > implementation.
> > > > > Thanks for your advice; I have updated the FLIP accordingly.
> > > > >
> > > >
> > > > Makes sense.
> > > >
> > > >
> > > > > 2: What do you mean by "truncate an input to a streaming query"?
> > > > > This FLIP aims to support the TRUNCATE TABLE statement, which is for
> > > > > truncating a table. In which case would it interoperate with streaming
> > > > > queries?
> > > > >
> > > >
> > > > Let's take a source like Kafka as an example.  Suppose I have an input
> > > > topic Foo, and a query which uses it as an input.
> > > >
> > > > When Foo is truncated, if the truncation works as a delete and create,
> > > > then the connector may need to be made aware (otherwise it may try to use
> > > > offsets from the previous topic).  On the other hand, one may have to
> > > > ask Kafka to delete records up to a certain point.
> > > >
> > > > Also, savepoints for the query may contain information from the
> > > > truncated table.  Should this FLIP involve invalidating that information
> > > > in some manner?  Or does truncating a source table for a query cause
> > > > undefined behavior on that query?
> > > >
> > > > Basically, I'm trying to think through the implementations of a truncate
> > > > operation to streaming sources and queries.
> > > >
> > > > Cheers,
> > > >
> > > > Jim
> > > >
> > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "Jim Hughes" <jhug...@confluent.io.INVALID>
> > > > > To: "dev" <dev@flink.apache.org>
> > > > > Sent: Monday, April 10, 2023, 9:32:28 PM
> > > > > Subject: Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement
> > > > >
> > > > > Hi Yuxia,
> > > > >
> > > > > Two questions:
> > > > >
> > > > > 1.  Are you expecting all DynamicTableSinks to support Truncate?  The
> > > > > FLIP could use some explanation for what supporting and not supporting
> > > > > the operation means.
> > > > >
> > > > > 2.  How will truncate interoperate with streaming queries?  That is, if I
> > > > > truncate an input to a streaming query, is there any defined behavior?
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Jim
> > > > >
> > > > > On Wed, Mar 22, 2023 at 9:13 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> > > > >
> > > > > > Hi, devs.
> > > > > >
> > > > > > I'd like to start a discussion about FLIP-302: Support TRUNCATE
> > > > > > TABLE statement [1].
> > > > > >
> > > > > > The TRUNCATE TABLE statement is a SQL command that allows users to
> > > > > > quickly and efficiently delete all rows from a table without dropping
> > > > > > the table itself. This statement is commonly used in data warehouses,
> > > > > > where large data sets are frequently loaded into and unloaded from
> > > > > > tables.
> > > > > > So, this FLIP is meant to support the TRUNCATE TABLE statement. More
> > > > > > exactly, this FLIP will bring Flink the TRUNCATE TABLE syntax and an
> > > > > > interface with which the corresponding connectors can implement their
> > > > > > own logic for truncating a table.
> > > > > >
> > > > > > Looking forward to your feedback.
> > > > > >
> > > > > > [1]:
> > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement
> > > > > >
> > > > > >
> > > > > > Best regards,
> > > > > > Yuxia
> > > > > >
> > > > >
> > > >
> > >
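
For readers following the thread, here is a minimal sketch of the client-side dispatch discussed above: a sink that implements SupportsTruncate gets `executeTruncation` called on it, and any other sink is rejected with the exception message quoted in the thread. Only the names SupportsTruncate and `executeTruncation` come from the discussion. TableSink, InMemorySink, AppendOnlySink and the `truncate` helper are hypothetical stand-ins written in plain Java, not Flink's actual classes, and the real signatures in Flink may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class TruncateSketch {

    /** Hypothetical stand-in for a table sink; NOT Flink's DynamicTableSink. */
    interface TableSink {
        String name();
    }

    /** Stand-in for the ability interface named in the thread; signature assumed. */
    interface SupportsTruncate {
        void executeTruncation();
    }

    /** Hypothetical sink that supports truncation by clearing its rows. */
    static final class InMemorySink implements TableSink, SupportsTruncate {
        final String name;
        final List<String> rows = new ArrayList<>();

        InMemorySink(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public void executeTruncation() {
            // A real connector would also do its own follow-up actions here,
            // e.g. refreshing metadata it owns.
            rows.clear();
        }
    }

    /** Hypothetical sink that does not implement SupportsTruncate. */
    static final class AppendOnlySink implements TableSink {
        final String name;

        AppendOnlySink(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    /** Runs in the client process: no job is submitted, the sink is called directly. */
    static void truncate(TableSink sink) {
        if (!(sink instanceof SupportsTruncate)) {
            throw new UnsupportedOperationException(
                    "The truncate statement for the table is not supported as it "
                            + "hasn't implemented the interface SupportsTruncate: "
                            + sink.name());
        }
        ((SupportsTruncate) sink).executeTruncation();
    }

    public static void main(String[] args) {
        InMemorySink orders = new InMemorySink("orders");
        orders.rows.add("row-1");
        orders.rows.add("row-2");
        truncate(orders);
        System.out.println("rows left in orders: " + orders.rows.size());

        try {
            truncate(new AppendOnlySink("audit_log"));
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that nothing in this sketch touches a catalog: statistics or partitions recorded elsewhere would stay as they were, which is the staleness concern raised by Aitozi and Ron.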
