Hi, Dev

My rankings are:
1. Derived Table
2. Materialized Table
3. Live Table
4. Materialized View

Best,
Ron

Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 20:07:
> Hi, Dev
>
> After several rounds of discussion, there is currently no consensus on the name of the new concept. Timo has proposed that we decide the name through a vote. This is a good solution when there is no clear preference, so we will adopt this approach.
>
> Regarding the name of the new concept, there are currently five candidates:
> 1. Derived Table -> taken by SQL standard
> 2. Materialized Table -> similar to SQL materialized view but a table
> 3. Live Table -> similar to dynamic tables
> 4. Refresh Table -> states what it does
> 5. Materialized View -> needs to extend the standard to support modifying data
>
> For the above five candidates, everyone can give their rankings based on their preferences. You can choose up to five options or only some of them.
> We will use a scoring rule, where the *first rank gets 5 points, second rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points, and fifth rank gets 1 point*.
> After the voting closes, I will score all the candidates based on everyone's votes, and the candidate with the highest score will be chosen as the name for the new concept.
>
> The voting will last up to 72 hours and is expected to close this Friday. I look forward to everyone voting on the name in this thread. Of course, we also welcome new input regarding the name.
>
> Best,
> Ron
>
> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 19:49:
>
>> Hi, Dev
>>
>> Sorry, my previous statement was not quite accurate. We will hold a vote for the name within this thread.
>>
>> Best,
>> Ron
>>
>> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 19:29:
>>
>>> Hi, Timo
>>>
>>> Thanks for your reply.
>>>
>>> I agree with you that sometimes naming is more difficult.
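The scoring rule described in the vote email above can be sketched in a few lines of Python. The candidate names come from the thread; the two ballots shown are only the rankings already posted by Ron and Timo, used here for illustration:

```python
# Borda-style tally for the name vote: first choice earns 5 points,
# second earns 4, down to fifth choice earning 1 point.
CANDIDATES = [
    "Derived Table", "Materialized Table", "Live Table",
    "Refresh Table", "Materialized View",
]

def tally(ballots):
    """Each ballot is an ordered list (best first) of up to five candidates."""
    scores = {name: 0 for name in CANDIDATES}
    for ballot in ballots:
        for rank, name in enumerate(ballot):
            scores[name] += 5 - rank  # rank 0 (first choice) -> 5 points
    return scores

# Illustrative ballots: Ron's ranking (four options) and Timo's (five options).
ballots = [
    ["Derived Table", "Materialized Table", "Live Table", "Materialized View"],
    ["Refresh Table", "Materialized Table", "Live Table",
     "Materialized View", "Derived Table"],
]
scores = tally(ballots)
winner = max(scores, key=scores.get)
```

With just these two ballots, Materialized Table leads with 8 points; the real outcome of course depends on all ballots cast in the thread.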
>>> When no one has a clear preference, voting on the name is a good solution, so I'll send a separate email for the vote, clarify the voting rules, and then let everyone vote.
>>>
>>> One other point to confirm: in your ranking there is an option for Materialized View; does it stand for the UPDATING Materialized View that you mentioned earlier in the discussion? If we use Materialized View, I think we need to extend it.
>>>
>>> Best,
>>> Ron
>>>
>>> Timo Walther <twal...@apache.org> wrote on Tue, Apr 9, 2024 at 17:20:
>>>
>>>> Hi Ron,
>>>>
>>>> Yes, naming is hard. But it will have a large impact on trainings, presentations, and the mental model of users. Maybe the easiest is to collect a ranking from everyone with some short justification:
>>>>
>>>> My ranking (from good to not so good):
>>>>
>>>> 1. Refresh Table -> states what it does
>>>> 2. Materialized Table -> similar to SQL materialized view but a table
>>>> 3. Live Table -> nice buzzword, but maybe still too close to dynamic tables?
>>>> 4. Materialized View -> a bit broader than standard but still very similar
>>>> 5. Derived Table -> taken by standard
>>>>
>>>> Regards,
>>>> Timo
>>>>
>>>> On 07.04.24 11:34, Ron liu wrote:
>>>> > Hi, Dev
>>>> >
>>>> > This is a summary letter. After several rounds of discussion, there is a strong consensus about the FLIP proposal and the issues it aims to address. The current point of disagreement is the naming of the new concept. I have summarized the candidates as follows:
>>>> >
>>>> > 1. Derived Table (Inspired by Google Looker)
>>>> > - Pros: Google Looker has introduced this concept, which is designed for building Looker's automated modeling, aligning with our purpose for the stream-batch automatic pipeline.
>>>> >
>>>> > - Cons: The SQL standard uses the term "derived table" extensively; vendors adopt it to simply refer to a table within a subclause.
>>>> >
>>>> > 2. Materialized Table: It means materializing the query result into a table, similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake Dynamic Table's predecessor is also called Materialized Table.
>>>> >
>>>> > 3. Updating Table (From Timo)
>>>> >
>>>> > 4. Updating Materialized View (From Timo)
>>>> >
>>>> > 5. Refresh/Live Table (From Martijn)
>>>> >
>>>> > As Martijn said, naming is a headache; looking forward to more valuable input from everyone.
>>>> >
>>>> > [1] https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
>>>> > [2] https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
>>>> > [3] https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
>>>> >
>>>> > Best,
>>>> > Ron
>>>> >
>>>> > Ron liu <ron9....@gmail.com> wrote on Sun, Apr 7, 2024 at 15:55:
>>>> >
>>>> >> Hi, Lorenzo
>>>> >>
>>>> >> Thank you for your insightful input.
>>>> >>
>>>> >>>>> I think the two above twisted the materialized view concept into more than just an optimization for accessing pre-computed aggregates/filters. I think that concept (at least in my mind) now adheres more to the semantics of the words themselves ("materialized" and "view") than to its implementations in DBMSs: just a view on raw data that, hopefully, is constantly updated with fresh results. That's why I understand Timo's et al. objections.
>>>> >>
>>>> >> Your understanding of Materialized Views is correct. However, in our scenario, an important feature is the support for Update & Delete operations, which current Materialized Views cannot fulfill.
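Ron's point about Update & Delete support can be sketched as follows. This is generic SQL for illustration only: the table and column names are made up, and the exact rejection behavior varies per engine.

```sql
-- A classic materialized view: its content is fully determined by the
-- defining query, so engines reject direct DML against it.
CREATE MATERIALIZED VIEW daily_uv AS
SELECT dt, COUNT(DISTINCT user_id) AS uv
FROM clicks
GROUP BY dt;

-- Typically rejected: a materialized view cannot be modified directly.
-- UPDATE daily_uv SET uv = 0 WHERE dt = DATE '2024-04-01';

-- The concept under discussion stays a *table*, so ad-hoc corrections
-- via UPDATE/DELETE remain possible alongside the background refresh.
```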
>>>> >> As we discussed with Timo before, if Materialized Views need to support data modifications, it would require extending the standard with new keywords, such as CREATE xxx (UPDATING) MATERIALIZED VIEW.
>>>> >>
>>>> >>>>> Still, I don't understand why we need another type of special table. Could you dive deep into the reasons why not simply adding the FRESHNESS parameter to standard tables?
>>>> >>
>>>> >> Firstly, I need to emphasize that we cannot achieve the design goal of the FLIP through the CREATE TABLE syntax combined with a FRESHNESS parameter. The proposal of this FLIP is to use Dynamic Table + Continuous Query, and combine it with FRESHNESS to realize stream-batch unification. However, CREATE TABLE is merely a metadata operation and cannot automatically start a background refresh job. To achieve the design goal of the FLIP with standard tables, it would require extending the CTAS [1] syntax to introduce the FRESHNESS keyword. We considered this design initially, but it has the following problems:
>>>> >>
>>>> >> 1. Distinguishing whether a table created through CTAS is a plain standard table or a "special" standard table with an ongoing background refresh job, based only on the FRESHNESS keyword, is very obscure for users.
>>>> >> 2. It intrudes on the semantics of the CTAS syntax. Currently, tables created using CTAS only add table metadata to the Catalog and do not record attributes such as the query. There are also no ongoing background refresh jobs, and the data writing operation happens only once at table creation.
>>>> >> 3.
>>>> >> For the framework, when performing some kind of ALTER TABLE operation, it is unclear how the behavior should differ between a table created with FRESHNESS and one created without it, which will also cause confusion.
>>>> >>
>>>> >> In terms of the design goal of combining Dynamic Table + Continuous Query, the FLIP proposal cannot be realized by only extending the current standard tables, so a new kind of dynamic table needs to be introduced as a first-level concept.
>>>> >>
>>>> >> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#as-select_statement
>>>> >>
>>>> >> Best,
>>>> >> Ron
>>>> >>
>>>> >> <lorenzo.affe...@ververica.com.invalid> wrote on Wed, Apr 3, 2024 at 22:25:
>>>> >>
>>>> >>> Hello everybody!
>>>> >>> Thanks for the FLIP, it looks amazing (and I think the proof is this deep discussion it is provoking :))
>>>> >>>
>>>> >>> I have a couple of comments to add to this:
>>>> >>>
>>>> >>> Even though I get the reason why you rejected MATERIALIZED VIEW, I still like it a lot, and I would like to provide pointers on how the materialized view concept has twisted in recent years:
>>>> >>>
>>>> >>> • Materialize DB (https://materialize.com/)
>>>> >>> • The famous talk by Martin Kleppmann, "Turning the Database Inside Out" (https://www.youtube.com/watch?v=fU9hR3kiOK0)
>>>> >>>
>>>> >>> I think the two above twisted the materialized view concept into more than just an optimization for accessing pre-computed aggregates/filters.
>>>> >>> I think that concept (at least in my mind) now adheres more to the semantics of the words themselves ("materialized" and "view") than to its implementations in DBMSs: just a view on raw data that, hopefully, is constantly updated with fresh results.
>>>> >>> That's why I understand Timo's et al.
objections.
>>>> >>> Still, I understand there is no need to add confusion :)
>>>> >>>
>>>> >>> Still, I don't understand why we need another type of special table. Could you dive deep into the reasons why not simply adding the FRESHNESS parameter to standard tables?
>>>> >>>
>>>> >>> I would see that as a very seamless implementation toward the goal of unifying batch and streaming.
>>>> >>> If we stick to a unified world, I think that Flink should just provide one type of table that is inherently dynamic.
>>>> >>> Now, depending on FRESHNESS objectives / connectors used in WITH, that table can be backed by a stream or batch job as you explained in your FLIP.
>>>> >>>
>>>> >>> Maybe I am totally missing the point :)
>>>> >>>
>>>> >>> Thank you in advance,
>>>> >>> Lorenzo
>>>> >>> On Apr 3, 2024 at 15:25 +0200, Martijn Visser <martijnvis...@apache.org>, wrote:
>>>> >>>> Hi all,
>>>> >>>>
>>>> >>>> Thanks for the proposal. While the FLIP talks extensively about how Snowflake has Dynamic Tables and Databricks has Delta Live Tables, my understanding is that Databricks has CREATE STREAMING TABLE [1], which relates to this proposal.
>>>> >>>>
>>>> >>>> I do have concerns about using CREATE DYNAMIC TABLE, specifically about confusing users who are familiar with Snowflake's approach, where you can't change the content via DML statements, while that is something that would work in this proposal. Naming is hard of course, but I would probably prefer something like CREATE CONTINUOUS TABLE, CREATE REFRESH TABLE or CREATE LIVE TABLE.
>>>> >>>>
>>>> >>>> Best regards,
>>>> >>>>
>>>> >>>> Martijn
>>>> >>>>
>>>> >>>> [1] https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-streaming-table.html
>>>> >>>>
>>>> >>>> On Wed, Apr 3, 2024 at 5:19 AM Ron liu <ron9....@gmail.com> wrote:
>>>> >>>>
>>>> >>>>> Hi, dev
>>>> >>>>>
>>>> >>>>> After offline discussion with Becket Qin, Lincoln Lee and Jark Wu, we have improved some parts of the FLIP.
>>>> >>>>>
>>>> >>>>> 1. Add a Full Refresh Mode section to clarify the semantics of full refresh mode.
>>>> >>>>> 2. Add a Future Improvement section explaining why the query statement does not support references to temporary views, and possible solutions.
>>>> >>>>> 3. The Future Improvement section explains a possible future solution for dynamic tables to support the modification of query statements, to meet the common field-level schema evolution requirements of the lakehouse.
>>>> >>>>> 4. The Refresh section emphasizes that the Refresh command and the background refresh job can be executed in parallel, with no restrictions at the framework level.
>>>> >>>>> 5. Convert RefreshHandler into a plug-in interface to support various workflow schedulers.
>>>> >>>>>
>>>> >>>>> Best,
>>>> >>>>> Ron
>>>> >>>>>
>>>> >>>>> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 2, 2024 at 10:28:
>>>> >>>>>
>>>> >>>>>> Hi, Venkata krishnan
>>>> >>>>>>
>>>> >>>>>> Thank you for your involvement and suggestions, and I hope that the design goals of this FLIP will be helpful to your business.
>>>> >>>>>>
>>>> >>>>>>>>>>> 1.
In the proposed FLIP, given the example for the >>>> >>> dynamic table, do >>>> >>>>>>> the >>>> >>>>>>> data sources always come from a single lake storage such as >>>> >>> Paimon or >>>> >>>>> does >>>> >>>>>>> the same proposal solve for 2 disparate storage systems like >>>> >>> Kafka and >>>> >>>>>>> Iceberg where Kafka events are ETLed to Iceberg similar to >>>> Paimon? >>>> >>>>>>> Basically the lambda architecture that is mentioned in the FLIP >>>> >>> as well. >>>> >>>>>>> I'm wondering if it is possible to switch b/w sources based on >>>> the >>>> >>>>>>> execution mode, for eg: if it is backfill operation, switch to a >>>> >>> data >>>> >>>>> lake >>>> >>>>>>> storage system like Iceberg, otherwise an event streaming system >>>> >>> like >>>> >>>>>>> Kafka. >>>> >>>>>>> >>>> >>>>>>> Dynamic table is a design abstraction at the framework level and >>>> >>> is not >>>> >>>>>>> tied to the physical implementation of the connector. If a >>>> >>> connector >>>> >>>>>>> supports a combination of Kafka and lake storage, this works >>>> fine. >>>> >>>>>>> >>>> >>>>>>>>>>>>> 2. What happens in the context of a bootstrap (batch) + >>>> >>> nearline >>>> >>>>> update >>>> >>>>>>> (streaming) case that are stateful applications? What I mean by >>>> >>> that is, >>>> >>>>>>> will the state from the batch application be transferred to the >>>> >>> nearline >>>> >>>>>>> application after the bootstrap execution is complete? >>>> >>>>>>> >>>> >>>>>>> I think this is another orthogonal thing, something that >>>> FLIP-327 >>>> >>> tries >>>> >>>>> to >>>> >>>>>>> address, not directly related to Dynamic Table. 
>>>> >>>>>>>
>>>> >>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-327%3A+Support+switching+from+batch+to+stream+mode+to+improve+throughput+when+processing+backlog+data
>>>> >>>>>>>
>>>> >>>>>>> Best,
>>>> >>>>>>> Ron
>>>> >>>>>>>
>>>> >>>>>>> Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote on Sat, Mar 30, 2024 at 07:06:
>>>> >>>>>>>
>>>> >>>>>>>>> Ron and Lincoln,
>>>> >>>>>>>>>
>>>> >>>>>>>>> Great proposal and interesting discussion on adding support for dynamic tables within Flink.
>>>> >>>>>>>>>
>>>> >>>>>>>>> At LinkedIn, we are also trying to solve compute/storage convergence for similar problems discussed as part of this FLIP, specifically periodic backfill and bootstrap + nearline update use cases, using a single implementation of the business logic (a single script).
>>>> >>>>>>>>>
>>>> >>>>>>>>> A few clarifying questions:
>>>> >>>>>>>>>
>>>> >>>>>>>>> 1. In the proposed FLIP, given the example for the dynamic table, do the data sources always come from a single lake storage such as Paimon, or does the same proposal solve for 2 disparate storage systems like Kafka and Iceberg, where Kafka events are ETLed to Iceberg similar to Paimon? Basically the lambda architecture that is mentioned in the FLIP as well. I'm wondering if it is possible to switch b/w sources based on the execution mode, for eg: if it is a backfill operation, switch to a data lake storage system like Iceberg, otherwise an event streaming system like Kafka.
>>>> >>>>>>>>> 2. What happens in the context of a bootstrap (batch) + nearline update (streaming) case that are stateful applications?
What I mean >>>> >>> by that is, >>>> >>>>>>>>> will the state from the batch application be transferred to >>>> >>> the nearline >>>> >>>>>>>>> application after the bootstrap execution is complete? >>>> >>>>>>>>> >>>> >>>>>>>>> Regards >>>> >>>>>>>>> Venkata krishnan >>>> >>>>>>>>> >>>> >>>>>>>>> >>>> >>>>>>>>> On Mon, Mar 25, 2024 at 8:03 PM Ron liu <ron9....@gmail.com> >>>> >>> wrote: >>>> >>>>>>>>> >>>> >>>>>>>>>>> Hi, Timo >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> Thanks for your quick response, and your suggestion. >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> Yes, this discussion has turned into confirming whether >>>> >>> it's a special >>>> >>>>>>>>>>> table or a special MV. >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> 1. The key problem with MVs is that they don't support >>>> >>> modification, >>>> >>>>> so >>>> >>>>>>>>> I >>>> >>>>>>>>>>> prefer it to be a special table. Although the periodic >>>> >>> refresh >>>> >>>>> behavior >>>> >>>>>>>>> is >>>> >>>>>>>>>>> more characteristic of an MV, since we are already a >>>> >>> special table, >>>> >>>>>>>>>>> supporting periodic refresh behavior is quite natural, >>>> >>> similar to >>>> >>>>>>>>> Snowflake >>>> >>>>>>>>>>> dynamic tables. >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> 2. Regarding the keyword UPDATING, since the current >>>> >>> Regular Table is >>>> >>>>> a >>>> >>>>>>>>>>> Dynamic Table, which implies support for updating through >>>> >>> Continuous >>>> >>>>>>>>> Query, >>>> >>>>>>>>>>> I think it is redundant to add the keyword UPDATING. In >>>> >>> addition, >>>> >>>>>>>>> UPDATING >>>> >>>>>>>>>>> can not reflect the Continuous Query part, can not express >>>> >>> the purpose >>>> >>>>>>>>> we >>>> >>>>>>>>>>> want to simplify the data pipeline through Dynamic Table + >>>> >>> Continuous >>>> >>>>>>>>>>> Query. >>>> >>>>>>>>>>> >>>> >>>>>>>>>>> 3. 
From the perspective of the SQL standard definition, I can understand your concerns about Derived Table, but is it possible to make a slight adjustment to meet our needs? Additionally, as Lincoln mentioned, the Google Looker platform has introduced Persistent Derived Table, and there are precedents in the industry; could Derived Table be a candidate?
>>>>>>>>>>>
>>>>>>>>>>> Of course, I look forward to your better suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Ron
>>>>>>>>>>>
>>>>>>>>>>> Timo Walther <twal...@apache.org> wrote on Mon, Mar 25, 2024 at 18:49:
>>>>>>>>>>>
>>>>>>>>>>>>> After thinking about this more, this discussion boils down to whether this is a special table or a special materialized view. In both cases, we would need to add a special keyword:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Either
>>>>>>>>>>>>>
>>>>>>>>>>>>> CREATE UPDATING TABLE
>>>>>>>>>>>>>
>>>>>>>>>>>>> or
>>>>>>>>>>>>>
>>>>>>>>>>>>> CREATE UPDATING MATERIALIZED VIEW
>>>>>>>>>>>>>
>>>>>>>>>>>>> I still feel that the periodic refreshing behavior is closer to an MV. If we add a special keyword to MV, the optimizer would know that the data cannot be used for query optimizations.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I will ask more people for their opinion.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 25.03.24 10:45, Timo Walther wrote:
>>>>>>>>>>>>>>> Hi Ron and Lincoln,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks for the quick response and the very insightful discussion.
>>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> we might limit future opportunities to >>>> >>> optimize queries >>>> >>>>>>>>>>>>>>>>> through automatic materialization rewriting by >>>> >>> allowing data >>>> >>>>>>>>>>>>>>>>> modifications, thus losing the potential for >>>> >>> such >>>> >>>>> optimizations. >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> This argument makes a lot of sense to me. Due to >>>> >>> the updates, the >>>> >>>>>>>>>>> system >>>> >>>>>>>>>>>>>>> is not in full control of the persisted data. >>>> >>> However, the system >>>> >>>>> is >>>> >>>>>>>>>>>>>>> still in full control of the job that powers the >>>> >>> refresh. So if >>>> >>>>> the >>>> >>>>>>>>>>>>>>> system manages all updating pipelines, it could >>>> >>> still leverage >>>> >>>>>>>>>>> automatic >>>> >>>>>>>>>>>>>>> materialization rewriting but without leveraging >>>> >>> the data at rest >>>> >>>>>>>>> (only >>>> >>>>>>>>>>>>>>> the data in flight). >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> we are considering another candidate, Derived >>>> >>> Table, the term >>>> >>>>>>>>>>> 'derive' >>>> >>>>>>>>>>>>>>>>> suggests a query, and 'table' retains >>>> >>> modifiability. This >>>> >>>>>>>>> approach >>>> >>>>>>>>>>>>>>>>> would not disrupt our current concept of a >>>> >>> dynamic table >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> I did some research on this term. The SQL standard >>>> >>> uses the term >>>> >>>>>>>>>>>>>>> "derived table" extensively (defined in section >>>> >>> 4.17.3). 
Thus, a lot of vendors adopt this for simply referring to a table within a subclause:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://dev.mysql.com/doc/refman/8.0/en/derived-tables.html
>>>>>>>>>>>>>>> https://infocenter.sybase.com/help/topic/com.sybase.infocenter.dc32300.1600/doc/html/san1390612291252.html
>>>>>>>>>>>>>>> https://www.c-sharpcorner.com/article/derived-tables-vs-common-table-expressions/
>>>>>>>>>>>>>>> https://stackoverflow.com/questions/26529804/what-are-the-derived-tables-in-my-explain-statement
>>>>>>>>>>>>>>> https://www.sqlservercentral.com/articles/sql-derived-tables
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Esp.
the latter example is interesting, SQL Server >>>> >>> allows things >>>> >>>>>>>>> like >>>> >>>>>>>>>>>>>>> this on derived tables: >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> UPDATE T SET Name='Timo' FROM (SELECT * FROM >>>> >>> Product) AS T >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> SELECT * FROM Product; >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> Btw also Snowflake's dynamic table state: >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> Because the content of a dynamic table is >>>> >>> fully determined >>>> >>>>>>>>>>>>>>>>> by the given query, the content cannot be >>>> >>> changed by using DML. >>>> >>>>>>>>>>>>>>>>> You don’t insert, update, or delete the rows >>>> >>> in a dynamic >>>> >>>>> table. >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> So a new term makes a lot of sense. >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> How about using `UPDATING`? >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> CREATE UPDATING TABLE >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> This reflects that modifications can be made and >>>> >>> from an >>>> >>>>>>>>>>>>>>> English-language perspective you can PAUSE or >>>> >>> RESUME the UPDATING. >>>> >>>>>>>>>>>>>>> Thus, a user can define UPDATING interval and mode? >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> Looking forward to your thoughts. >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> Regards, >>>> >>>>>>>>>>>>>>> Timo >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>> On 25.03.24 07:09, Ron liu wrote: >>>> >>>>>>>>>>>>>>>>> Hi, Ahmed >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> Thanks for your feedback. >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> Regarding your question: >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>> I want to iterate on Timo's comments >>>> >>> regarding the confusion >>>> >>>>>>>>> between >>>> >>>>>>>>>>>>>>>>> "Dynamic Table" and current Flink "Table". 
Should the refactoring of the system happen in 2.0? Should we rename it in this FLIP (as per the suggestions in the thread) and address the holistic changes in a separate FLIP for 2.0?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Lincoln proposed a new concept in reply to Timo: Derived Table, which is a combination of Dynamic Table + Continuous Query, and the use of Derived Table will not conflict with existing concepts. What do you think?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I feel confused about how it fits with other components; the examples provided feel like a standalone ETL job. Could you provide in the FLIP an example where the table is further used in subsequent queries (especially in batch mode)?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your suggestion. I added how to use Dynamic Table in the FLIP user story section; a Dynamic Table can be referenced by downstream Dynamic Tables and can also support OLAP queries.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Ron
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ron liu <ron9....@gmail.com> wrote on Sat, Mar 23, 2024 at 10:35:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi, Feng
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for your feedback.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Although currently we restrict users from modifying the query, I wonder if we can provide a better way to help users rebuild it without affecting downstream OLAP queries.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Considering the problem of data consistency, in the first step we are strictly limited in semantics and do not support modifying the query. This is really a good problem; one of my ideas is to introduce a syntax similar to SWAP [1], which supports exchanging two Dynamic Tables.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> From the documentation, the definition SQL and job information are stored in the Catalog. Does this mean that if a system needs to adapt to Dynamic Tables, it also needs to store Flink's job information in the corresponding system? For example, does MySQL's Catalog need to store Flink job information as well?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Yes, currently we need to rely on the Catalog to store refresh job information.
>>>> >>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> Users still need to consider how much >>>> >>> memory is being used, how >>>> >>>>>>>>>>> large >>>> >>>>>>>>>>>>>>>>>>> the concurrency is, which type of state >>>> >>> backend is being used, >>>> >>>>> and >>>> >>>>>>>>>>>>>>>>>>> may need >>>> >>>>>>>>>>>>>>>>>>> to set TTL expiration. >>>> >>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>> Similar to the current practice, job >>>> >>> parameters can be set via >>>> >>>>> the >>>> >>>>>>>>>>>>> Flink >>>> >>>>>>>>>>>>>>>>>>> conf or SET commands >>>> >>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> When we submit a refresh command, can we >>>> >>> help users detect if >>>> >>>>>>>>> there >>>> >>>>>>>>>>>>> are >>>> >>>>>>>>>>>>>>>>>>> any >>>> >>>>>>>>>>>>>>>>>>> running jobs and automatically stop them >>>> >>> before executing the >>>> >>>>>>>>> refresh >>>> >>>>>>>>>>>>>>>>>>> command? Then wait for it to complete before >>>> >>> restarting the >>>> >>>>>>>>>>> background >>>> >>>>>>>>>>>>>>>>>>> streaming job? >>>> >>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>> Purely from a technical implementation point >>>> >>> of view, your >>>> >>>>>>>>> proposal >>>> >>>>>>>>>>> is >>>> >>>>>>>>>>>>>>>>>>> doable, but it would be more costly. Also I >>>> >>> think data >>>> >>>>> consistency >>>> >>>>>>>>>>>>>>>>>>> itself >>>> >>>>>>>>>>>>>>>>>>> is the responsibility of the user, similar >>>> >>> to how Regular Table >>>> >>>>> is >>>> >>>>>>>>>>>>>>>>>>> now also >>>> >>>>>>>>>>>>>>>>>>> the responsibility of the user, so it's >>>> >>> consistent with its >>>> >>>>>>>>> behavior >>>> >>>>>>>>>>>>>>>>>>> and no >>>> >>>>>>>>>>>>>>>>>>> additional guarantees are made at the engine >>>> >>> level. 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Ron
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Ahmed Hamdy <hamdy10...@gmail.com> wrote on Fri, Mar 22, 2024 at 23:50:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Ron,
>>>>>>>>>>>>>>>>>>>>> Sorry for joining the discussion late; thanks for the effort.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think the base idea is great; however, I have a couple of comments:
>>>>>>>>>>>>>>>>>>>>> - I want to iterate on Timo's comments regarding the confusion between "Dynamic Table" and the current Flink "Table". Should the refactoring of the system happen in 2.0? Should we rename it in this FLIP (as per the suggestions in the thread) and address the holistic changes in a separate FLIP for 2.0?
>>>>>>>>>>>>>>>>>>>>> - I feel confused about how it fits with other components; the examples provided feel like a standalone ETL job. Could you provide in the FLIP an example where the table is further used in subsequent queries (especially in batch mode)?
- I really like the standard of keeping the unified batch and streaming
approach.

Best Regards
Ahmed Hamdy

On Fri, 22 Mar 2024 at 12:07, Lincoln Lee <lincoln.8...@gmail.com> wrote:

Hi Timo,

Thanks for your thoughtful inputs!

Yes, expanding the MATERIALIZED VIEW (MV) could achieve the same
function, but our primary concern is that by using a view, we might limit
future opportunities to optimize queries through automatic
materialization rewriting [1], leveraging the support for MVs in physical
storage. This is because we would be breaking the intuitive semantics of
a materialized view (a materialized view represents the result of a
query) by allowing data modifications, thus losing the potential for such
optimizations.

With these considerations in mind, we were inspired by Google Looker's
Persistent Derived Table [2].
A PDT is designed for building Looker's automated modeling, aligning with
our purpose for the stream-batch automatic pipeline. Therefore, we are
considering another candidate, Derived Table: the term 'derive' suggests
a query, and 'table' retains modifiability. This approach would not
disrupt our current concept of a dynamic table, preserving the future
utility of MVs.

Conceptually, a Derived Table is a Dynamic Table + Continuous Query.
Introducing the new concept Derived Table in this FLIP makes all the
concepts play together nicely.

What do you think about this?
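A minimal sketch of the "Dynamic Table + Continuous Query" idea, using
the syntax shape discussed in this thread. The exact keywords and the
FRESHNESS clause are proposal-stage and may change; the table and column
names are made up:

```sql
-- One statement defines both the table (metadata, storage) and the
-- continuous query that keeps it fresh -- roughly an updatable
-- materialized view.
CREATE DYNAMIC TABLE dwd_orders
FRESHNESS = INTERVAL '3' MINUTE
AS SELECT o.order_id, o.user_id, p.amount
   FROM ods_orders AS o
   JOIN ods_payments AS p ON o.order_id = p.order_id;
```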
[1] https://calcite.apache.org/docs/materialized_views.html
[2] https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables

Best,
Lincoln Lee

Timo Walther <twal...@apache.org> wrote on Fri, Mar 22, 2024 at 17:54:

Hi Ron,

Thanks for the detailed answer. Sorry for my late reply; we had a
conference that kept me busy.

> In the current concept [1], it actually includes: Dynamic Tables &
> Continuous Query. Dynamic Table is just an abstract logical concept.

This explanation makes sense to me. But the docs also say "A continuous
query is evaluated on the dynamic table yielding a new dynamic table."
So even our regular CREATE TABLEs are considered dynamic tables. This can
also be seen in the diagram "Dynamic Table -> Continuous Query -> Dynamic
Table". Currently, Flink queries can only be executed on dynamic tables.

> In essence, a materialized view represents the result of a query.

Isn't that what your proposal does as well?

> the object of the suspend operation is the refresh task of the dynamic
> table

I understand that Snowflake uses the term [1] to merge their concepts of
STREAM, TASK, and TABLE into one concept. But Flink has no concept of a
"refresh task". Also, they had already introduced MATERIALIZED VIEW.
Flink is in the convenient position that the concept of materialized
views is not taken (reserved maybe for exactly this use case?), and the
SQL standard concept could be "slightly adapted" to our needs.
Looking at other vendors like Postgres [2], they also use `REFRESH`
commands, so why not add additional commands such as DELETE or UPDATE?
Oracle supports an "ON PREBUILT TABLE clause [that] tells the database to
use an existing table segment" [3], which comes closer to what we want as
well.

> it is not intended to support data modification

This is an argument that I understand. But we as Flink could allow data
modifications. This way we are only extending the standard and don't
introduce new concepts.

If we can't agree on using the MATERIALIZED VIEW concept, we should fix
our syntax in a Flink 2.0 effort, making regular tables bounded and
dynamic tables unbounded. We would be closer to the SQL standard with
this and pave the way for the future. I would actually support this if
all concepts play together nicely.
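Timo's direction can be sketched as follows. The `REFRESH MATERIALIZED
VIEW` statement is real PostgreSQL syntax; the DML statement against the
view is the extension being proposed for Flink and is hypothetical, as
are the table and column names:

```sql
-- Standard-style materialized view with an explicit refresh (PostgreSQL
-- supports exactly this REFRESH statement):
CREATE MATERIALIZED VIEW daily_sales AS
  SELECT ds, SUM(amount) AS total FROM orders GROUP BY ds;

REFRESH MATERIALIZED VIEW daily_sales;

-- The proposed extension: allow direct data modification on the
-- materialized result (e.g. GDPR deletes) -- not standard SQL today.
DELETE FROM daily_sales WHERE ds < '2024-01-01';
```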
> In the future, we can consider extending the statement set syntax to
> support the creation of multiple dynamic tables.

It's good that we called the concept STATEMENT SET. This allows us to
define CREATE TABLE within it, even if it might look a bit confusing.

Regards,
Timo

[1] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
[2] https://www.postgresql.org/docs/current/sql-creatematerializedview.html
[3] https://oracle-base.com/articles/misc/materialized-views

On 21.03.24 04:14, Feng Jin wrote:

Hi Ron and Lincoln,

Thanks for driving this discussion.
I believe it will greatly improve the convenience of managing users'
real-time pipelines.

I have some questions.

*Regarding limitations of the Dynamic Table:*

> Does not support modifying the select statement after the dynamic table
> is created.

Although we currently restrict users from modifying the query, I wonder
if we can provide a better way to help users rebuild it without affecting
downstream OLAP queries.

*Regarding the management of background jobs:*

1. From the documentation, the SQL definitions and job information are
stored in the catalog. Does this mean that if a system needs to adapt to
Dynamic Tables, it also needs to store Flink's job information in the
corresponding system?
For example, does MySQL's catalog need to store Flink job information as
well?

2. Users still need to consider how much memory is being used, how large
the concurrency is, which type of state backend is being used, and may
need to set TTL expiration.

*Regarding the refresh part:*

> If the refresh mode is continuous and a background job is running,
> caution should be taken with the refresh command as it can lead to
> inconsistent data.

When we submit a refresh command, can we help users detect if there are
any running jobs and automatically stop them before executing the refresh
command, then wait for it to complete before restarting the background
streaming job?
Best,
Feng

On Tue, Mar 19, 2024 at 9:40 PM Lincoln Lee <lincoln.8...@gmail.com>
wrote:

Hi Yun,

Thank you very much for your valuable input!

Incremental mode is indeed an attractive idea; we have also discussed
this. In the current design, however, we first provide two refresh modes:
CONTINUOUS and FULL. An incremental mode can be introduced once the
execution layer has the capability.

My answers to the two questions:

1. Yes, cascading is a good question. The current proposal provides a
freshness that defines a dynamic table's lag relative to its base tables.
If users need to consider the end-to-end freshness of multiple cascaded
dynamic tables, they can manually split it up for now.
Of course, how to let multiple cascaded or dependent dynamic tables
complete the freshness definition in a simpler way is something that can
be extended in the future.

2. Cascading refresh is also a part we focused on discussing. In this
FLIP, we hope to concentrate as much as possible on the core features (it
already involves a lot), so we did not directly introduce related syntax.
However, based on the current design combined with the catalog and
lineage, users can theoretically already accomplish a cascading refresh.

Best,
Lincoln Lee

Yun Tang <myas...@live.com> wrote on Tue, Mar 19, 2024 at 13:45:

Hi Lincoln,

Thanks for driving this discussion; I am excited to see this topic being
discussed in the Flink community!
From my point of view, beyond the work of unifying streaming and batch in
the DataStream API [1], this FLIP could let users benefit from one engine
to rule batch & streaming.

If we treat this FLIP as an open-source implementation of Snowflake's
dynamic tables [2], we still lack an incremental refresh mode to make the
ETL near real-time at a much cheaper computation cost. However, I think
this could be done under the current design by introducing another
refresh mode in the future, although the extra work of incremental view
maintenance would be much larger.

For the FLIP itself, I have several questions:

1. It seems this FLIP does not consider the lag of refreshes across ETL
layers from ODS ---> DWD ---> APP [3].
We currently only consider the scheduler interval, which means we cannot
use the lag to automatically schedule the upstream micro-batch jobs to do
the work.

2. To support automagical refreshes, we should consider the lineage in
the catalog or somewhere else.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-134%3A+Batch+execution+for+the+DataStream+API
[2] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
[3] https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh

Best
Yun Tang
________________________________
From: Lincoln Lee <lincoln.8...@gmail.com>
Sent: Thursday, March 14, 2024 14:35
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for
Simplifying Data Pipelines

Hi Jing,

Thanks for your attention to this FLIP! I'll try to answer the following
questions.

> 1. How to define the query of a dynamic table? Use Flink SQL or
> introduce new syntax? If Flink SQL is used, how to handle the
> difference in SQL between streaming and batch processing? For example,
> a query including a window aggregate based on processing time, or a
> query including a global order by?
Similar to `CREATE TABLE AS query`, here the `query` also uses Flink SQL
and doesn't introduce a totally new syntax. We will not change the status
quo with respect to the differences in Flink SQL's own functionality
between streaming and batch. For example, the processing-time window
aggregation on streaming and the global sort on batch that you mentioned
do not, in fact, work properly in the other mode, so when the user
switches a dynamic table to a refresh mode that is not supported, we will
throw an exception.

> 2. Is modifying the query of a dynamic table allowed? Or can we only
> refresh a dynamic table based on the initial query?
Yes, in the current design the query definition of the dynamic table is
not allowed to be modified, and you can only refresh the data based on
the initial definition.

> 3. How to use a dynamic table? The dynamic table seems similar to a
> materialized view. Will we do something like materialized-view
> rewriting during optimization?

It's true that a dynamic table and a materialized view are similar in
some ways, but as Ron explained, there are differences. In terms of
optimization, automated materialization discovery similar to what Calcite
supports is also a potential possibility, perhaps with the addition of
automated rewriting in the future.
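Lincoln's answer to question 1 ("the `query` also uses Flink SQL, like
`CREATE TABLE AS query`") can be shown side by side. `CREATE TABLE ... AS`
is existing Flink SQL; the DYNAMIC TABLE variant reflects the
proposal-stage syntax and may change, and the table names are made up:

```sql
-- Existing Flink SQL: one-shot table creation from a query.
CREATE TABLE sink_orders AS SELECT * FROM source_orders;

-- Proposed: the same query shape, but the engine keeps refreshing the
-- result according to the chosen refresh mode (CONTINUOUS or FULL).
CREATE DYNAMIC TABLE dt_orders
FRESHNESS = INTERVAL '1' HOUR
AS SELECT * FROM source_orders;
```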
Best,
Lincoln Lee

Ron liu <ron9....@gmail.com> wrote on Thu, Mar 14, 2024 at 14:01:

Hi Timo,

Sorry for the late response, and thanks for your feedback. Regarding your
questions:

> Flink has introduced the concept of Dynamic Tables many years ago. How
> does the term "Dynamic Table" fit into Flink's regular tables, and also
> how does it relate to the Table API?

> I fear that adding the DYNAMIC TABLE keyword could cause confusion for
> users, because a term for the regular CREATE TABLE (which can be "kind
> of dynamic" as well and is backed by a changelog) is then missing.
> Also given that we call our connectors for those tables
> DynamicTableSource and DynamicTableSink.

> In general, I find it contradicting that a TABLE can be "paused" or
> "resumed". From an English language perspective, this does sound
> incorrect. In my opinion (without much research yet), a continuous
> updating trigger should rather be modelled as a CREATE MATERIALIZED
> VIEW (which users are familiar with?) or a new concept such as a CREATE
> TASK (that can be paused and resumed?).

1. In the current concept [1], it actually includes: Dynamic Tables &
Continuous Query. Dynamic Table is just an abstract logical concept,
which in its physical form represents either a table or a changelog
stream.
It requires the combination with a Continuous Query to achieve dynamic
updates of the target table, similar to a database's materialized view.
We hope to upgrade the Dynamic Table to a real entity that users can
operate, one that combines the logical concepts of Dynamic Tables +
Continuous Query. By integrating the definition of tables and queries, it
can achieve functions similar to materialized views, simplifying users'
data processing pipelines.

So, the object of the suspend operation is the refresh task of the
dynamic table. The command `ALTER DYNAMIC TABLE table_name SUSPEND` is
actually shorthand for `ALTER DYNAMIC TABLE table_name SUSPEND REFRESH`
(if writing it in full is clearer, we can also change it).

2. Initially, we also considered materialized views, but ultimately
decided against them.
Materialized views >>>> >>>>>>>>> are >>>> >>>>>>>>>>>>>>>>>>>>>>> designed >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to enhance query >>>> >>> performance for workloads that consist >>>> >>>>> of >>>> >>>>>>>>>>>>>>>>>>>>> common, >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> repetitive query >>>> >>> patterns. In essence, a materialized >>>> >>>>> view >>>> >>>>>>>>>>>>>>>>>>>>>>> represents >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the result of a query. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, it is not >>>> >>> intended to support data modification. >>>> >>>>>>>>> For >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Lakehouse scenarios, >>>> >>> where the ability to delete or >>>> >>>>> update >>>> >>>>>>>>>>> data >>>> >>>>>>>>>>>>>>>>>>>>> is >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> crucial (such as >>>> >>> compliance with GDPR, FLIP-2), >>>> >>>>>>>>> materialized >>>> >>>>>>>>>>>>>>>>>>>>> views >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fall short. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Compared to CREATE >>>> >>> (regular) TABLE, CREATE DYNAMIC TABLE >>>> >>>>>>>>> not >>>> >>>>>>>>>>>>> only >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> defines metadata in the >>>> >>> catalog but also automatically >>>> >>>>>>>>>>> initiates >>>> >>>>>>>>>>>>>>>>>>>>> a >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> data refresh task based >>>> >>> on the query specified during >>>> >>>>> table >>>> >>>>>>>>>>>>>>>>>>>>>>> creation. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It dynamically executes >>>> >>> data updates. Users can focus on >>>> >>>>>>>>> data >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dependencies and data >>>> >>> generation logic. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The new dynamic table >>>> >>> does not conflict with the existing >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> DynamicTableSource and >>>> >>> DynamicTableSink interfaces. 
For >>>> >>>>> the >>>> >>>>>>>>>>>>>>>>>>>>>>> developer, >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> all that needs to be >>>> >>> implemented is the new >>>> >>>>>>>>>>> CatalogDynamicTable, >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> without changing the >>>> >>> implementation of source and sink. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 5. For now, the FLIP >>>> >>> does not consider supporting Table >>>> >>>>> API >>>> >>>>>>>>>>>>>>>>>>>>>>> operations >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> on >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dynamic Table >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> . However, once the SQL >>>> >>> syntax is finalized, we can >>>> >>>>> discuss >>>> >>>>>>>>>>> this >>>> >>>>>>>>>>>>>>>>>>>>> in >>>> >>>>>>>>>>>>>>>>>>>>>>> a >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> separate FLIP. >>>> >>> Currently, I have a rough idea: the Table >>>> >>>>>>>>> API >>>> >>>>>>>>>>>>>>>>>>>>> should >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> also introduce >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> DynamicTable operation >>>> >>> interfaces >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> corresponding to the >>>> >>> existing Table interfaces. >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The TableEnvironment >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will provide relevant >>>> >>> methods to support various >>>> >>>>> dynamic >>>> >>>>>>>>>>>>> table >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> operations. The goal >>>> >>> for the new Dynamic Table is to >>>> >>>>> offer >>>> >>>>>>>>>>> users >>>> >>>>>>>>>>>>>>>>>>>>> an >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> experience similar to >>>> >>> using a database, which is why we >>>> >>>>>>>>>>>>>>>>>>>>> prioritize >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SQL-based approaches >>>> >>> initially. 
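Putting points 1–3 together, the lifecycle described above can be sketched roughly as follows. This is illustrative only: the DYNAMIC TABLE keyword is exactly what this thread is voting to rename, the shape of the FRESHNESS clause may differ in the final FLIP, and all table and column names are made up.

```sql
-- Hypothetical sketch of the syntax under discussion; names are invented.
-- Creating the dynamic table registers catalog metadata AND starts a
-- background refresh task driven by the defining query (point 3 above).
CREATE DYNAMIC TABLE dwd_orders
FRESHNESS = INTERVAL '3' MINUTE
AS SELECT o.order_id, o.user_id, p.amount
   FROM orders AS o
   JOIN payments AS p ON o.order_id = p.order_id;

-- Point 1 above: SUSPEND targets the refresh task, not the table itself.
ALTER DYNAMIC TABLE dwd_orders SUSPEND;          -- shorthand
ALTER DYNAMIC TABLE dwd_orders SUSPEND REFRESH;  -- same thing, spelled out
```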
> How do you envision re-adding the functionality of a statement set, that
> fans out to multiple tables? This is a very important use case for data
> pipelines.

Multiple tables are indeed a very important user scenario. In the future, we can consider extending the statement set syntax to support the creation of multiple dynamic tables.

> Since the early days of Flink SQL, we were discussing `SELECT STREAM *
> FROM T EMIT 5 MINUTES`. Your proposal seems to rephrase STREAM and EMIT
> into other keywords, DYNAMIC TABLE and FRESHNESS. But the core
> functionality is still there. I'm wondering if we should widen the scope
> (maybe not part of this FLIP but a new FLIP) to follow the standard more
> closely. Making `SELECT * FROM t` bounded by default and use new syntax
> for the dynamic behavior. Flink 2.0 would be the perfect time for this,
> however, it would require careful discussions. What do you think?

The query part indeed requires a separate FLIP for discussion, as it involves changes to the default behavior.
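For reference, the statement set syntax that the fan-out answer above would extend already exists in Flink SQL for writing one pipeline out to several sinks; the commented dynamic-table variant below is purely hypothetical:

```sql
-- Existing Flink SQL: one job fans out to multiple tables.
EXECUTE STATEMENT SET
BEGIN
  INSERT INTO sink_eu SELECT * FROM orders WHERE region = 'EU';
  INSERT INTO sink_us SELECT * FROM orders WHERE region = 'US';
END;

-- A possible (hypothetical) extension for multiple dynamic tables:
-- CREATE STATEMENT SET
-- BEGIN
--   CREATE DYNAMIC TABLE dwd_eu FRESHNESS = ... AS SELECT ...;
--   CREATE DYNAMIC TABLE dwd_us FRESHNESS = ... AS SELECT ...;
-- END;
```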
[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/dynamic_tables

Best,
Ron

Jing Zhang <beyond1...@gmail.com> wrote on Wednesday, March 13, 2024 at 15:19:

> Hi, Lincoln & Ron,
>
> Thanks for the proposal.
>
> I agree with the question raised by Timo.
>
> Besides, I have some other questions.
>
> 1. How to define the query of a dynamic table? Using Flink SQL, or by
> introducing new syntax? If Flink SQL is used, how do we handle the
> differences in SQL between streaming and batch processing? For example, a
> query including a window aggregate based on processing time, or a query
> including a global ORDER BY?
>
> 2. Is modifying the query of a dynamic table allowed, or can we only
> refresh a dynamic table based on its initial query?
>
> 3. How is a dynamic table used? The dynamic table seems to be similar to
> a materialized view. Will we do something like materialized view
> rewriting during optimization?
>
> Best,
> Jing Zhang
>
> Timo Walther <twal...@apache.org> wrote on Wednesday, March 13, 2024 at 01:24:
>
>> Hi Lincoln & Ron,
>>
>> thanks for proposing this FLIP.
>> I think a design similar to what you propose has been in the heads of
>> many people, however, I'm wondering how this will fit into the bigger
>> picture.
>>
>> I haven't deeply reviewed the FLIP yet, but would like to ask some
>> initial questions:
>>
>> Flink introduced the concept of Dynamic Tables many years ago. How does
>> the term "Dynamic Table" fit in with Flink's regular tables, and how
>> does it relate to the Table API?
>>
>> I fear that adding the DYNAMIC TABLE keyword could cause confusion for
>> users, because a term for a regular CREATE TABLE (which can be "kind of
>> dynamic" as well and is backed by a changelog) is then missing. Also
>> given that we call our connectors for those tables DynamicTableSource
>> and DynamicTableSink.
>>
>> In general, I find it contradicting that a TABLE can be "paused" or
>> "resumed". From an English language perspective, this does sound
>> incorrect. In my opinion (without much research yet), a continuous
>> updating trigger should rather be modelled as a CREATE MATERIALIZED VIEW
>> (which users are familiar with?) or a new concept such as a CREATE TASK
>> (that can be paused and resumed?).
>>
>> How do you envision re-adding the functionality of a statement set, that
>> fans out to multiple tables? This is a very important use case for data
>> pipelines.
>>
>> Since the early days of Flink SQL, we were discussing `SELECT STREAM *
>> FROM T EMIT 5 MINUTES`. Your proposal seems to rephrase STREAM and EMIT
>> into other keywords, DYNAMIC TABLE and FRESHNESS. But the core
>> functionality is still there. I'm wondering if we should widen the scope
>> (maybe not part of this FLIP but a new FLIP) to follow the standard more
>> closely. Making `SELECT * FROM t` bounded by default and use new syntax
>> for the dynamic behavior. Flink 2.0 would be the perfect time for this,
>> however, it would require careful discussions. What do you think?
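For context on the "bounded by default" idea quoted above: today the boundedness of a plain `SELECT * FROM t` is decided by a session option rather than by the query itself. A minimal sketch with the existing runtime-mode setting (`t` is a made-up table name):

```sql
-- Existing behavior: the session mode, not the query, decides boundedness.
SET 'execution.runtime-mode' = 'batch';      -- SELECT * FROM t is bounded
SET 'execution.runtime-mode' = 'streaming';  -- the same query is unbounded

-- The proposal would flip the default: a plain query is bounded, and new
-- syntax (historically `SELECT STREAM ... EMIT ...`) opts into streaming.
```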
>> Regards,
>> Timo
>>
>> On 11.03.24 08:23, Ron liu wrote:
>>> Hi, Dev
>>>
>>> Lincoln Lee and I would like to start a discussion about FLIP-435:
>>> Introduce a New Dynamic Table for Simplifying Data Pipelines.
>>>
>>> This FLIP is designed to simplify the development of data processing
>>> pipelines. With Dynamic Tables plus uniform SQL statements and
>>> freshness, users can define batch and streaming transformations to data
>>> in the same way, accelerate ETL pipeline development, and manage task
>>> scheduling automatically.
>>>
>>> For more details, see FLIP-435 [1]. Looking forward to your feedback.
>>>
>>> [1]
>>>
>>> Best,
>>>
>>> Lincoln & Ron