Hi Ramin,

Sorry for the late reply. Thanks for the detailed explanation; I think you explained it very well. +1. We should keep CREATE OR REPLACE and CREATE OR ALTER as separate commands, each with its own clear semantics.

Best,
Ron

David Radley <[email protected]> wrote on Tue, Sep 9, 2025 at 16:18:

> Hi Ramin,
> Thanks for the feedback - that makes sense.
> Kind regards, David.
>
> From: Ramin Gharib <[email protected]>
> Date: Tuesday, 9 September 2025 at 08:18
> To: [email protected] <[email protected]>
> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-546: Introduce CREATE OR ALTER for Materialized Tables
>
> Hi Ron,
>
> Thanks again for your detailed feedback and for raising the points about CREATE OR REPLACE
> and the existing FLIP-492.
>
> Based on your comments and our discussion, I've updated the FLIP document to include a more
> detailed rationale in the "Rejected Alternatives" section. This now covers the semantic
> differences between REPLACE and ALTER, and includes the research on industry precedents
> like Snowflake.
>
> I am following up to see if my previous email and the recent updates to the FLIP have
> addressed the concerns you raised.
>
> Of course, this question is open to everyone on the list as well. If there are any other
> thoughts or remaining concerns about this proposal, please let me know.
>
> Best,
>
> Ramin
>
> On Fri, Sep 5, 2025 at 5:10 PM Ramin Gharib <[email protected]> wrote:
>
> > Hi David,
> >
> > Thanks for the feedback and the good questions about the evolution scenario.
> >
> > To clarify the scope of this FLIP, it's focused on introducing the CREATE OR ALTER syntax
> > itself, and not on expanding Flink's underlying schema evolution capabilities. Every
> > (un)supported scenario described in the FLIP is taken from the source code of the ALTER
> > command.
> >
> > The implementation plan for the new CREATE OR ALTER syntax is to delegate directly to the
> > existing ALTER MATERIALIZED TABLE ... AS <select_statement> logic if the table already
> > exists. In other words, the set of evolutions supported through CREATE OR ALTER will be
> > exactly what the standard ALTER command currently supports.
> >
> > So, regarding your specific points:
> >
> > 1. *Changing nullability:* This would only work if the underlying ALTER logic already
> >    supports it.
> > 2. *Castable type changes:* As you noted, this would require a data rewrite and is a
> >    significant feature in itself. It would need to be implemented in the core ALTER
> >    command first. Once it is, CREATE OR ALTER would support it by extension.
> >
> > Essentially, this FLIP provides the idempotent entry point, while a separate, existing
> > component handles the actual evolution logic it triggers. Expanding those capabilities
> > would be a great topic for future FLIPs.
> >
> > Kind regards,
> >
> > Ramin
> >
> > On Fri, Sep 5, 2025 at 3:33 PM David Radley <[email protected]> wrote:
> >
> >> Hi Ramin,
> >> I like this proposal.
> >> I wondered what you thought of supporting evolutions like:
> >>
> >> 1. Changing a column or child object to be nullable?
> >> 2. Allowing column type changes if they were castable? This could require a rewriting
> >>    of the column data - so may not be feasible.
> >>
> >> WDYT?
> >> Kind regards, David.
> >>
> >> From: Ramin Gharib <[email protected]>
> >> Date: Friday, 5 September 2025 at 11:27
> >> To: [email protected] <[email protected]>
> >> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-546: Introduce CREATE OR ALTER for Materialized Tables
> >>
> >> Hi Ron,
> >>
> >> Thanks again for the excellent feedback and for bringing FLIP-492 [1] into the
> >> discussion. I want to address your points on CREATE OR REPLACE vs. CREATE OR ALTER and
> >> the SQL standard.
> >>
> >> *1. Semantics of REPLACE vs. ALTER*
> >>
> >> You’re right that FLIP-492 [1] proposed CREATE OR REPLACE. However, I believe the
> >> semantics of ALTER are a much better fit for an object as complex as a Materialized
> >> Table, which has a running job, state, and physical data associated with it.
> >>
> >> Interestingly, this exact distinction is recognized by other modern data platforms.
> >> For example, *Snowflake supports both CREATE OR REPLACE TABLE and CREATE OR ALTER TABLE
> >> as distinct commands with different use cases [2].*
> >>
> >> - Their documentation for CREATE OR ALTER TABLE explicitly states it *"creates a table
> >>   if it doesn’t exist, or alters it according to the table definition... existing data
> >>   in the table is preserved when possible."* This "modify in place" semantic is
> >>   precisely what this FLIP aims to achieve.
> >> - Conversely, their docs describe CREATE OR REPLACE as *"the equivalent of using
> >>   DROP TABLE on the existing table and then creating a new table..."* This "drop and
> >>   recreate" semantics is a destructive action that is not suitable for a stateful object
> >>   like a Flink Materialized Table, where the goal is to evolve the pipeline, not destroy
> >>   it.
> >>
> >> This is particularly relevant for Flink, as we already have an ALTER MATERIALIZED TABLE
> >> ... AS <select_statement> command that defines the "modify in-place" behavior. My
> >> proposal is that the ALTER path of CREATE OR ALTER should be implemented by delegating
> >> directly to this existing logic. This ensures behavioral consistency. Using REPLACE
> >> would introduce a conflict that doesn't align with our current implementation.
> >>
> >> *2. Forward-Looking Resilience*
> >>
> >> The ALTER semantic is also more forward-looking. As Flink's evolution capabilities
> >> become more sophisticated (potentially including complex reprocessing strategies that
> >> retain historical data), the concept of 'altering' a pipeline is more fitting than
> >> 'replacing' it. REPLACE suggests a simple, destructive action, whereas ALTER provides a
> >> more resilient foundation for nuanced, state-preserving modifications in the future.
> >>
> >> *3. The SQL Standard*
> >>
> >> You've raised an important point about the SQL standard.
> >> A review shows that neither CREATE OR REPLACE nor CREATE OR ALTER is part of the formal
> >> ANSI/ISO SQL standard. They are both widely adopted, vendor-specific extensions.
> >>
> >> - *CREATE OR REPLACE* was popularized by systems like PostgreSQL (for views/functions)
> >>   [3] and Oracle [4].
> >> - *CREATE OR ALTER* was popularized by Microsoft SQL Server [5] and is now a key feature
> >>   in Snowflake [6].
> >>
> >> Since neither is formally "standard," our choice should be guided by which one provides
> >> the most clarity and consistency for Flink users. The fact that a major platform like
> >> Snowflake has implemented both as distinct commands reinforces the idea that the
> >> industry sees value in their different semantics.
> >>
> >> *4. Preventing User Confusion*
> >>
> >> I believe CREATE OR ALTER will actually be *less confusing* for Flink users precisely
> >> because it maps directly to the existing commands they already know. A user familiar
> >> with ALTER MATERIALIZED TABLE ... AS will immediately understand what the ALTER part of
> >> the new command does.
> >>
> >> Given that the implementation for FLIP-492 [1] is not yet complete (FLINK-36995 [7] is
> >> still open), now is a good time to choose (or define) the syntax with the clearest and
> >> safest semantics before any code is merged.
> >>
> >> In summary, I believe CREATE OR ALTER provides clearer, non-destructive semantics that
> >> are more consistent with Flink's existing DDL for Materialized Tables, making it the
> >> superior choice for both current functionality and future resilience.
> >>
> >> Thanks again for the great discussion points. I look forward to hearing your thoughts
> >> and those of the wider community.
> >>
> >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
> >> [2] https://docs.snowflake.com/en/sql-reference/sql/create-table#usage-notes
> >> [3] https://www.postgresql.org/docs/current/sql-createview.html
> >> [4] https://docs.oracle.com/cd/E17952_01/mysql-5.7-en/create-view.html
> >> [5] https://learn.microsoft.com/en-us/sql/t-sql/statements/create-view-transact-sql?view=sql-server-ver17#or-alter
> >> [6] https://docs.snowflake.com/en/sql-reference/sql/create-table#label-create-or-alter-table-syntax
> >> [7] https://issues.apache.org/jira/browse/FLINK-36995
> >>
> >> Best,
> >>
> >> Ramin
> >>
> >> On Thu, Sep 4, 2025 at 1:54 PM Ron Liu <[email protected]> wrote:
> >>
> >> > Hi, Ramin
> >> >
> >> > In FLIP-492 [1], we introduced the `CREATE OR REPLACE MATERIALIZED TABLE` syntax to
> >> > support modifying materialized tables. Can we extend this syntax to achieve the
> >> > functionality you need, such as introducing clause parameters to determine whether to
> >> > replace the entire table or just some of its attributes?
> >> >
> >> > Regarding the `CREATE OR ALTER TABLE` syntax, I couldn't find it in the SQL standard.
> >> > I'm concerned that this new syntax might confuse users and add new usage costs. What
> >> > do you think?
> >> >
> >> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
> >> >
> >> > Best,
> >> > Ron
> >> >
> >> > Ramin Gharib <[email protected]> wrote on Thu, Sep 4, 2025 at 15:04:
> >> >
> >> > > Good morning, Mate,
> >> > > Thanks for the feedback! When I wrote this, the FLIP was not yet closed! I have
> >> > > added the DISTRIBUTED BY/INTO clause to the FLIP.
> >> > >
> >> > > Cheers,
> >> > >
> >> > > Ramin
> >> > >
> >> > > On Wed, Sep 3, 2025 at 7:58 PM Mate Czagany <[email protected]> wrote:
> >> > >
> >> > > > Hi Ramin,
> >> > > >
> >> > > > Thank you for the proposal, I think this new command makes perfect sense and has
> >> > > > the potential to improve IaC pipelines.
> >> > > >
> >> > > > It's missing from the SQL syntax, but are there any plans to make this new command
> >> > > > also support the DISTRIBUTED BY/INTO functionality of FLIP-542 [1] that was just
> >> > > > voted to be implemented yesterday [2]?
> >> > > >
> >> > > > Best Regards
> >> > > > Mate
> >> > > >
> >> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-542%3A+Make+materialized+table+DDL+consistent+with+regular+tables
> >> > > > [2] https://lists.apache.org/thread/40xyjjvklt65tw21o2pmk8r3srprxs5q
> >> > > >
> >> > > > On Tue, Sep 2, 2025 at 3:14 PM Ramin Gharib <[email protected]> wrote:
> >> > > >
> >> > > > > Hi everyone,
> >> > > > >
> >> > > > > I want to start a discussion on a new Flink Improvement Proposal, FLIP-546:
> >> > > > > Introduce CREATE OR ALTER for Materialized Tables [1].
> >> > > > >
> >> > > > > The introduction of Materialized Tables in FLIP-435 [2] was a significant step
> >> > > > > toward simplifying data pipelines. However, as we integrate them into production
> >> > > > > environments managed by automation, a key challenge arises. Modern data
> >> > > > > operations rely on declarative tools (like dbt, Terraform, CI/CD pipelines) that
> >> > > > > require idempotent commands to manage the lifecycle of data assets.
> >> > > > >
> >> > > > > The current CREATE MATERIALIZED TABLE syntax is not idempotent and fails if the
> >> > > > > table already exists.
> >> > > > > This forces developers and tool creators to implement complex and risky
> >> > > > > CREATE IF NOT EXISTS or DROP...IF EXISTS logic, which can lead to data loss and
> >> > > > > complicate the development of reliable data management tools.
> >> > > > >
> >> > > > > This FLIP proposes introducing a CREATE OR ALTER MATERIALIZED TABLE command to
> >> > > > > provide a native, idempotent way to manage materialized tables, aligning Flink
> >> > > > > with best practices for declarative, infrastructure-as-code deployments.
> >> > > > >
> >> > > > > [1] https://cwiki.apache.org/confluence/x/0wnXFg
> >> > > > > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-546%3A+Introduce+CREATE+OR+ALTER+for+Materialized+Tables>
> >> > > > > [2] https://cwiki.apache.org/confluence/x/HYySEQ
> >> > > > >
> >> > > > > Best Regards,
> >> > > > >
> >> > > > > Ramin Gharib
> >>
> >> Unless otherwise stated above:
> >>
> >> IBM United Kingdom Limited
> >> Registered in England and Wales with number 741598
> >> Registered office: Building C, IBM Hursley Office, Hursley Park Road, Winchester, Hampshire SO21 2JN
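
The delegation discussed in this thread (create the table if it is missing, otherwise hand off to the existing ALTER ... AS path) can be sketched in a few lines. This is only an illustration in Python, not Flink code: `InMemoryCatalog`, `create`, `alter_as`, and `create_or_alter` are hypothetical names made up for the sketch, and a materialized table is modeled as nothing more than its query string.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryCatalog:
    """Hypothetical stand-in for a catalog; this is not a Flink API."""

    tables: dict[str, str] = field(default_factory=dict)  # table name -> query text
    log: list[str] = field(default_factory=list)          # operations actually run

    def create(self, name: str, query: str) -> None:
        # Models plain CREATE: it fails when the table already exists.
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = query
        self.log.append(f"CREATE {name}")

    def alter_as(self, name: str, query: str) -> None:
        # Models the existing ALTER ... AS <select_statement> path.
        if name not in self.tables:
            raise KeyError(f"table {name!r} does not exist")
        self.tables[name] = query
        self.log.append(f"ALTER {name}")

    def create_or_alter(self, name: str, query: str) -> None:
        # Idempotent entry point: delegate to the ALTER path if the table
        # exists, otherwise fall back to CREATE.
        if name in self.tables:
            self.alter_as(name, query)
        else:
            self.create(name, query)


if __name__ == "__main__":
    catalog = InMemoryCatalog()
    catalog.create_or_alter("orders_mt", "SELECT id, amount FROM orders")
    catalog.create_or_alter("orders_mt", "SELECT id, amount FROM orders")
    print(catalog.log)
```

Calling `create_or_alter` twice with the same definition leaves the catalog in the same state, whereas a second plain `create` would raise. In this model, whatever `alter_as` supports is exactly what `create_or_alter` supports for an existing table, which is the point Ramin makes about schema evolution.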
