Hi David,

Thanks for the feedback and the good questions about the evolution scenario.

To clarify the scope of this FLIP: it is focused on introducing the CREATE OR
ALTER syntax itself, not on expanding Flink's underlying schema evolution
capabilities. Every (un)supported scenario described in the FLIP is taken from
the source code of the ALTER command.

The implementation plan for the new CREATE OR ALTER syntax is to delegate
directly to the existing ALTER MATERIALIZED TABLE ... AS <select_statement>
logic if the table already exists. In other words, the set of evolutions
supported through CREATE OR ALTER will be exactly what the standard ALTER
command currently supports.

So, regarding your specific points:

1. *Changing nullability:* This would only work if the underlying ALTER logic
   already supports it.
2. *Castable type changes:* As you noted, this would require a data rewrite and
   is a significant feature in itself. It would need to be implemented in the
   core ALTER command first. Once it is, CREATE OR ALTER would support it by
   extension.

Essentially, this FLIP provides the idempotent entry point, while a separate,
existing component handles the actual evolution logic it triggers. Expanding
those capabilities would be a great topic for future FLIPs.

Kind regards,
Ramin

On Fri, Sep 5, 2025 at 3:33 PM David Radley <[email protected]> wrote:

> Hi Ramin,
> I like this proposal.
> I wondered what you thought of supporting evolutions like:
>
> 1. Changing a column or child object to be nullable?
> 2. Allowing column type changes if they were castable? This could require a
>    rewriting of the column data - so may not be feasible.
>
> WDYT?
> Kind regards, David.
>
> From: Ramin Gharib <[email protected]>
> Date: Friday, 5 September 2025 at 11:27
> To: [email protected] <[email protected]>
> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-546: Introduce CREATE OR ALTER for
> Materialized Tables
>
> Hi Ron,
>
> Thanks again for the excellent feedback and for bringing FLIP-492 [1] into
> the discussion.
> I want to address your points on CREATE OR REPLACE vs. CREATE OR ALTER and
> the SQL standard.
>
> *1. Semantics of REPLACE vs. ALTER*
>
> You’re right that FLIP-492 [1] proposed CREATE OR REPLACE. However, I
> believe the semantics of ALTER are a much better fit for an object as
> complex as a Materialized Table, which has a running job, state, and
> physical data associated with it.
>
> Interestingly, this exact distinction is recognized by other modern data
> platforms. For example, *Snowflake supports both CREATE OR REPLACE TABLE
> and CREATE OR ALTER TABLE as distinct commands with different use cases
> [2].*
>
> - Their documentation for CREATE OR ALTER TABLE explicitly states it
>   *"creates a table if it doesn’t exist, or alters it according to the
>   table definition... existing data in the table is preserved when
>   possible."* This "modify in place" semantic is precisely what this FLIP
>   aims to achieve.
> - Conversely, their docs describe CREATE OR REPLACE as *"the equivalent of
>   using DROP TABLE on the existing table and then creating a new
>   table..."* This "drop and recreate" semantics is a destructive action
>   that is not suitable for a stateful object like a Flink Materialized
>   Table, where the goal is to evolve the pipeline, not destroy it.
>
> This is particularly relevant for Flink, as we already have an ALTER
> MATERIALIZED TABLE ... AS <select_statement> command that defines the
> "modify in-place" behavior. My proposal is that the ALTER path of CREATE OR
> ALTER should be implemented by delegating directly to this existing logic.
> This ensures behavioral consistency. Using REPLACE would introduce a
> conflict that doesn't align with our current implementation.
>
> *2. Forward-Looking Resilience*
>
> The ALTER semantic is also more forward-looking.
> As Flink's evolution capabilities become more sophisticated (potentially
> including complex reprocessing strategies that retain historical data), the
> concept of 'altering' a pipeline is more fitting than 'replacing' it.
> REPLACE suggests a simple, destructive action, whereas ALTER provides a
> more resilient foundation for nuanced, state-preserving modifications in
> the future.
>
> *3. The SQL Standard*
>
> You've raised an important point about the SQL standard. A review shows
> that neither CREATE OR REPLACE nor CREATE OR ALTER is part of the formal
> ANSI/ISO SQL standard. They are both widely adopted, vendor-specific
> extensions.
>
> - *CREATE OR REPLACE* was popularized by systems like PostgreSQL (for
>   views/functions) [3] and Oracle [4].
> - *CREATE OR ALTER* was popularized by Microsoft SQL Server [5] and is now
>   a key feature in Snowflake [6].
>
> Since neither is formally "standard," our choice should be guided by which
> one provides the most clarity and consistency for Flink users. The fact
> that a major platform like Snowflake has implemented both as distinct
> commands reinforces the idea that the industry sees value in their
> different semantics.
>
> *4. Preventing User Confusion*
>
> I believe CREATE OR ALTER will actually be *less confusing* for Flink users
> precisely because it maps directly to the existing commands they already
> know. A user familiar with ALTER MATERIALIZED TABLE ... AS will immediately
> understand what the ALTER part of the new command does.
>
> Given that the implementation for FLIP-492 [1] is not yet complete
> (FLINK-36995 [7] is still open), now is a good time to choose (or define)
> the syntax with the clearest and safest semantics before any code is
> merged.
>
> In summary, I believe CREATE OR ALTER provides clearer, non-destructive
> semantics that are more consistent with Flink's existing DDL for
> Materialized Tables, making it the superior choice for both current
> functionality and future resilience.
>
> Thanks again for the great discussion points. I look forward to hearing
> your thoughts and those of the wider community.
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
> [2] https://docs.snowflake.com/en/sql-reference/sql/create-table#usage-notes
> [3] https://www.postgresql.org/docs/current/sql-createview.html
> [4] https://docs.oracle.com/cd/E17952_01/mysql-5.7-en/create-view.html
> [5] https://learn.microsoft.com/en-us/sql/t-sql/statements/create-view-transact-sql?view=sql-server-ver17#or-alter
> [6] https://docs.snowflake.com/en/sql-reference/sql/create-table#label-create-or-alter-table-syntax
> [7] https://issues.apache.org/jira/browse/FLINK-36995
>
> Best,
>
> Ramin
>
> On Thu, Sep 4, 2025 at 1:54 PM Ron Liu <[email protected]> wrote:
>
> > Hi, Ramin
> >
> > In FLIP-492 [1], we introduced the `CREATE OR REPLACE MATERIALIZED TABLE`
> > syntax to support modifying materialized tables. Can we extend this
> > syntax to achieve the functionality you need, such as introducing clause
> > parameters to determine whether to replace the entire table or just some
> > of its attributes?
> >
> > Regarding the `CREATE OR ALTER TABLE` syntax, I couldn't find it in the
> > SQL standard. I'm concerned that this new syntax might confuse users and
> > add new usage costs. What do you think?
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
> >
> > Best,
> > Ron
> >
> > On Thu, Sep 4, 2025 at 15:04, Ramin Gharib <[email protected]> wrote:
> >
> > > Good morning, Mate,
> > > Thanks for the feedback! When I wrote this, the FLIP was not yet closed!
> > > I have added the DISTRIBUTED BY/INTO clause to the FLIP.
> > >
> > > Cheers,
> > >
> > > Ramin
> > >
> > > On Wed, Sep 3, 2025 at 7:58 PM Mate Czagany <[email protected]> wrote:
> > >
> > > > Hi Ramin,
> > > >
> > > > Thank you for the proposal, I think this new command makes perfect
> > > > sense and has the potential to improve IaC pipelines.
> > > >
> > > > It's missing from the SQL syntax, but are there any plans to make this
> > > > new command also support the DISTRIBUTED BY/INTO functionality of
> > > > FLIP-542 [1] that was just voted to be implemented yesterday [2]?
> > > >
> > > > Best Regards
> > > > Mate
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-542%3A+Make+materialized+table+DDL+consistent+with+regular+tables
> > > > [2] https://lists.apache.org/thread/40xyjjvklt65tw21o2pmk8r3srprxs5q
> > > >
> > > > On Tue, Sep 2, 2025 at 3:14 PM Ramin Gharib <[email protected]> wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I want to start a discussion on a new Flink Improvement Proposal,
> > > > > FLIP-546: Introduce CREATE OR ALTER for Materialized Tables [1].
> > > > >
> > > > > The introduction of Materialized Tables in FLIP-435 [2] was a
> > > > > significant step toward simplifying data pipelines. However, as we
> > > > > integrate them into production environments managed by automation, a
> > > > > key challenge arises. Modern data operations rely on declarative
> > > > > tools (like dbt, Terraform, CI/CD pipelines) that require idempotent
> > > > > commands to manage the lifecycle of data assets.
> > > > >
> > > > > The current CREATE MATERIALIZED TABLE syntax is not idempotent and
> > > > > fails if the table already exists.
> > > > > This forces developers and tool creators to implement complex and
> > > > > risky CREATE IF NOT EXISTS or DROP...IF EXISTS logic that can lead
> > > > > to data loss and complicate the development of reliable data
> > > > > management tools.
> > > > >
> > > > > This FLIP proposes introducing a CREATE OR ALTER MATERIALIZED TABLE
> > > > > command to provide a native, idempotent way to manage materialized
> > > > > tables, aligning Flink with best practices for declarative,
> > > > > infrastructure-as-code deployments.
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/x/0wnXFg
> > > > > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-546%3A+Introduce+CREATE+OR+ALTER+for+Materialized+Tables>
> > > > > [2] https://cwiki.apache.org/confluence/x/HYySEQ
> > > > >
> > > > > Best Regards,
> > > > >
> > > > > Ramin Gharib
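
As a rough illustration of the delegation described at the top of this thread (CREATE OR ALTER falls through to the existing ALTER ... AS logic when the table already exists, and creates the table otherwise), here is a minimal Python sketch. It is a toy model only: `ToyCatalog`, `MaterializedTable`, `create`, `alter_as`, and `create_or_alter` are hypothetical names invented for this example and are not Flink APIs, and the sketch does not model the validation the real ALTER command performs on unsupported changes.

```python
from dataclasses import dataclass, field


@dataclass
class MaterializedTable:
    """Toy stand-in for a materialized table: a defining query plus its data."""
    query: str
    rows: list = field(default_factory=list)


class ToyCatalog:
    """Hypothetical in-memory catalog; this is NOT Flink's Catalog API."""

    def __init__(self):
        self.tables = {}

    def create(self, name, query):
        # Mirrors plain CREATE: not idempotent, fails if the table exists.
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = MaterializedTable(query)

    def alter_as(self, name, query):
        # Stands in for ALTER MATERIALIZED TABLE ... AS <select_statement>:
        # the definition changes in place and existing rows are kept.
        self.tables[name].query = query

    def create_or_alter(self, name, query):
        # The idempotent entry point: delegate to the existing ALTER path
        # when the table exists, otherwise create it.
        if name in self.tables:
            self.alter_as(name, query)
        else:
            self.create(name, query)


catalog = ToyCatalog()
catalog.create_or_alter("orders_mt", "SELECT id FROM orders")
catalog.tables["orders_mt"].rows.append((1,))  # pretend the job produced data
# Re-running with a changed definition alters in place instead of failing.
catalog.create_or_alter("orders_mt", "SELECT id, amount FROM orders")
```

In this toy model the second `create_or_alter` call updates the query while the previously written row survives, which is the "modify in place" behavior argued for above, as opposed to a drop-and-recreate.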
