xiangyuf commented on code in PR #26778: URL: https://github.com/apache/flink/pull/26778#discussion_r2199755136
##########
docs/content.zh/release-notes/flink-2.1.md:
##########
@@ -0,0 +1,176 @@
+---
+title: "Release Notes - Flink 2.1"
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Release notes - Flink 2.1
+
+These release notes discuss important aspects, such as configuration, behavior or dependencies,
+that changed between Flink 2.0 and Flink 2.1. Please read these notes carefully if you are
+planning to upgrade your Flink version to 2.1.
+
+### Table SQL / API
+
+#### Realtime AI Function
+
+##### [FLINK-34992](https://issues.apache.org/jira/browse/FLINK-34992), [FLINK-37777](https://issues.apache.org/jira/browse/FLINK-37777)
+
+Since Flink 2.0, we have introduced dedicated syntax for AI models, enabling users to define models
+as easily as creating catalog objects and invoke them like standard functions or table functions in
+SQL statements. In this release, we expanded the `ML_PREDICT` table-valued function (TVF) to perform
+realtime model inference in SQL queries, applying machine learning models to data streams
+seamlessly. The implementation supports both embedded models (including OpenAI) and custom model
+providers, accelerating Flink's evolution from a real-time data processing engine to a unified
+realtime AI platform. Looking ahead, we plan to introduce more AI functions to unlock
+end-to-end experience for real-time data processing, model training, and inference.
+
+See more details about the capabilities and usages of
+Flink's [Model Inference](https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/table/sql/queries/model-inference/).
+
+#### Variant Type
+
+##### [FLINK-37922](https://issues.apache.org/jira/browse/FLINK-37922)
+
+Variant is a new data type for semi-structured data(e.g. JSON), it supports storing any
+semi-structured data, including ARRAY, MAP(with STRING keys), and scalar types—while preserving
+field type information in a JSON-like structure. Unlike ROW and STRUCTURED types, VARIANT provides
+superior flexibility for handling deeply nested and evolving schemas.
+
+Users can use `PARSE_JSON` or`TRY_PARSE_JSON` to convert JSON-formatted VARCHAR data to VARIANT. In
+addition, table formats like Apache Paimon and Iceberg now support the VARIANT type, this enable
+users to efficiently process semi-structured data in Lakehouse using Flink SQL.
+
+#### Structured Type Enhancements
+
+##### [FLINK-37861](https://issues.apache.org/jira/browse/FLINK-37861)
+
+Enabling declare user-defined objects via STRUCTURED TYPE directly in `CREATE TABLE` DDL
+statements, resolving critical type equivalence issues and significantly improving API usability.
+
+#### Delta Join
+
+##### [FLINK-37836](https://issues.apache.org/jira/browse/FLINK-37836)
+
+Introduced a new DeltaJoin operator in stream processing jobs, along with optimizations for simple
+streaming join pipeline. Compared to traditional streaming join, delta join requires significantly
+less state, effectively mitigating issues related to large state, including resource bottlenecks,
+slow checkpointing, and lengthy job recovery times. This feature is enabled by default. More details
+can be found
+at [Delta Join](https://cwiki.apache.org/confluence/display/FLINK/FLIP-486%3A+Introduce+A+New+DeltaJoin)
+
+#### Multi-way Join
+
+##### [FLINK-37859](https://issues.apache.org/jira/browse/FLINK-37859)
+
+Streaming Flink jobs with multiple cascaded streaming joins often experience operational
+instability and performance degradation due to large state sizes. This release introduces a
+multi-way join operator (`StreamingMultiJoinOperator`) that drastically reduces state size
+by eliminating intermediate results. The operator achieves this by processing joins across all input
+streams simultaneously within a single operator instance, storing only raw input records instead of
+propagated join output.
+
+This "zero intermediate state" approach primarily targets state reduction, offering substantial
+benefits in resource consumption and operational stability. This feature is now available for
+pipelines with multiple INNER/LEFT joins that share at least one common join key, enable with
+`SET 'table.optimizer.multi-join.enabled' = 'true'`.
+
+#### Async Lookup Join Enhancements
+
+##### [FLINK-37874](https://issues.apache.org/jira/browse/FLINK-37874)
+
+Support handling records in order based on upsert key (the unique key in the input stream deduced by
+planner) while allowing parallel processing of different keys to achieve better throughput when
+processing changelog data stream.
+
+#### Sink Reuse
+
+##### [FLINK-37227](https://issues.apache.org/jira/browse/FLINK-37227)
+
+Within a single Flink job, when write multiple `INSERT INTO` statements update identical or
+different columns of a target table, the planner will optimize the execution plan and merge the sink

Review Comment:
   Currently, only identical columns are supported.
