dmvk commented on a change in pull request #17182:
URL: https://github.com/apache/flink/pull/17182#discussion_r706069538



##########
File path: docs/content.zh/release-notes/flink-1.14.md
##########
@@ -0,0 +1,424 @@
+---
+title: "Release Notes - Flink 1.14"
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Release Notes - Flink 1.14
+
+These release notes discuss important aspects, such as configuration, behavior, or dependencies,
+that changed between Flink 1.13 and Flink 1.14. Please read these notes carefully if you are
+planning to upgrade your Flink version to 1.14.
+
+### DataStream API
+
+#### Expose a consistent GlobalDataExchangeMode
+
+##### [FLINK-23402](https://issues.apache.org/jira/browse/FLINK-23402)
+
+The default DataStream API shuffle mode for batch executions has been changed to blocking exchanges
+for all edges of the stream graph. A new option `execution.batch-shuffle-mode` allows changing it
+to pipelined behavior if necessary.
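+
+As a minimal sketch (the option value follows the `BatchShuffleMode` enum; verify it against the
+1.14 configuration docs), the mode can be set through the configuration passed to the environment:
+
+```java
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+Configuration conf = new Configuration();
+// 1.14 defaults to blocking exchanges for batch execution; this switches back
+// to pipelined exchanges (only relevant in batch runtime mode).
+conf.setString("execution.batch-shuffle-mode", "ALL_EXCHANGES_PIPELINED");
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
+```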
+
+#### Allow @TypeInfo annotation on POJO field declarations
+
+##### [FLINK-12141](https://issues.apache.org/jira/browse/FLINK-12141)
+
+`@TypeInfo` annotations can now also be used on POJO fields which, for example, can help to define
+custom serializers for third-party classes that cannot otherwise be annotated themselves.
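+
+A sketch of the new field-level usage (`ThirdPartyType` and `ThirdPartyTypeInfoFactory` are
+hypothetical names; the factory would be a subclass of Flink's `TypeInfoFactory`):
+
+```java
+import org.apache.flink.api.common.typeinfo.TypeInfo;
+
+public class MyPojo {
+    public int id;
+
+    // The factory (a hypothetical TypeInfoFactory subclass) supplies the custom
+    // type information, and hence serializer, for the third-party class.
+    @TypeInfo(ThirdPartyTypeInfoFactory.class)
+    public ThirdPartyType payload;
+}
+```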
+
+### Table & SQL
+
+#### Use pipeline name consistently across DataStream API and Table API
+
+##### [FLINK-23646](https://issues.apache.org/jira/browse/FLINK-23646)
+
+The default job name for DataStream API programs in batch mode has changed from `"Flink Streaming Job"`
+to `"Flink Batch Job"`. A custom name can be set with the config option `pipeline.name`.
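+
+For example (a sketch; the job name shown is a placeholder):
+
+```java
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+Configuration conf = new Configuration();
+// Overrides the new default "Flink Batch Job" in batch mode.
+conf.setString("pipeline.name", "Nightly ETL");
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
+```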
+
+#### Propagate unique keys for fromChangelogStream
+
+##### [FLINK-24033](https://issues.apache.org/jira/browse/FLINK-24033)
+
+Compared to 1.13.2, `StreamTableEnvironment.fromChangelogStream` might produce a different stream
+because primary keys were not properly considered before.
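+
+For illustration, a sketch of declaring the primary key explicitly via a schema (`tEnv`,
+`changelogStream`, and the column name `id` are placeholders):
+
+```java
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.Table;
+
+Table table = tEnv.fromChangelogStream(
+        changelogStream,
+        Schema.newBuilder()
+                .primaryKey("id") // now properly propagated as a unique key
+                .build());
+```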
+
+#### Support new type inference for Table#flatMap
+
+##### [FLINK-16769](https://issues.apache.org/jira/browse/FLINK-16769)
+
+`Table.flatMap()` now supports the new type system. Users are requested to upgrade their functions.
+
+#### Add Scala implicit conversions for new API methods
+
+##### [FLINK-22590](https://issues.apache.org/jira/browse/FLINK-22590)
+
+The Scala implicits that convert between DataStream API and Table API have been updated to the new
+methods of FLIP-136.
+
+The changes might require an update of pipelines that used `toTable` or implicit conversions from
+`Table` to `DataStream[Row]`.
+
+#### Remove YAML environment file support in SQL Client
+
+##### [FLINK-22540](https://issues.apache.org/jira/browse/FLINK-22540)
+
+The `sql-client-defaults.yaml` YAML file was deprecated in the 1.13 release and has been completely
+removed in this release. As an alternative, you can use the `-i` startup option to execute an
+initialization SQL file to set up the SQL Client session. The initialization SQL file can use
+Flink DDLs to define available catalogs, table sources and sinks, user-defined functions, and
+other properties required for execution and deployment.
+
+See more: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sqlclient/#initialize-session-using-sql-files
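+
+A small sketch of such an initialization file (the file name `init.sql` and all statements are
+illustrative):
+
+```sql
+-- init.sql, passed on startup via: ./bin/sql-client.sh -i init.sql
+CREATE CATALOG my_catalog WITH ('type' = 'generic_in_memory');
+USE CATALOG my_catalog;
+SET 'execution.runtime-mode' = 'batch';
+```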
+
+#### Remove the legacy planner code base
+
+##### [FLINK-22864](https://issues.apache.org/jira/browse/FLINK-22864)
+
+The old Table/SQL planner has been removed. BatchTableEnvironment and DataSet API interop with
+Table API are no longer supported. Use the unified TableEnvironment for batch and stream processing
+with the new planner, or the DataStream API in batch execution mode.
+
+Users are encouraged to update their pipelines. Otherwise, Flink 1.13 is the last version that
+offers the old functionality.
+
+#### Remove "blink" suffix from table modules
+
+##### [FLINK-22879](https://issues.apache.org/jira/browse/FLINK-22879)
+
+The following Maven modules have been renamed:
+* flink-table-planner-blink -> flink-table-planner
+* flink-table-runtime-blink -> flink-table-runtime
+* flink-table-uber-blink -> flink-table-uber
+
+It might be required to update job JAR dependencies. Note that flink-table-planner and
+flink-table-uber used to contain the legacy planner before Flink 1.14 and now contain the only
+officially supported planner (previously known as the 'Blink' planner).
+
+#### Remove BatchTableEnvironment and related API classes
+
+##### [FLINK-22877](https://issues.apache.org/jira/browse/FLINK-22877)
+
+Due to the removal of BatchTableEnvironment, BatchTableSource and BatchTableSink have been removed
+as well. Use DynamicTableSource and DynamicTableSink instead. They support the old InputFormat and
+OutputFormat interfaces as runtime providers if necessary.
+
+#### Remove TableEnvironment#connect
+
+##### [FLINK-23063](https://issues.apache.org/jira/browse/FLINK-23063)
+
+The deprecated `TableEnvironment#connect()` method has been removed. Use the
+new `TableEnvironment#createTemporaryTable(String, TableDescriptor)` to create tables
+programmatically. Please note that this method only supports sources and sinks that comply with
+FLIP-95. This is also indicated by the new property design `'connector'='kafka'` instead of
+`'connector.type'='kafka'`.
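+
+A sketch of the new programmatic registration (`tEnv` denotes a `TableEnvironment`; the topic,
+server address, and format are placeholder values):
+
+```java
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.api.Schema;
+import org.apache.flink.table.api.TableDescriptor;
+
+tEnv.createTemporaryTable(
+        "Orders",
+        TableDescriptor.forConnector("kafka")
+                .schema(Schema.newBuilder()
+                        .column("user_id", DataTypes.BIGINT())
+                        .column("amount", DataTypes.DOUBLE())
+                        .build())
+                .option("topic", "orders")
+                .option("properties.bootstrap.servers", "localhost:9092")
+                .format("json")
+                .build());
+```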
+
+#### Deprecate toAppendStream and toRetractStream
+
+##### [FLINK-23330](https://issues.apache.org/jira/browse/FLINK-23330)
+
+The outdated variants of `StreamTableEnvironment.{fromDataStream|toAppendStream|toRetractStream}`
+have been deprecated. Use the `(from|to)(Data|Changelog)Stream` alternatives introduced in 1.13.
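+
+For example (a sketch; `tEnv` and `table` are placeholders):
+
+```java
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.types.Row;
+
+// Deprecated: emits (add/retract flag, row) tuples.
+DataStream<Tuple2<Boolean, Row>> retractStream = tEnv.toRetractStream(table, Row.class);
+
+// Replacement since 1.13: emits rows carrying a changelog RowKind.
+DataStream<Row> changelogStream = tEnv.toChangelogStream(table);
+```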
+
+#### Remove old connectors and formats stack around descriptors
+
+##### [FLINK-23513](https://issues.apache.org/jira/browse/FLINK-23513)
+
+The legacy versions of the SQL Kafka connector and SQL Elasticsearch connector have been removed
+together with their corresponding legacy formats. DDL or descriptors that still use the
+`'connector.type='` or `'format.type='` options need to be updated to the new connectors and
+formats available via the `'connector='` option.
+
+#### Drop BatchTableSource/Sink HBaseTableSource/Sink and related classes
+
+##### [FLINK-22623](https://issues.apache.org/jira/browse/FLINK-22623)
+
+The HBaseTableSource/Sink and related classes, including various HBaseInputFormats and
+HBaseSinkFunction, have been removed. It is possible to read via the Table & SQL API and convert
+the Table to DataStream API (or vice versa) if necessary. The DataSet API is no longer supported.
+
+#### Drop BatchTableSource ParquetTableSource and related classes
+
+##### [FLINK-22622](https://issues.apache.org/jira/browse/FLINK-22622)
+
+The ParquetTableSource and related classes, including various ParquetInputFormats, have been
+removed. Use the filesystem connector with a Parquet format as a replacement. It is possible to
+read via the Table & SQL API and convert the Table to DataStream API if necessary. The DataSet API
+is no longer supported.
+
+#### Drop BatchTableSource OrcTableSource and related classes
+
+##### [FLINK-22620](https://issues.apache.org/jira/browse/FLINK-22620)
+
+The OrcTableSource and related classes (including OrcInputFormat) have been removed. Use the
+filesystem connector with an ORC format as a replacement. It is possible to read via the Table &
+SQL API and convert the Table to DataStream API if necessary. The DataSet API is no longer
+supported.
+
+#### Drop usages of BatchTableEnvironment and old planner in Python
+
+##### [FLINK-22619](https://issues.apache.org/jira/browse/FLINK-22619)
+
+The Python API no longer offers a dedicated BatchTableEnvironment. Instead, users can switch to
+the unified TableEnvironment for both batch and stream processing. Only the Blink planner (the
+only remaining planner in 1.14) is supported.
+
+#### Migrate ModuleFactory to the new factory stack
+
+##### [FLINK-23720](https://issues.apache.org/jira/browse/FLINK-23720)
+
+The `LOAD/UNLOAD MODULE` architecture for table modules has been updated to the new factory stack
+of FLIP-95. Users of this feature should update their `ModuleFactory` implementations.
+
+#### Migrate Table API to new KafkaSink
+
+##### [FLINK-23639](https://issues.apache.org/jira/browse/FLINK-23639)
+
+Table API/SQL now writes to Kafka with the new KafkaSink.
+
+### Connectors
+
+#### Implement FLIP-179: Expose Standardized Operator Metrics
+
+##### [FLINK-23652](https://issues.apache.org/jira/browse/FLINK-23652)
+
+Connectors using the unified Source and Sink interfaces expose certain standardized metrics
+automatically.
+
+#### Port KafkaSink to FLIP-143
+
+##### [FLINK-22902](https://issues.apache.org/jira/browse/FLINK-22902)
+
+KafkaSink supersedes FlinkKafkaProducer and provides efficient exactly-once and at-least-once
+writing with the new unified sink interface, supporting both batch and streaming modes of the
+DataStream API.
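+
+A minimal sketch of the new sink (broker address and topic are placeholders; `stream` denotes an
+existing `DataStream<String>`):
+
+```java
+import org.apache.flink.api.common.serialization.SimpleStringSchema;
+import org.apache.flink.connector.base.DeliveryGuarantee;
+import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
+import org.apache.flink.connector.kafka.sink.KafkaSink;
+
+KafkaSink<String> sink = KafkaSink.<String>builder()
+        .setBootstrapServers("localhost:9092")
+        .setRecordSerializer(KafkaRecordSerializationSchema.builder()
+                .setTopic("output-topic")
+                .setValueSerializationSchema(new SimpleStringSchema())
+                .build())
+        .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
+        .build();
+
+stream.sinkTo(sink);
+```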
+
+#### Deprecate FlinkKafkaConsumer
+
+##### [FLINK-24055](https://issues.apache.org/jira/browse/FLINK-24055)
+
+FlinkKafkaConsumer has been deprecated in favor of KafkaSource, and FlinkKafkaProducer has been
+deprecated in favor of KafkaSink.
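+
+A sketch of migrating to the new source (broker address, topic, and group id are placeholders):
+
+```java
+import org.apache.flink.api.common.eventtime.WatermarkStrategy;
+import org.apache.flink.api.common.serialization.SimpleStringSchema;
+import org.apache.flink.connector.kafka.source.KafkaSource;
+import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
+
+KafkaSource<String> source = KafkaSource.<String>builder()
+        .setBootstrapServers("localhost:9092")
+        .setTopics("input-topic")
+        .setGroupId("my-group")
+        .setStartingOffsets(OffsetsInitializer.earliest())
+        .setValueOnlyDeserializer(new SimpleStringSchema())
+        .build();
+
+env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source");
+```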
+
+#### Pulsar Source
+
+##### [FLINK-20731](https://issues.apache.org/jira/browse/FLINK-20731)
+
+Flink now directly provides a way to read data from Pulsar with the DataStream API.
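+
+A sketch along the lines of the 1.14 documentation (service/admin URLs, topic, and subscription
+name are placeholders; check the connector docs for the exact builder API):
+
+```java
+import org.apache.flink.api.common.eventtime.WatermarkStrategy;
+import org.apache.flink.api.common.serialization.SimpleStringSchema;
+import org.apache.flink.connector.pulsar.source.PulsarSource;
+import org.apache.flink.connector.pulsar.source.enumerator.cursor.StartCursor;
+import org.apache.flink.connector.pulsar.source.reader.deserializer.PulsarDeserializationSchema;
+
+PulsarSource<String> source = PulsarSource.builder()
+        .setServiceUrl("pulsar://localhost:6650")
+        .setAdminUrl("http://localhost:8080")
+        .setTopics("my-topic")
+        .setSubscriptionName("my-subscription")
+        .setStartCursor(StartCursor.earliest())
+        .setDeserializationSchema(PulsarDeserializationSchema.flinkSchema(new SimpleStringSchema()))
+        .build();
+
+env.fromSource(source, WatermarkStrategy.noWatermarks(), "Pulsar Source");
+```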
+
+#### Add Support for Azure Data Lake Store Gen 2 in Flink File System
+
+##### [FLINK-18562](https://issues.apache.org/jira/browse/FLINK-18562)
+
+Flink now supports reading from Azure Data Lake Store Gen 2 with the `flink-azure-fs-hadoop`
+filesystem using the `abfs(s)` scheme.
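+
+For example (a sketch; the account, container, and path are placeholders, and suitable credentials
+must be configured for the filesystem):
+
+```java
+import org.apache.flink.streaming.api.datastream.DataStream;
+
+// Requires flink-azure-fs-hadoop on the classpath.
+DataStream<String> lines =
+        env.readTextFile("abfss://my-container@my-account.dfs.core.windows.net/path/to/data");
+```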
+
+#### InputStatus should not contain END_OF_RECOVERY
+
+##### [FLINK-23474](https://issues.apache.org/jira/browse/FLINK-23474)
+
+`InputStatus.END_OF_RECOVERY` has been removed. It was an internal flag that should never be
+returned from SourceReaders. Returning that value in earlier versions might lead to misbehavior.
+
+#### Connector-base exposes dependency to flink-core
+
+##### [FLINK-22964](https://issues.apache.org/jira/browse/FLINK-22964)
+
+Connectors no longer transitively hold a reference to `flink-core`. That means that, with this
+fix, a fat JAR containing a connector does not include `flink-core`.
+
+### Runtime & Coordination
+
+#### Increase akka.ask.timeout for tests using the MiniCluster
+
+##### [FLINK-23906](https://issues.apache.org/jira/browse/FLINK-23906)
+
+The default `akka.ask.timeout` used by the `MiniCluster` has been increased to 5 minutes. If you
+want to use a smaller value, you have to set it explicitly in the passed configuration.
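+
+A sketch for tests (assuming the `MiniClusterWithClientResource` from `flink-test-utils`; the
+timeout value is illustrative):
+
+```java
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+
+Configuration conf = new Configuration();
+conf.setString("akka.ask.timeout", "30 s"); // override the new 5 min default
+
+MiniClusterWithClientResource miniCluster = new MiniClusterWithClientResource(
+        new MiniClusterResourceConfiguration.Builder()
+                .setConfiguration(conf)
+                .build());
+```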
+
+#### The node IP obtained in NodePort mode is a VIP
+
+##### [FLINK-23507](https://issues.apache.org/jira/browse/FLINK-23507)
+
+When using `kubernetes.rest-service.exposed.type=NodePort`, the connection string for the REST
+gateway is now correctly constructed in the form `<nodeIp>:<nodePort>` instead of
+`<kubernetesApiServerUrl>:<nodePort>`. This may be a breaking change for some users.
+
+This also introduces a new config option `kubernetes.rest-service.exposed.node-port-address-type`
+that lets you select the `<nodeIp>` from a desired range.
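+
+For example (a sketch; the address type value is an assumption to check against the 1.14
+configuration docs):
+
+```java
+import org.apache.flink.configuration.Configuration;
+
+Configuration conf = new Configuration();
+conf.setString("kubernetes.rest-service.exposed.type", "NodePort");
+// Assumed value; selects which node address range the <nodeIp> is taken from.
+conf.setString("kubernetes.rest-service.exposed.node-port-address-type", "ExternalIP");
+```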
+
+#### Timeout heartbeat if the heartbeat target is no longer reachable
+
+##### [FLINK-23209](https://issues.apache.org/jira/browse/FLINK-23209)
+
+Flink now supports detecting dead TaskManagers via the number of consecutive failed heartbeat
+RPCs. The threshold until a TaskManager is marked as unreachable can be configured via
+`heartbeat.rpc-failure-threshold`. This can speed up the detection of dead TaskManagers
+significantly.
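+
+For example (a sketch; the threshold value is illustrative):
+
+```java
+import org.apache.flink.configuration.Configuration;
+
+Configuration conf = new Configuration();
+// Mark a TaskManager as unreachable after this many consecutive failed heartbeat RPCs.
+conf.setInteger("heartbeat.rpc-failure-threshold", 2);
+```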
+
+#### RpcService should fail result futures if messages could not be sent
+
+##### [FLINK-23202](https://issues.apache.org/jira/browse/FLINK-23202)
+
+Flink now fails RPC requests that cannot be delivered immediately, instead of waiting for
+the `akka.ask.timeout`. This has the effect that certain operations, such as cancelling tasks on a
+dead `TaskExecutor`, will no longer delay the restart of jobs by `akka.ask.timeout`. This might
+increase the number of restart attempts Flink performs in a given time interval. Hence, we
+recommend adjusting the restart delay to compensate for the faster completion of RPCs if this
+behavior becomes a problem.
+
+#### Count and fail the task when a disk error occurs on the JobManager
+
+##### [FLINK-23189](https://issues.apache.org/jira/browse/FLINK-23189)
+
+IOExceptions thrown while triggering checkpoints on the JobManager (for example, errors while
+creating directories) are now counted against the maximum number of tolerable checkpoint failures.
+
+The number of tolerable checkpoint failures can be adjusted or disabled via
+`org.apache.flink.streaming.api.environment.CheckpointConfig#setTolerableCheckpointFailureNumber`
+(which is disabled by default).
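+
+For example (a sketch; the interval and threshold values are illustrative):
+
+```java
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.enableCheckpointing(60_000); // checkpoint every minute
+// Tolerate up to 3 checkpoint failures before failing the job.
+env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);
+```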
+
+#### Refine ShuffleMaster lifecycle management for pluggable shuffle service framework
+
+##### [FLINK-22910](https://issues.apache.org/jira/browse/FLINK-22910)
+
+We improved the ShuffleMaster interface by adding some lifecycle methods, including open, close,
+registerJob and unregisterJob. Besides, the ShuffleMaster is now a cluster-level service which can
+be shared by multiple jobs. This is a breaking change to the pluggable shuffle service framework,
+and customized shuffle plugins need to adapt to the new interface accordingly.
+
+#### Group job specific ZooKeeper HA services under common jobs/<JobID> zNode
+
+##### [FLINK-22636](https://issues.apache.org/jira/browse/FLINK-22636)
+
+The ZooKeeper job-specific HA services are now grouped under a zNode with the respective `JobID`.
+Moreover, the config options `high-availability.zookeeper.path.latch`,
+`high-availability.zookeeper.path.leader`, `high-availability.zookeeper.path.checkpoints`
+and `high-availability.zookeeper.path.checkpoint-counter` have been removed and thus no longer
+have an effect.
+
+#### Fallback value for taskmanager.slot.timeout
+
+##### [FLINK-22002](https://issues.apache.org/jira/browse/FLINK-22002)
+
+The config option `taskmanager.slot.timeout` now falls back to `akka.ask.timeout` if no value has
+been configured.
+
+#### DuplicateJobSubmissionException after JobManager failover
+
+##### [FLINK-21928](https://issues.apache.org/jira/browse/FLINK-21928)
+
+The fix for this problem only works if the ApplicationMode is used with a single job submission
+and the user code does not access the `JobExecutionResult`. If either of these conditions is
+violated, Flink cannot guarantee that the whole Flink application is executed.
+
+Additionally, it is still required that the user cleans up the corresponding HA entries for the
+running jobs registry, because these entries won't be reliably cleaned up when encountering the
+situation described by FLINK-21928.
+
+#### Zookeeper node under leader and leaderlatch is not deleted after job finished
+
+##### [FLINK-20695](https://issues.apache.org/jira/browse/FLINK-20695)
+
+The `HighAvailabilityServices` have received a new method `cleanupJobData`, which can be
+implemented in order to clean up job-related HA data after a given job has terminated.
+
+### Checkpoints
+
+#### Change local alignment timeout back to the global time out
+
+##### [FLINK-23041](https://issues.apache.org/jira/browse/FLINK-23041)
+
+The semantics of the `alignmentTimeout` configuration have changed: it now specifies the time
+between the start of a checkpoint (on the checkpoint coordinator) and the time when the checkpoint
+barrier is received by a task.
+
+#### Disable unaligned checkpoints for broadcast partitioning
+
+##### [FLINK-22815](https://issues.apache.org/jira/browse/FLINK-22815)
+
+Unaligned checkpoints were disabled for BROADCAST exchanges.
+
+Broadcast partitioning cannot work with unaligned checkpointing. There are no guarantees that
+records are consumed at the same rate in all channels. This can result in some tasks applying the
+state changes corresponding to a certain broadcasted event while others don't, which in turn may
+lead to an inconsistent state upon restore.
+
+#### DefaultCompletedCheckpointStore drops unrecoverable checkpoints silently
+
+##### [FLINK-22502](https://issues.apache.org/jira/browse/FLINK-22502)
+
+On recovery, if a failure occurs during retrieval of a checkpoint, the job is restarted (instead
+of skipping the checkpoint in some circumstances). This prevents potential consistency violations.
+
+#### Recover checkpoints when JobMaster gains leadership
+
+##### [FLINK-22483](https://issues.apache.org/jira/browse/FLINK-22483)

Review comment:
       👍
   
   I think we should leave a small note that this changes the interface, as it's public facing (even though 99.99% of users won't touch it, we still allow them to override it).
   
   How about:
   
   ```
   Flink no longer re-loads checkpoint metadata from the external storage before restoring the task state after a failover (except when the JobManager fails over / changes leadership). This results in less external I/O and faster failover.
   
   Please note that this changes a public interface around `CompletedCheckpointStore`, which we allow to be overridden by providing a custom implementation of the HA services.
   ```



