[flink-web] 01/02: Add Apache Flink release 1.13.0

dwysakowicz Mon, 03 May 2021 06:32:51 -0700

This is an automated email from the ASF dual-hosted git repository.

dwysakowicz pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git


commit 698b911b4b93ed7e703d8c42ffaf49c4593a9185
Author: Dawid Wysakowicz <[email protected]>
AuthorDate: Fri Apr 23 15:12:01 2021 +0200

    Add Apache Flink release 1.13.0
    
    Co-authored-by: Stephan Ewen <[email protected]>
---
 _posts/2021-05-03-release-1.13.0.md                | 590 ++++++++++++++
 content/img/blog/2021-05-03-release-1.13.0/7.png   | Bin 0 -> 169877 bytes
 .../blog/2021-05-03-release-1.13.0/bottleneck.png  | Bin 0 -> 256667 bytes
 content/news/2021/05/03/release-1.13.0.html        | 852 +++++++++++++++++++++
 img/blog/2021-05-03-release-1.13.0/7.png           | Bin 0 -> 169877 bytes
 img/blog/2021-05-03-release-1.13.0/bottleneck.png  | Bin 0 -> 256667 bytes
 6 files changed, 1442 insertions(+)

diff --git a/_posts/2021-05-03-release-1.13.0.md b/_posts/2021-05-03-release-1.13.0.md
new file mode 100644
index 0000000..793eaa1
--- /dev/null
+++ b/_posts/2021-05-03-release-1.13.0.md
@@ -0,0 +1,590 @@
+---
+layout: post 
+title:  "Apache Flink 1.13.0 Release Announcement"
+date: 2021-05-03T08:00:00.000Z
+categories: news 
+authors:
+- stephan:
+  name: "Stephan Ewen"
+  twitter: "StephanEwen"
+- dwysakowicz:
+  name: "Dawid Wysakowicz"
+  twitter: "dwysakowicz"
+
+excerpt: The Apache Flink community is excited to announce the release of 
Flink 1.13.0! Around 200 contributors worked on over 1,000 issues to bring 
significant improvements to usability and observability as well as new features 
that improve the elasticity of Flink's Application-style deployments.
+---
+
+
+The Apache Flink community is excited to announce the release of Flink 1.13.0! 
More than 200
+contributors worked on over 1,000 issues for this new version.
+
+The release brings us a big step forward in one of our major efforts: **Making 
Stream Processing
+Applications as natural and as simple to manage as any other application.** 
The new *reactive scaling*
+mode means that scaling streaming applications in and out now works like in 
any other application
+by just changing the number of parallel processes.
+
+The release also prominently features a **series of improvements that help 
users better understand the performance of
+applications.** When the streams don't flow as fast as you'd hope, these can 
help you to understand
+why: Load and *backpressure visualization* to identify bottlenecks, *CPU flame 
graphs* to identify hot
+code paths in your application, and *State Access Latencies* to see how the 
State Backends are keeping
+up.
+
+Beyond those features, the Flink community has added a ton of improvements all 
over the system,
+some of which we discuss in this article. We hope you enjoy the new release 
and features.
+Towards the end of the article, we describe changes to be aware of when 
upgrading
+from earlier versions of Apache Flink.
+
+{% toc %}
+
+We encourage you to [download the 
release](https://flink.apache.org/downloads.html) and share your
+feedback with the community through
+the [Flink mailing 
lists](https://flink.apache.org/community.html#mailing-lists)
+or [JIRA](https://issues.apache.org/jira/projects/FLINK/summary).
+
+----
+
+# Notable features
+
+## Reactive scaling
+
+Reactive scaling is the latest piece in Flink's initiative to make Stream 
Processing
+Applications as natural and as simple to manage as any other application.
+
+Flink has a dual nature when it comes to resource management and deployments: 
You can deploy
+Flink applications onto resource orchestrators like Kubernetes or Yarn in such 
a way that Flink actively manages
+the resources and allocates and releases workers as needed. That is especially 
useful for jobs and
+applications that rapidly change their required resources, like batch 
applications and ad-hoc SQL
+queries. The application parallelism rules, the number of workers follows. In 
the context of Flink
+applications, we call this *active scaling*.
+
+For long-running streaming applications, it is often a nicer model to just deploy them like any
+other long-running application: The application doesn't really need to know that it runs on K8s,
+EKS, Yarn, etc. and doesn't try to acquire a specific number of workers; instead, it just uses the
+number of workers that are given to it. The number of workers rules, and the application parallelism
+adjusts to that. In the context of Flink, we call that *reactive scaling*.
+
+The [Application Deployment Mode]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/concepts/flink-architecture/#flink-application-execution)
+started this effort, making deployments more application-like (by avoiding two 
separate deployment
+steps to (1) start a cluster and (2) submit an application). The reactive scaling mode completes this,
+and you no longer need extra tools (scripts or a K8s operator) to keep the number
+of workers and the application parallelism settings in sync.
+
+You can now put an autoscaler around Flink applications like around other typical applications, as
+long as you are mindful about the cost of rescaling when configuring the autoscaler: Stateful
+streaming applications must move state around when scaling.
+
+To try the reactive-scaling mode, add the `scheduler-mode: reactive` config 
entry and deploy
+an application cluster ([standalone]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/resource-providers/standalone/overview/#application-mode)
 or [Kubernetes]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster)).
 Check out [the reactive scaling docs]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/elastic_scaling/#reactive-mode) for 
more details.
+
+
+## Analyzing application performance
+
+As with any application, analyzing and understanding the performance of a Flink application
+is critical. It is often even more critical, because Flink applications are typically data-intensive
+(processing high volumes of data) and are at the same time expected to provide results within
+(near-) real-time latencies.
+
+When an application doesn't keep up with the data rate anymore, or an application takes more
+resources than you would expect, these new tools can help you track down the causes:
+
+**Bottleneck detection, Back Pressure monitoring**
+
+The first question during performance analysis is often: Which operation is 
the bottleneck?
+
+To help answer that, Flink exposes metrics about the degree to which tasks are 
*busy* (doing work)
+and *back-pressured* (have the capacity to do work but cannot because their 
successor operators
+cannot accept more results). Candidates for bottlenecks are the busy operators 
whose predecessors
+are back-pressured.
+
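The rule of thumb above (busy operators whose predecessors are back-pressured) can be sketched in a few lines of plain Python. This is only an illustration of the reasoning, not Flink's implementation: the function name, the thresholds, and the operator names and ratios are all hypothetical values chosen for the example.

```python
from typing import Dict, List, Tuple


def bottleneck_candidates(
    edges: List[Tuple[str, str]],
    busy: Dict[str, float],
    backpressured: Dict[str, float],
    busy_threshold: float = 0.8,          # hypothetical cut-off
    backpressure_threshold: float = 0.5,  # hypothetical cut-off
) -> List[str]:
    """Return busy operators that have at least one back-pressured predecessor."""
    # Build a map from each operator to its direct upstream operators.
    predecessors: Dict[str, List[str]] = {}
    for upstream, downstream in edges:
        predecessors.setdefault(downstream, []).append(upstream)

    candidates = []
    for operator, busy_ratio in busy.items():
        if busy_ratio < busy_threshold:
            continue
        if any(
            backpressured.get(p, 0.0) >= backpressure_threshold
            for p in predecessors.get(operator, [])
        ):
            candidates.append(operator)
    return sorted(candidates)


# A made-up linear pipeline: source -> map -> window -> sink
edges = [("source", "map"), ("map", "window"), ("window", "sink")]
busy = {"source": 0.05, "map": 0.10, "window": 0.97, "sink": 0.30}
backpressured = {"source": 0.90, "map": 0.85, "window": 0.0, "sink": 0.0}

print(bottleneck_candidates(edges, busy, backpressured))  # ['window']
```

In this made-up pipeline the `window` operator is almost always busy while everything upstream of it is waiting, which is the pattern the color-coded job graph is meant to make visible.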
+Flink 1.13 brings an improved back pressure metric system (using task mailbox 
timings rather than
+thread stack sampling), and a reworked graphical representation of the job's 
dataflow with color-coding
+and ratios for busyness and backpressure.
+
+<figure style="align-content: center">
+  <img src="{{ site.baseurl 
}}/img/blog/2021-05-03-release-1.13.0/bottleneck.png" style="width: 900px"/>
+</figure>
+
+**CPU flame graphs in Web UI**
+
+The next question during performance analysis is typically: Which part of the work in the bottlenecked
+operator is expensive?
+
+One visually effective means to investigate that is *Flame Graphs*. They help answer questions like:
+  - Which methods are currently consuming CPU resources?
+  - How does one method's CPU consumption compare to other methods?
+  - Which series of calls on the stack led to executing a particular method?
+  
+Flame Graphs are constructed by repeatedly sampling the thread stack traces. 
Every method call is
+represented by a bar, where the length of the bar is proportional to the 
number of times it is present
+in the samples. When enabled, the graphs are shown in a new UI component for 
the selected operator.
+
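To make the construction concrete, here is a minimal sketch of how sampled stack traces can be folded into bar widths. It assumes each sample is a list of method names from the outermost to the innermost call; the function names and the sample data are invented for illustration and do not reflect Flink's internal sampling code.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

CallPath = Tuple[str, ...]


def fold_stacks(samples: Iterable[List[str]]) -> Counter:
    """Count how many samples contain each call path (a root-first stack prefix)."""
    counts: Counter = Counter()
    for stack in samples:
        for depth in range(1, len(stack) + 1):
            counts[tuple(stack[:depth])] += 1
    return counts


def bar_widths(counts: Counter, total_samples: int) -> Dict[CallPath, float]:
    """Relative bar width of each call path: the fraction of samples it appears in."""
    return {path: n / total_samples for path, n in counts.items()}


# Four hypothetical samples of one task thread
samples = [
    ["run", "processElement", "deserialize"],
    ["run", "processElement", "deserialize"],
    ["run", "processElement", "updateState"],
    ["run", "emitWatermark"],
]

counts = fold_stacks(samples)
widths = bar_widths(counts, len(samples))

for path in sorted(widths):
    print(" -> ".join(path), widths[path])
```

Here `run` spans the full width because it is present in every sample, while `deserialize` covers half of it. More samples give a more faithful picture, which is also why collecting the graphs adds overhead.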
+<figure style="align-content: center">
+  <img src="{{ site.baseurl }}/img/blog/2021-05-03-release-1.13.0/7.png" 
style="display: block; margin-left: auto; margin-right: auto; width: 600px"/>
+</figure>
+
+Flame graphs are expensive to create: They may cause processing overhead and 
can put a heavy load
+on Flink's metric system. Because of that, users need to explicitly enable 
them in the configuration.
+
+**Access Latency Metrics for State**
+
+Another possible performance bottleneck can be the state backend, especially 
when your state is larger
+than the main memory available to Flink and you are using the [RocksDB state 
backend](
+{{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/ops/state/state_backends/#the-embeddedrocksdbstatebackend).
+
+That's not saying RocksDB is slow (we love RocksDB!), but it has some 
requirements to achieve
+good performance. For example, it is easy to accidentally [starve RocksDB's 
demand for IOPs on cloud setups with
+the wrong type of disk 
resources](https://www.ververica.com/blog/the-impact-of-disks-on-rocksdb-state-backend-in-flink-a-case-study).
+
+On top of the CPU flame graphs, the new *state backend latency metrics* can 
help you understand whether
+your state backend is responsive. For example, if you see that RocksDB state 
accesses start to take
+milliseconds, you probably need to look into your memory and I/O configuration.
+These metrics can be activated by setting the 
`state.backend.rocksdb.latency-track-enabled` option.
+The metrics are sampled, and their collection should have a marginal impact on 
the RocksDB state
+backend performance.
+
+## Switching State Backend with savepoints
+
+You can now change the state backend of a Flink application when resuming from 
a savepoint.
+That means the application's state is no longer locked into the state backend 
that was used when
+the application was initially started.
+
+This makes it possible, for example, to initially start with the HashMap State 
Backend (pure
+in-memory in JVM Heap) and later switch to the RocksDB State Backend, once the 
state grows
+too large.
+
+Under the hood, Flink now has a canonical savepoint format, which all state 
backends use when
+creating a data snapshot for a savepoint.
+
+## User-specified pod templates for Kubernetes deployments
+
+The [native Kubernetes deployment]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/)
+(where Flink actively talks to K8s to start and stop pods) now supports 
*custom pod templates*.
+
+With those templates, users can set up and configure the JobManager and TaskManager pods in a
+Kubernetes-y way, with flexibility beyond the configuration options that are directly built into
+Flink's Kubernetes integration.
+
+## Unaligned Checkpoints - production-ready
+
+Unaligned Checkpoints have matured to the point where we encourage all users 
to try them out,
+if they see issues with their application under backpressure.
+
+In particular, these changes make Unaligned Checkpoints easier to use:
+
+ - You can now rescale applications from unaligned checkpoints. This comes in 
handy if your
+   application needs to be scaled from a retained checkpoint because you 
cannot (afford to) create
+   a savepoint.
+
+ - Enabling unaligned checkpoints is cheaper for applications that are not 
back-pressured.
+   Unaligned checkpoints can now trigger adaptively with a timeout, meaning a 
checkpoint starts
+   as an aligned checkpoint (not storing any in-flight events) and falls back 
to an unaligned
+   checkpoint (storing some in-flight events), if the alignment phase takes 
longer than a certain
+   time.
+
+Find out more about how to enable unaligned checkpoints in the [Checkpointing 
Documentation]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/docs/ops/state/checkpoints/#unaligned-checkpoints).
+
+## Machine Learning Library moving to a separate repository
+
+To accelerate the development of Flink's Machine Learning efforts (streaming, 
batch, and
+unified machine learning), the effort has moved to the new repository 
[flink-ml](https://github.com/apache/flink-ml)
+under the Flink project. Here we follow an approach similar to the *Stateful Functions* effort,
+where a separate repository has helped to speed up the development by allowing for more light-weight
+contribution workflows and separate release cycles.
+
+Stay tuned for more updates in the Machine Learning efforts, like the 
interplay with
+[ALink](https://github.com/alibaba/Alink) (suite of many common Machine 
Learning Algorithms on Flink)
+or the [Flink & TensorFlow 
integration](https://github.com/alibaba/flink-ai-extended).
+
+
+# Notable SQL & Table API improvements
+
+Like in previous releases, SQL and the Table API remain an area of big 
developments.
+
+## Windows via Table-valued functions
+
+Defining time windows is one of the most frequent operations in streaming SQL 
queries.
+Flink 1.13 introduces a new way to define windows: via *Table-valued 
Functions*.
+This approach is both more expressive (lets you define new types of windows) 
and fully
+in line with the SQL standard.
+
+Flink 1.13 supports *TUMBLE* and *HOP* windows in the new syntax; *SESSION* windows will
+follow in a subsequent release. To demonstrate the increased expressiveness, consider the two examples
+below.
+
+The first is a new *CUMULATE* window function, which assigns windows with an expanding step size until the maximum
+window size is reached:
+
+```sql
+SELECT window_time, window_start, window_end, SUM(price) AS total_price 
+  FROM TABLE(CUMULATE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '2' MINUTES, 
INTERVAL '10' MINUTES))
+GROUP BY window_start, window_end, window_time;
+```
+
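To see which windows a single row lands in under the query above, the assignment can be sketched in plain Python. This is a simplified model rather than Flink's window assigner: it assumes integer timestamps in seconds, windows aligned to multiples of the maximum size, and a maximum size that is a multiple of the step. The function name is hypothetical.

```python
from typing import List, Tuple

MINUTE = 60


def cumulate_windows(ts: int, step: int, max_size: int) -> List[Tuple[int, int]]:
    """Return the (window_start, window_end) pairs that contain timestamp ts."""
    if max_size % step != 0:
        raise ValueError("max_size must be a multiple of step")
    # All cumulating windows of one cycle share the same start.
    window_start = ts - ts % max_size
    windows = []
    end = window_start + step
    while end <= window_start + max_size:
        if end > ts:  # the window [window_start, end) covers ts
            windows.append((window_start, end))
        end += step
    return windows


# A bid at minute 7 with a 2-minute step and a 10-minute maximum size
print(cumulate_windows(7 * MINUTE, 2 * MINUTE, 10 * MINUTE))
# [(0, 480), (0, 600)]
```

The bid at minute 7 falls into the windows ending at minute 8 and minute 10, so it contributes to both running totals of that 10-minute cycle.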
+You can reference the window start and window end time of the table-valued 
window functions,
+making new types of constructs possible. Beyond regular windowed aggregations 
and windowed joins,
+you can, for example, now express windowed Top-K aggregations:
+
+```sql
+SELECT window_time, ...
+  FROM (
+    SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER 
BY total_price DESC) 
+      as rank 
+    FROM t
+  ) WHERE rank <= 100;
+```
+
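The `ROW_NUMBER() OVER (PARTITION BY window_start, window_end ...)` pattern above amounts to ranking rows inside each window and keeping the first few. A rough plain-Python equivalent, using invented field values and a hypothetical function name, looks like this:

```python
from collections import defaultdict
from typing import Dict, List


def top_k_per_window(rows: List[Dict], k: int) -> List[Dict]:
    """Keep the k rows with the highest total_price in each window."""
    by_window = defaultdict(list)
    for row in rows:
        by_window[(row["window_start"], row["window_end"])].append(row)

    result = []
    for window in sorted(by_window):
        ranked = sorted(
            by_window[window], key=lambda r: r["total_price"], reverse=True
        )
        result.extend(ranked[:k])  # equivalent to WHERE rank <= k
    return result


# Made-up pre-aggregated rows, one per (window, supplier)
rows = [
    {"window_start": 0, "window_end": 600, "supplier": "a", "total_price": 30},
    {"window_start": 0, "window_end": 600, "supplier": "b", "total_price": 50},
    {"window_start": 0, "window_end": 600, "supplier": "c", "total_price": 10},
    {"window_start": 600, "window_end": 1200, "supplier": "a", "total_price": 5},
]

for row in top_k_per_window(rows, k=2):
    print(row["window_start"], row["supplier"], row["total_price"])
```

Because the ranking is scoped to a window, each window's result can be finalized once that window closes.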
+## Improved interoperability between DataStream API and Table API/SQL 
+
+This release radically simplifies mixing DataStream API and Table API programs.
+
+The Table API is a great way to develop applications, with its declarative 
nature and its
+many built-in functions. But sometimes, you need to *escape* to the DataStream 
API for its
+expressiveness, flexibility, and explicit control over the state.
+
+The new methods `StreamTableEnvironment.toDataStream()/.fromDataStream()` can model
+a `DataStream` from the DataStream API as a table source or sink. Types are automatically
+converted, and event time and watermarks carry across. In addition, the `Row` class (representing
+row events from the Table API) has received a major overhaul (improving the behavior of
+`toString()`/`hashCode()`/`equals()` methods) and now supports accessing fields by name, with
+support for sparse representations.
+
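+```java
+// Interpret a DataStream as a table; rowtime and watermarks are taken from the stream
+Table table = tableEnv.fromDataStream(
+        dataStream,
+        Schema.newBuilder()
+                .columnByMetadata("rowtime", "TIMESTAMP(3)")
+                .watermark("rowtime", "SOURCE_WATERMARK()")
+                .build());
+
+// Convert the table back into a stream of rows and continue in the DataStream API
+DataStream<Row> resultStream = tableEnv.toDataStream(table);
+resultStream
+        .keyBy(r -> r.getField("user"))
+        .window(...);
+```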
+
+## SQL Client: Init scripts and Statement Sets
+
+The SQL Client is a convenient way to run and deploy SQL streaming and batch jobs directly,
+without writing any code, from the command line or as part of a CI/CD workflow.
+
+This release vastly improves the functionality of the SQL client. Almost all operations that
+are available to Java applications (when programmatically launching queries from the
+`TableEnvironment`) are now supported in the SQL Client and as SQL scripts.
+That means SQL users need much less glue code for their SQL deployments.
+
+**Easier Configuration and Code Sharing**
+
+The support of YAML files to configure the SQL Client will be discontinued. 
Instead, the client
+accepts one or more *initialization scripts* to configure a session before the 
main SQL script
+gets executed.
+
+These init scripts would typically be shared across teams/deployments and 
could be used for
+loading common catalogs, applying common configuration settings, or defining 
standard views. 
+
+```
+./sql-client.sh -i init1.sql init2.sql -f sqljob.sql
+```
+
+**More config options**
+
+A greater set of recognized config options and improved `SET`/`RESET` commands 
make it easier to
+define and control the execution from within the SQL client and SQL scripts.
+
+**Multi-query Support with Statement Sets**
+
+Multi-query execution lets you execute multiple SQL queries (or statements) as 
a single Flink job.
+This is particularly useful for streaming SQL queries that run indefinitely.
+
+*Statement Sets* are the mechanism to group the queries that should be executed together.
+
+The following is an example of a SQL script that can be run via the SQL 
client. It sets up and
+configures the environment and executes multiple queries. The script captures 
end-to-end the
+queries and all environment setup and configuration work, making it a 
self-contained deployment
+artifact.
+
+```sql
+-- set up a catalog
+CREATE CATALOG hive_catalog WITH ('type' = 'hive');
+USE CATALOG hive_catalog;
+
+-- or use temporary objects
+CREATE TEMPORARY TABLE clicks (
+  user_id BIGINT,
+  page_id BIGINT,
+  viewtime TIMESTAMP
+) WITH (
+  'connector' = 'kafka',
+  'topic' = 'clicks',
+  'properties.bootstrap.servers' = '...',
+  'format' = 'avro'
+);
+
+-- set the execution mode for jobs
+SET execution.runtime-mode=streaming;
+
+-- set the sync/async mode for INSERT INTOs
+SET table.dml-sync=false;
+
+-- set the job's parallelism
+SET parallelism.default=10;
+
+-- set the job name
+SET pipeline.name = my_flink_job;
+
+-- restore state from the specific savepoint path
+SET execution.savepoint.path=/tmp/flink-savepoints/savepoint-bb0dab;
+
+BEGIN STATEMENT SET;
+
+INSERT INTO pageview_pv_sink
+SELECT page_id, count(1) FROM clicks GROUP BY page_id;
+
+INSERT INTO pageview_uv_sink
+SELECT page_id, count(distinct user_id) FROM clicks GROUP BY page_id;
+
+END;
+```
+
+## Hive query syntax compatibility
+
+You can now write SQL queries against Flink using the Hive SQL syntax.
+In addition to Hive's DDL dialect, Flink now also accepts the commonly-used 
Hive DML and DQL
+dialects.
+
+To use the Hive SQL dialect, set `table.sql-dialect` to `hive` and load the 
`HiveModule`.
+The latter is important because Hive's built-in functions are required for 
proper syntax and
+semantics compatibility. The following example illustrates that:
+
+```sql
+CREATE CATALOG myhive WITH ('type' = 'hive'); -- setup HiveCatalog
+USE CATALOG myhive;
+LOAD MODULE hive; -- setup HiveModule
+USE MODULES hive,core;
+SET table.sql-dialect = hive; -- enable Hive dialect
+SELECT key, value FROM src CLUSTER BY key; -- run some Hive queries
+```
+
+Please note that the Hive dialect no longer supports Flink's SQL syntax for 
DML and DQL statements.
+Switch back to the `default` dialect for Flink's syntax.
+
+## Improved behavior of SQL time functions
+
+Working with time is a crucial element of any data processing. At the same time, handling different
+time zones, dates, and times is an [incredibly delicate task](https://xkcd.com/1883/) when working with data.
+
+In Flink 1.13, we put much effort into simplifying the usage of time-related functions. We adjusted (made
+more specific) the return types of functions such as `PROCTIME()`, `CURRENT_TIMESTAMP`, and `NOW()`.
+
+Moreover, you can now also define an event time attribute on a *TIMESTAMP_LTZ* 
column to gracefully
+do window processing with the support of Daylight Saving Time.
+
+Please see the release notes for a complete list of changes.
+
+---
+
+# Notable PyFlink improvements
+
+The general theme of this release in PyFlink is to bring the Python DataStream 
API and Table API
+closer to feature parity with the Java/Scala APIs.
+
+### Stateful operations in the Python DataStream API 
+
+With Flink 1.13, Python programmers now also get to enjoy the full potential 
of Apache Flink's
+stateful stream processing APIs. The rearchitected Python DataStream API, 
introduced in Flink 1.12,
+now has full stateful capabilities, allowing users to remember information 
from events in the state
+and act on it later.
+
+That stateful processing capability is the basis of many of the more 
sophisticated processing
+operations, which need to remember information across individual events (for 
example, Windowing
+Operations).
+
+This example shows a custom counting window implementation, using state:
+
+```python
+from pyflink.common.typeinfo import Types
+from pyflink.datastream import FlatMapFunction, RuntimeContext
+from pyflink.datastream.state import ValueStateDescriptor
+
+
+class CountWindowAverage(FlatMapFunction):
+    def __init__(self, window_size):
+        self.window_size = window_size
+
+    def open(self, runtime_context: RuntimeContext):
+        # keyed state holding a (count, running sum) pair for the current key
+        descriptor = ValueStateDescriptor(
+            "average", Types.TUPLE([Types.LONG(), Types.LONG()]))
+        self.sum = runtime_context.get_state(descriptor)
+
+    def flat_map(self, value):
+        current_sum = self.sum.value()
+        if current_sum is None:
+            current_sum = (0, 0)
+        # update the count and the running sum
+        current_sum = (current_sum[0] + 1, current_sum[1] + value[1])
+        # if the count reaches window_size, emit the average and clear the state
+        if current_sum[0] >= self.window_size:
+            self.sum.clear()
+            yield value[0], current_sum[1] // current_sum[0]
+        else:
+            self.sum.update(current_sum)
+
+ds = ...  # type: DataStream
+ds.key_by(lambda row: row[0]) \
+  .flat_map(CountWindowAverage(5))
+```
+
+### User-defined Windows in the PyFlink DataStream API
+
+Flink 1.13 adds support for user-defined windows to the PyFlink DataStream 
API. Programs can now use
+windows beyond the standard window definitions.
+
+Because windows are at the heart of all programs that process unbounded 
streams (by splitting the
+stream into "buckets" of bounded size), this greatly increases the 
expressiveness of the API.
+
+### Row-based operation in the PyFlink Table API 
+
+The Python Table API now supports row-based operations, i.e., custom 
transformation functions on rows.
+These functions are an easy way to apply data transformations on tables beyond 
the built-in functions.
+
+This is an example of using a `map()` operation in Python Table API:
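+```python
+from pyflink.common import Row
+from pyflink.table import DataTypes
+from pyflink.table.udf import udf
+
+# row-based UDF: takes a whole row and returns a new row
+@udf(result_type=DataTypes.ROW(
+  [DataTypes.FIELD("c1", DataTypes.BIGINT()),
+   DataTypes.FIELD("c2", DataTypes.STRING())]))
+def increment_column(r: Row) -> Row:
+  return Row(r[0] + 1, r[1])
+
+table = ...  # type: Table
+mapped_result = table.map(increment_column)
+```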
+
+In addition to `map()`, the API also supports `flat_map()`, `aggregate()`, 
`flat_aggregate()`,
+and other row-based operations. This brings the Python Table API a big step 
closer to feature
+parity with the Java Table API.
+
+### Batch execution mode for PyFlink DataStream programs
+
+The PyFlink DataStream API now also supports the batch execution mode for 
bounded streams,
+which was introduced for the Java DataStream API in Flink 1.12.
+
+The batch execution mode simplifies operations and improves the performance of 
programs on bounded streams,
+by exploiting the bounded stream nature to bypass state backends and 
checkpoints.
+
+# Other improvements
+
+**Flink Documentation via Hugo**
+
+The Flink Documentation has been migrated from Jekyll to Hugo. If you find 
something missing, please let us know.
+We are also curious to hear if you like the new look & feel.
+
+**Exception histories in the Web UI**
+
+The Flink Web UI now presents up to the last *n* exceptions that caused a job to fail.
+That helps to debug scenarios where a root failure caused subsequent failures. The root failure
+cause can be found in the exception history.
+
+**Better exception / failure-cause reporting for unsuccessful checkpoints**
+
+Flink now provides statistics for checkpoints that failed or were aborted to 
make it easier
+to determine the failure cause without having to analyze the logs.
+
+Prior versions of Flink reported metrics (e.g., size of persisted data, trigger time)
+only when a checkpoint succeeded.
+
+**Exactly-once JDBC sink**
+
+As of Flink 1.13, the JDBC sink can guarantee exactly-once delivery of results for XA-compliant databases
+by transactionally committing results on checkpoints. The target database must have (or be linked
+to) an XA Transaction Manager.
+
+The connector currently exists only for the *DataStream API*, and can be created through the
+`JdbcSink.exactlyOnceSink(...)` method (or by instantiating the `JdbcXaSinkFunction` directly).
+
+**PyFlink Table API supports User-Defined Aggregate Functions in Group 
Windows**
+
+Group Windows in PyFlink's Table API now support both general Python User-defined Aggregate
+Functions (UDAFs) and Pandas UDAFs. Such functions are critical to many analysis and ML training
+programs.
+
+Flink 1.13 improves upon previous releases, where these functions were only 
supported
+in unbounded Group-by aggregations.
+
+**Improved Sort-Merge Shuffle for Batch Execution**
+
+Flink 1.13 improves the memory stability and performance of the *sort-merge 
blocking shuffle*
+for batch-executed programs, initially introduced in Flink 1.12 via 
[FLIP-148](https://cwiki.apache.org/confluence/display/FLINK/FLIP-148%3A+Introduce+Sort-Merge+Based+Blocking+Shuffle+to+Flink).
 
+
+Programs with higher parallelism (1000s) should no longer frequently trigger 
*OutOfMemoryError: Direct Memory*.
+The performance (especially on spinning disks) is improved through better I/O 
scheduling
+and broadcast optimizations.
+
+**HBase connector supports async lookup and lookup cache**
+
+The HBase Lookup Table Source now supports an *async lookup mode* and a lookup 
cache.
+This greatly benefits the performance of Table/SQL jobs with lookup joins 
against HBase, while
+reducing the I/O requests to HBase in the typical case.
+
+In prior versions, the HBase Lookup Source only communicated synchronously, 
resulting in lower
+pipeline utilization and throughput.
+
+# Changes to consider when upgrading to Flink 1.13
+
+* [FLINK-21709](https://issues.apache.org/jira/browse/FLINK-21709) - The old 
planner of the Table &
+  SQL API has been deprecated in Flink 1.13 and will be dropped in Flink 1.14.
+  The *Blink* engine has been the default planner for some releases now and 
will be the only one going forward.
+  That means that both the `BatchTableEnvironment` and SQL/DataSet 
interoperability are reaching
+  the end of life. Please use the unified `TableEnvironment` for batch and 
stream processing going forward.
+* [FLINK-22352](https://issues.apache.org/jira/browse/FLINK-22352) - The community decided to deprecate
+  the Apache Mesos support for Apache Flink. It is subject to removal in the future. Users are
+  encouraged to switch to a different resource manager.
+* [FLINK-21935](https://issues.apache.org/jira/browse/FLINK-21935) - The 
`state.backend.async`
+  option is deprecated. Snapshots are always asynchronous now (as they were by 
default before) and
+  there is no option to configure a synchronous snapshot anymore.
+* [FLINK-17012](https://issues.apache.org/jira/browse/FLINK-17012) - The 
tasks' `RUNNING` state was split
+  into two states: `INITIALIZING` and `RUNNING`. A task is `INITIALIZING` 
while it loads the checkpointed state,
+  and, in the case of unaligned checkpoints, until the checkpointed in-flight 
data has been recovered.
+  This lets monitoring systems better determine when the tasks are really back 
to doing work by making
+  the phase for state restoring explicit.
+* [FLINK-21698](https://issues.apache.org/jira/browse/FLINK-21698) - The 
*CAST* operation between the
+  NUMERIC type and the TIMESTAMP type is problematic and therefore no longer 
supported: Statements like 
+  `CAST(numeric AS TIMESTAMP(3))` will now fail. Please use 
`TO_TIMESTAMP(FROM_UNIXTIME(numeric))` instead.
+* [FLINK-22133](https://issues.apache.org/jira/browse/FLINK-22133) - The unified source API for connectors
+  has a minor breaking change: The `SplitEnumerator.snapshotState()` method was adjusted to accept the
+  *Checkpoint ID* of the checkpoint for which the snapshot is created.
+
+# Resources
+
+The binary distribution and source artifacts are now available on the updated 
[Downloads page]({{ site.baseurl }}/downloads.html)
+of the Flink website, and the most recent distribution of PyFlink is available 
on [PyPI](https://pypi.org/project/apache-flink/).
+
+Please review the [release notes]({{ site.DOCS_BASE_URL 
}}flink-docs-release-1.13/release-notes/flink-1.13.html)
+carefully if you plan to upgrade your setup to Flink 1.13. This version is 
API-compatible with
+previous 1.x releases for APIs annotated with the `@Public` annotation.
+
+You can also check the complete [release 
changelog](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12349287)
 
+and [updated documentation]({{ site.DOCS_BASE_URL }}flink-docs-release-1.13/) 
for a detailed list of changes and new features.
+
+# List of Contributors
+
+The Apache Flink community would like to thank each one of the contributors who have
+made this release possible:
+
+acqua.csq, AkisAya, Alexander Fedulov, Aljoscha Krettek, Ammar Al-Batool, 
Andrey Zagrebin, anlen321,
+Anton Kalashnikov, appleyuchi, Arvid Heise, Austin Cawley-Edwards, austin ce, 
azagrebin, blublinsky,
+Brian Zhou, bytesmithing, caozhen1937, chen qin, Chesnay Schepler, Congxian 
Qiu, Cristian,
+cxiiiiiii, Danny Chan, Danny Cranmer, David Anderson, Dawid Wysakowicz, 
dbgp2021, Dian Fu,
+DinoZhang, dixingxing, Dong Lin, Dylan Forciea, est08zw, Etienne Chauchot, 
fanrui03, Flora Tao,
+FLRNKS, fornaix, fuyli, George, Giacomo Gamba, GitHub, godfrey he, GuoWei Ma, 
Gyula Fora,
+hackergin, hameizi, Haoyuan Ge, Harshvardhan Chauhan, Haseeb Asif, hehuiyuan, 
huangxiao, HuangXiao,
+huangxingbo, HuangXingBo, humengyu2012, huzekang, Hwanju Kim, Ingo Bürk, I. 
Raleigh, Ivan, iyupeng,
+Jack, Jane, Jark Wu, Jerry Wang, Jiangjie (Becket) Qin, JiangXin, Jiayi Liao, 
JieFang.He, Jie Wang,
+jinfeng, Jingsong Lee, JingsongLi, Jing Zhang, Joao Boto, JohnTeslaa, Jun Qin, 
kanata163, kevin.cyj,
+KevinyhZou, Kezhu Wang, klion26, Kostas Kloudas, kougazhang, Kurt Young, 
laughing, legendtkl,
+leiqiang, Leonard Xu, liaojiayi, Lijie Wang, liming.1018, lincoln lee, 
lincoln-lil, liushouwei,
+liuyufei, LM Kang, lometheus, luyb, Lyn Zhang, Maciej Obuchowski, Maciek 
Próchniak, mans2singh,
+Marek Sabo, Matthias Pohl, meijie, Mika Naylor, Miklos Gergely, Mohit Paliwal, 
Moritz Manner,
+morsapaes, Mulan, Nico Kruber, openopen2, paul8263, Paul Lam, Peidian li, 
pengkangjing, Peter Huang,
+Piotr Nowojski, Qinghui Xu, Qingsheng Ren, Raghav Kumar Gautam, Rainie Li, 
Ricky Burnett, Rion
+Williams, Robert Metzger, Roc Marshal, Roman, Roman Khachatryan, Ruguo,
+Ruguo Yu, Rui Li, Sebastian Liu, Seth Wiesman, sharkdtu, sharkdtu(涂小刚), 
Shengkai, shizhengchao,
+shouweikun, Shuo Cheng, simenliuxing, SteNicholas, Stephan Ewen, Suo Lu, 
sv3ndk, Svend Vanderveken,
+taox, Terry Wang, Thelgis Kotsos, Thesharing, Thomas Weise, Till Rohrmann, 
Timo Walther, Ting Sun,
+totoro, totorooo, TsReaper, Tzu-Li (Gordon) Tai, V1ncentzzZ, vthinkxie, 
wangfeifan, wangpeibin,
+wangyang0918, wangyemao-github, Wei Zhong, Wenlong Lyu, wineandcheeze, wjc, 
xiaoHoly, Xintong Song,
+xixingya, xmarker, Xue Wang, Yadong Xie, yangsanity, Yangze Guo, Yao Zhang, 
Yuan Mei, yulei0824, Yu
+Li, Yun Gao, Yun Tang, yuruguo, yushujun, Yuval Itzchakov, yuzhao.cyz, zck, 
zhangjunfan,
+zhangzhengqi3, zhao_wei_nan, zhaown, zhaoxing, Zhenghua Gao, Zhenqiu Huang, 
zhisheng, zhongqishang,
+zhushang, zhuxiaoshang, Zhu Zhu, zjuwangg, zoucao, zoudan, 左元, 星, 肖佳文, 龙三
+
diff --git a/content/img/blog/2021-05-03-release-1.13.0/7.png 
b/content/img/blog/2021-05-03-release-1.13.0/7.png
new file mode 100644
index 0000000..7801328
Binary files /dev/null and b/content/img/blog/2021-05-03-release-1.13.0/7.png 
differ
diff --git a/content/img/blog/2021-05-03-release-1.13.0/bottleneck.png 
b/content/img/blog/2021-05-03-release-1.13.0/bottleneck.png
new file mode 100644
index 0000000..7aa2c98
Binary files /dev/null and 
b/content/img/blog/2021-05-03-release-1.13.0/bottleneck.png differ
diff --git a/content/news/2021/05/03/release-1.13.0.html 
b/content/news/2021/05/03/release-1.13.0.html
new file mode 100644
index 0000000..d0ce601
--- /dev/null
+++ b/content/news/2021/05/03/release-1.13.0.html
@@ -0,0 +1,852 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <title>Apache Flink: Apache Flink 1.13.0 Release Announcement</title>
+    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
+    <link rel="icon" href="/favicon.ico" type="image/x-icon">
+
+    <!-- Bootstrap -->
+    <link rel="stylesheet" href="/css/bootstrap.min.css">
+    <link rel="stylesheet" href="/css/flink.css">
+    <link rel="stylesheet" href="/css/syntax.css">
+
+    <!-- Blog RSS feed -->
+    <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" 
title="Apache Flink Blog: RSS feed" />
+
+    <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+    <!-- We need to load Jquery in the header for custom google analytics 
event tracking-->
+    <script src="/js/jquery.min.js"></script>
+
+    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media 
queries -->
+    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+    <!--[if lt IE 9]>
+      <script 
src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
+      <script 
src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+    <![endif]-->
+  </head>
+  <body>  
+    
+
+    <!-- Main content. -->
+    <div class="container">
+    <div class="row">
+
+      
+     <div id="sidebar" class="col-sm-3">
+        
+
+<!-- Top navbar. -->
+    <nav class="navbar navbar-default">
+        <!-- The logo. -->
+        <div class="navbar-header">
+          <button type="button" class="navbar-toggle collapsed" 
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </button>
+          <div class="navbar-logo">
+            <a href="/">
+              <img alt="Apache Flink" src="/img/flink-header-logo.svg" 
width="147px" height="73px">
+            </a>
+          </div>
+        </div><!-- /.navbar-header -->
+
+        <!-- The navigation links. -->
+        <div class="collapse navbar-collapse" 
id="bs-example-navbar-collapse-1">
+          <ul class="nav navbar-nav navbar-main">
+
+            <!-- First menu section explains visitors what Flink is -->
+
+            <!-- What is Stream Processing? -->
+            <!--
+            <li><a href="/streamprocessing1.html">What is Stream 
Processing?</a></li>
+            -->
+
+            <!-- What is Flink? -->
+            <li><a href="/flink-architecture.html">What is Apache 
Flink?</a></li>
+
+            
+
+            <!-- What is Stateful Functions? -->
+
+            <li><a href="/stateful-functions.html">What is Stateful 
Functions?</a></li>
+
+            <!-- Use cases -->
+            <li><a href="/usecases.html">Use Cases</a></li>
+
+            <!-- Powered by -->
+            <li><a href="/poweredby.html">Powered By</a></li>
+
+
+            &nbsp;
+            <!-- Second menu section aims to support Flink users -->
+
+            <!-- Downloads -->
+            <li><a href="/downloads.html">Downloads</a></li>
+
+            <!-- Getting Started -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" 
href="#">Getting Started<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/try-flink/index.html"
 target="_blank">With Flink <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-3.0/getting-started/project-setup.html"
 target="_blank">With Flink Stateful Functions <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a href="/training.html">Training Course</a></li>
+              </ul>
+            </li>
+
+            <!-- Documentation -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" 
href="#">Documentation<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13" 
target="_blank">Flink 1.13 (Latest stable release) <small><span 
class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-master" 
target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-3.0" 
target="_blank">Flink Stateful Functions 3.0 (Latest stable release) 
<small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-master" 
target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span 
class="glyphicon glyphicon-new-window"></span></small></a></li>
+              </ul>
+            </li>
+
+            <!-- getting help -->
+            <li><a href="/gettinghelp.html">Getting Help</a></li>
+
+            <!-- Blog -->
+            <li class="active"><a href="/blog/"><b>Flink Blog</b></a></li>
+
+
+            <!-- Flink-packages -->
+            <li>
+              <a href="https://flink-packages.org" 
target="_blank">flink-packages.org <small><span class="glyphicon 
glyphicon-new-window"></span></small></a>
+            </li>
+            &nbsp;
+
+            <!-- Third menu section aim to support community and contributors 
-->
+
+            <!-- Community -->
+            <li><a href="/community.html">Community &amp; Project Info</a></li>
+
+            <!-- Roadmap -->
+            <li><a href="/roadmap.html">Roadmap</a></li>
+
+            <!-- Contribute -->
+            <li><a href="/contributing/how-to-contribute.html">How to 
Contribute</a></li>
+            
+
+            <!-- GitHub -->
+            <li>
+              <a href="https://github.com/apache/flink" target="_blank">Flink 
on GitHub <small><span class="glyphicon 
glyphicon-new-window"></span></small></a>
+            </li>
+
+            &nbsp;
+
+            <!-- Language Switcher -->
+            <li>
+              
+                
+                  <!-- link to the Chinese home page when current is blog page 
-->
+                  <a href="/zh">中文版</a>
+                
+              
+            </li>
+
+          </ul>
+
+          <style>
+            .smalllinks:link {
+              display: inline-block !important; background: none; padding-top: 
0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
+            }
+          </style>
+
+          <ul class="nav navbar-nav navbar-bottom">
+          <hr />
+
+            <!-- Twitter -->
+            <li><a href="https://twitter.com/apacheflink" 
target="_blank">@ApacheFlink <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <!-- Visualizer -->
+            <li class=" hidden-md hidden-sm"><a href="/visualizer/" 
target="_blank">Plan Visualizer <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <li >
+                  <a href="/security.html">Flink Security</a>
+            </li>
+
+          <hr />
+
+            <li><a href="https://apache.org" target="_blank">Apache Software 
Foundation <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <li>
+
+              <a class="smalllinks" href="https://www.apache.org/licenses/" 
target="_blank">License</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" href="https://www.apache.org/security/" 
target="_blank">Security</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" 
href="https://www.apache.org/foundation/sponsorship.html" 
target="_blank">Donate</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" 
href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a> 
<small><span class="glyphicon glyphicon-new-window"></span></small>
+            </li>
+
+          </ul>
+        </div><!-- /.navbar-collapse -->
+    </nav>
+
+      </div>
+      <div class="col-sm-9">
+      <div class="row-fluid">
+  <div class="col-sm-12">
+    <div class="row">
+      <h1>Apache Flink 1.13.0 Release Announcement</h1>
+      <p><i></i></p>
+
+      <article>
+        <p>03 May 2021 Stephan Ewen (<a 
href="https://twitter.com/StephanEwen">@StephanEwen</a>) &amp; Dawid Wysakowicz 
(<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
+
+<p>The Apache Flink community is excited to announce the release of Flink 
1.13.0! More than 200
+contributors worked on over 1,000 issues for this new version.</p>
+
+<p>The release brings us a big step forward in one of our major efforts: 
<strong>Making Stream Processing
+Applications as natural and as simple to manage as any other 
application.</strong> The new <em>reactive scaling</em>
+mode means that scaling streaming applications in and out now works like in 
any other application
+by just changing the number of parallel processes.</p>
+
+<p>The release also prominently features a <strong>series of improvements that 
help users better understand the performance of
+applications.</strong> When the streams don’t flow as fast as you’d hope, 
these can help you to understand
+why: Load and <em>backpressure visualization</em> to identify bottlenecks, 
<em>CPU flame graphs</em> to identify hot
+code paths in your application, and <em>State Access Latencies</em> to see how 
the State Backends are keeping
+up.</p>
+
+<p>Beyond those features, the Flink community has added a ton of improvements 
all over the system,
+some of which we discuss in this article. We hope you enjoy the new release 
and features.
+Towards the end of the article, we describe changes to be aware of when 
upgrading
+from earlier versions of Apache Flink.</p>
+
+<div class="page-toc">
+<ul id="markdown-toc">
+  <li><a href="#notable-features" id="markdown-toc-notable-features">Notable 
features</a>    <ul>
+      <li><a href="#reactive-scaling" 
id="markdown-toc-reactive-scaling">Reactive scaling</a></li>
+      <li><a href="#analyzing-application-performance" 
id="markdown-toc-analyzing-application-performance">Analyzing application 
performance</a></li>
+      <li><a href="#switching-state-backend-with-savepoints" 
id="markdown-toc-switching-state-backend-with-savepoints">Switching State 
Backend with savepoints</a></li>
+      <li><a href="#user-specified-pod-templates-for-kubernetes-deployments" 
id="markdown-toc-user-specified-pod-templates-for-kubernetes-deployments">User-specified
 pod templates for Kubernetes deployments</a></li>
+      <li><a href="#unaligned-checkpoints---production-ready" 
id="markdown-toc-unaligned-checkpoints---production-ready">Unaligned 
Checkpoints - production-ready</a></li>
+      <li><a href="#machine-learning-library-moving-to-a-separate-repository" 
id="markdown-toc-machine-learning-library-moving-to-a-separate-repository">Machine
 Learning Library moving to a separate repository</a></li>
+    </ul>
+  </li>
+  <li><a href="#notable-sql--table-api-improvements" 
id="markdown-toc-notable-sql--table-api-improvements">Notable SQL &amp; Table 
API improvements</a>    <ul>
+      <li><a href="#windows-via-table-valued-functions" 
id="markdown-toc-windows-via-table-valued-functions">Windows via Table-valued 
functions</a></li>
+      <li><a 
href="#improved-interoperability-between-datastream-api-and-table-apisql" 
id="markdown-toc-improved-interoperability-between-datastream-api-and-table-apisql">Improved
 interoperability between DataStream API and Table API/SQL</a></li>
+      <li><a href="#sql-client-init-scripts-and-statement-sets" 
id="markdown-toc-sql-client-init-scripts-and-statement-sets">SQL Client: Init 
scripts and Statement Sets</a></li>
+      <li><a href="#hive-query-syntax-compatibility" 
id="markdown-toc-hive-query-syntax-compatibility">Hive query syntax 
compatibility</a></li>
+      <li><a href="#improved-behavior-of-sql-time-functions" 
id="markdown-toc-improved-behavior-of-sql-time-functions">Improved behavior of 
SQL time functions</a></li>
+    </ul>
+  </li>
+  <li><a href="#notable-pyflink-improvements" 
id="markdown-toc-notable-pyflink-improvements">Notable PyFlink improvements</a> 
   <ul>
+      <li><a href="#stateful-operations-in-the-python-datastream-api" 
id="markdown-toc-stateful-operations-in-the-python-datastream-api">Stateful 
operations in the Python DataStream API</a></li>
+      <li><a href="#user-defined-windows-in-the-pyflink-datastream-api" 
id="markdown-toc-user-defined-windows-in-the-pyflink-datastream-api">User-defined
 Windows in the PyFlink DataStream API</a></li>
+      <li><a href="#row-based-operation-in-the-pyflink-table-api" 
id="markdown-toc-row-based-operation-in-the-pyflink-table-api">Row-based 
operation in the PyFlink Table API</a></li>
+      <li><a href="#batch-execution-mode-for-pyflink-datastream-programs" 
id="markdown-toc-batch-execution-mode-for-pyflink-datastream-programs">Batch 
execution mode for PyFlink DataStream programs</a></li>
+    </ul>
+  </li>
+  <li><a href="#other-improvements" id="markdown-toc-other-improvements">Other 
improvements</a></li>
+  <li><a href="#change-to-consider-when-upgrading-to-flink-113" 
id="markdown-toc-change-to-consider-when-upgrading-to-flink-113">Change to 
consider when upgrading to Flink 1.13</a></li>
+  <li><a href="#resources" id="markdown-toc-resources">Resources</a></li>
+  <li><a href="#list-of-contributors" 
id="markdown-toc-list-of-contributors">List of Contributors</a></li>
+</ul>
+
+</div>
+
+<p>We encourage you to <a 
href="https://flink.apache.org/downloads.html">download the release</a> and 
share your
+feedback with the community through
+the <a href="https://flink.apache.org/community.html#mailing-lists">Flink 
mailing lists</a>
+or <a 
href="https://issues.apache.org/jira/projects/FLINK/summary">JIRA</a>.</p>
+
+<hr />
+
+<h1 id="notable-features">Notable features</h1>
+
+<h2 id="reactive-scaling">Reactive scaling</h2>
+
+<p>Reactive scaling is the latest piece in Flink’s initiative to make Stream 
Processing
+Applications as natural and as simple to manage as any other application.</p>
+
+<p>Flink has a dual nature when it comes to resource management and 
deployments: You can deploy
+Flink applications onto resource orchestrators like Kubernetes or Yarn in such 
a way that Flink actively manages
+the resources and allocates and releases workers as needed. That is especially 
useful for jobs and
+applications that rapidly change their required resources, like batch 
applications and ad-hoc SQL
+queries. The application parallelism rules, the number of workers follows. In 
the context of Flink
+applications, we call this <em>active scaling</em>.</p>
+
+<p>For long-running streaming applications, it is often a nicer model to just 
deploy them like any
+other long-running application: The application doesn’t really need to know 
that it runs on K8s,
+EKS, Yarn, etc. and doesn’t try to acquire a specific number of workers; 
instead, it just uses the
+number of workers that are given to it. The number of workers rules, the 
application parallelism
+adjusts to that. In the context of Flink, we call that <em>reactive 
scaling</em>.</p>
+
+<p>The <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/concepts/flink-architecture/#flink-application-execution">Application
 Deployment Mode</a>
+started this effort, making deployments more application-like (by avoiding two 
separate deployment
+steps to (1) start a cluster and (2) submit an application). The reactive 
scaling mode completes this,
+and you no longer need extra tools (scripts or a K8s operator) to keep the 
number
+of workers and the application parallelism settings in sync.</p>
+
+<p>You can now put an auto-scaler around Flink applications like around other 
typical applications — as
+long as you are mindful about the cost of rescaling when configuring the 
autoscaler: Stateful
+streaming applications must move state around when scaling.</p>
+
+<p>To try the reactive-scaling mode, add the <code>scheduler-mode: 
reactive</code> config entry and deploy
+an application cluster (<a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/overview/#application-mode">standalone</a>
 or <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#deploy-application-cluster">Kubernetes</a>).
 Check out <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/elastic_scaling/#reactive-mode">the
 r [...]
+
+<h2 id="analyzing-application-performance">Analyzing application 
performance</h2>
+
+<p>As with any application, analyzing and understanding the performance of a 
Flink application
+is critical. It is often even more critical, because Flink applications are 
typically data-intensive
+(processing high volumes of data) and are at the same time expected to provide 
results within
+(near-) real-time latencies.</p>
+
+<p>When an application doesn’t keep up with the data rate anymore, or an 
application takes more
+resources than you’d expect, these new tools can help you track down 
the causes:</p>
+
+<p><strong>Bottleneck detection, Back Pressure monitoring</strong></p>
+
+<p>The first question during performance analysis is often: Which operation is 
the bottleneck?</p>
+
+<p>To help answer that, Flink exposes metrics about the degree to which tasks 
are <em>busy</em> (doing work)
+and <em>back-pressured</em> (have the capacity to do work but cannot because 
their successor operators
+cannot accept more results). Candidates for bottlenecks are the busy operators 
whose predecessors
+are back-pressured.</p>
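+
+<p>As a rough illustration of that rule of thumb, the plain-Java sketch below 
flags tasks that are busy while at least one predecessor is back-pressured. The 
<code>TaskStats</code> class, the thresholds, and the sample numbers are 
hypothetical and are not part of Flink's metrics API; the ratios are assumed to 
have been read from the metrics beforehand.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BottleneckSketch {

    /** Hypothetical snapshot of one task's metrics; ratios are in [0.0, 1.0]. */
    static final class TaskStats {
        final String name;
        final double busyRatio;
        final double backPressuredRatio;
        final List<TaskStats> predecessors = new ArrayList<>();

        TaskStats(String name, double busyRatio, double backPressuredRatio) {
            this.name = name;
            this.busyRatio = busyRatio;
            this.backPressuredRatio = backPressuredRatio;
        }
    }

    /** Returns the names of busy tasks that have at least one back-pressured predecessor. */
    static List<String> bottleneckCandidates(
            List<TaskStats> tasks, double busyThreshold, double backPressureThreshold) {
        List<String> candidates = new ArrayList<>();
        for (TaskStats task : tasks) {
            if (task.busyRatio < busyThreshold) {
                continue; // not busy enough to be the bottleneck
            }
            for (TaskStats predecessor : task.predecessors) {
                if (predecessor.backPressuredRatio >= backPressureThreshold) {
                    candidates.add(task.name);
                    break; // one back-pressured predecessor is enough
                }
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: source -> map -> sink.
        TaskStats source = new TaskStats("source", 0.10, 0.85);
        TaskStats map = new TaskStats("map", 0.95, 0.00);
        TaskStats sink = new TaskStats("sink", 0.20, 0.00);
        map.predecessors.add(source);
        sink.predecessors.add(map);

        // prints [map]
        System.out.println(bottleneckCandidates(Arrays.asList(source, map, sink), 0.8, 0.5));
    }
}
```

+<p>In this made-up pipeline the source is mostly back-pressured and the map 
task is almost fully busy, so only the map task is reported as a candidate.</p>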
+
+<p>Flink 1.13 brings an improved back pressure metric system (using task 
mailbox timings rather than
+thread stack sampling), and a reworked graphical representation of the job’s 
dataflow with color-coding
+and ratios for busyness and backpressure.</p>
+
+<figure style="align-content: center">
+  <img src="/img/blog/2021-05-03-release-1.13.0/bottleneck.png" style="width: 
900px" />
+</figure>
+
+<p><strong>CPU flame graphs in Web UI</strong></p>
+
+<p>The next question during performance analysis is typically: What part of 
work in the bottlenecked
+operator is expensive?</p>
+
+<p>One visually effective means to investigate that is <em>Flame Graphs</em>. 
They help answer questions like:</p>
+
+<ul>
+  <li>Which methods are currently consuming CPU resources?</li>
+  <li>How does one method’s CPU consumption compare to other methods?</li>
+  <li>Which series of calls on the stack led to executing a particular 
method?</li>
+</ul>
+
+<p>Flame Graphs are constructed by repeatedly sampling the thread stack 
traces. Every method call is
+represented by a bar, where the length of the bar is proportional to the 
number of times it is present
+in the samples. When enabled, the graphs are shown in a new UI component for 
the selected operator.</p>
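+
+<p>The aggregation step behind such a graph can be sketched in a few lines of 
plain Java. This is a simplified illustration, not Flink's implementation: it 
assumes the stack traces have already been sampled and are given root-first, 
and the frame names are invented. Each call path is counted once per sample 
that contains it, and the count is what determines the bar length.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FlameGraphSketch {

    /**
     * Counts, for every call path (root;...;frame), how many sampled stacks contain it.
     * Each stack is expected to be ordered from the root frame to the innermost frame.
     */
    static Map<String, Integer> countCallPaths(List<List<String>> samples) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> stack : samples) {
            StringBuilder path = new StringBuilder();
            for (String frame : stack) {
                if (path.length() > 0) {
                    path.append(';');
                }
                path.append(frame);
                counts.merge(path.toString(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Four hypothetical stack samples of one operator thread.
        List<List<String>> samples = Arrays.asList(
                Arrays.asList("run", "processElement", "serialize"),
                Arrays.asList("run", "processElement", "serialize"),
                Arrays.asList("run", "processElement", "userFunction"),
                Arrays.asList("run", "emitWatermark"));

        // The bar for a call path is as wide as its share of the samples.
        countCallPaths(samples).forEach((path, count) ->
                System.out.printf("%-35s %d of %d samples%n", path, count, samples.size()));
    }
}
```

+<p>Keying the counts by the full call path rather than by the method name 
keeps the same method apart when it is reached through different callers, which 
is what lets a flame graph answer the "which series of calls" question.</p>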
+
+<figure style="align-content: center">
+  <img src="/img/blog/2021-05-03-release-1.13.0/7.png" style="display: block; 
margin-left: auto; margin-right: auto; width: 600px" />
+</figure>
+
+<p>Flame graphs are expensive to create: They may cause processing overhead 
and can put a heavy load
+on Flink’s metric system. Because of that, users need to explicitly enable 
them in the configuration.</p>
+
+<p><strong>Access Latency Metrics for State</strong></p>
+
+<p>Another possible performance bottleneck can be the state backend, 
especially when your state is larger
+than the main memory available to Flink and you are using the <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/state_backends/#the-embeddedrocksdbstatebackend">RocksDB
 state backend</a>.</p>
+
+<p>That’s not saying RocksDB is slow (we love RocksDB!), but it has some 
requirements to achieve
+good performance. For example, it is easy to accidentally <a 
href="https://www.ververica.com/blog/the-impact-of-disks-on-rocksdb-state-backend-in-flink-a-case-study">starve
 RocksDB’s demand for IOPs on cloud setups with
+the wrong type of disk resources</a>.</p>
+
+<p>On top of the CPU flame graphs, the new <em>state backend latency 
metrics</em> can help you understand whether
+your state backend is responsive. For example, if you see that RocksDB state 
accesses start to take
+milliseconds, you probably need to look into your memory and I/O configuration.
+These metrics can be activated by setting the 
<code>state.backend.rocksdb.latency-track-enabled</code> option.
+The metrics are sampled, and their collection should have a marginal impact on 
the RocksDB state
+backend performance.</p>
+
+<h2 id="switching-state-backend-with-savepoints">Switching State Backend with 
savepoints</h2>
+
+<p>You can now change the state backend of a Flink application when resuming 
from a savepoint.
+That means the application’s state is no longer locked into the state backend 
that was used when
+the application was initially started.</p>
+
+<p>This makes it possible, for example, to initially start with the HashMap 
State Backend (pure
+in-memory in JVM Heap) and later switch to the RocksDB State Backend, once the 
state grows
+too large.</p>
+
+<p>Under the hood, Flink now has a canonical savepoint format, which all state 
backends use when
+creating a data snapshot for a savepoint.</p>
+
+<h2 
id="user-specified-pod-templates-for-kubernetes-deployments">User-specified pod 
templates for Kubernetes deployments</h2>
+
+<p>The <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/">native
 Kubernetes deployment</a>
+(where Flink actively talks to K8s to start and stop pods) now supports 
<em>custom pod templates</em>.</p>
+
+<p>With those templates, users can set up and configure the JobManager and 
TaskManager pods in a
+Kubernetes-y way, with flexibility beyond the configuration options that are 
directly built into
+Flink’s Kubernetes integration.</p>
+
+<h2 id="unaligned-checkpoints---production-ready">Unaligned Checkpoints - 
production-ready</h2>
+
+<p>Unaligned Checkpoints have matured to the point where we encourage all 
users to try them out,
+if they see issues with their application under backpressure.</p>
+
+<p>In particular, these changes make Unaligned Checkpoints easier to use:</p>
+
+<ul>
+  <li>
+    <p>You can now rescale applications from unaligned checkpoints. This comes 
in handy if your
+application needs to be scaled from a retained checkpoint because you cannot 
(afford to) create
+a savepoint.</p>
+  </li>
+  <li>
+    <p>Enabling unaligned checkpoints is cheaper for applications that are not 
back-pressured.
+Unaligned checkpoints can now trigger adaptively with a timeout, meaning a 
checkpoint starts
+as an aligned checkpoint (not storing any in-flight events) and falls back to 
an unaligned
+checkpoint (storing some in-flight events), if the alignment phase takes 
longer than a certain
+time.</p>
+  </li>
+</ul>
+
+<p>Find out more about how to enable unaligned checkpoints in the <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/checkpoints/#unaligned-checkpoints">Checkpointing
 Documentation</a>.</p>
+
+<h2 id="machine-learning-library-moving-to-a-separate-repository">Machine 
Learning Library moving to a separate repository</h2>
+
+<p>To accelerate the development of Flink’s Machine Learning efforts 
(streaming, batch, and
+unified machine learning), the effort has moved to the new repository <a 
href="https://github.com/apache/flink-ml">flink-ml</a>
+under the Flink project. Here we follow an approach similar to the 
<em>Stateful Functions</em> effort,
+where a separate repository has helped to speed up the development by allowing 
for more light-weight
+contribution workflows and separate release cycles.</p>
+
+<p>Stay tuned for more updates in the Machine Learning efforts, like the 
interplay with
+<a href="https://github.com/alibaba/Alink">Alink</a> (suite of many common 
Machine Learning Algorithms on Flink)
+or the <a href="https://github.com/alibaba/flink-ai-extended">Flink &amp; 
TensorFlow integration</a>.</p>
+
+<h1 id="notable-sql--table-api-improvements">Notable SQL &amp; Table API 
improvements</h1>
+
+<p>As in previous releases, SQL and the Table API remain an area of active 
development.</p>
+
+<h2 id="windows-via-table-valued-functions">Windows via Table-valued 
functions</h2>
+
+<p>Defining time windows is one of the most frequent operations in streaming 
SQL queries.
+Flink 1.13 introduces a new way to define windows: via <em>Table-valued 
Functions</em>.
+This approach is both more expressive (lets you define new types of windows) 
and fully
+in line with the SQL standard.</p>
+
+<p>Flink 1.13 supports <em>TUMBLE</em> and <em>HOP</em> windows in the new 
syntax; <em>SESSION</em> windows will
+follow in a subsequent release. To demonstrate the increased expressiveness, 
consider the two examples
+below.</p>
+
+<p>A new <em>CUMULATE</em> window function that assigns windows with an 
expanding step size until the maximum
+window size is reached:</p>
+
+<div class="highlight"><pre><code class="language-sql"><span 
class="k">SELECT</span> <span class="n">window_time</span><span 
class="p">,</span> <span class="n">window_start</span><span class="p">,</span> 
<span class="n">window_end</span><span class="p">,</span> <span 
class="k">SUM</span><span class="p">(</span><span class="n">price</span><span 
class="p">)</span> <span class="k">AS</span> <span class="n">total_price</span> 
+  <span class="k">FROM</span> <span class="k">TABLE</span><span 
class="p">(</span><span class="n">CUMULATE</span><span class="p">(</span><span 
class="k">TABLE</span> <span class="n">Bid</span><span class="p">,</span> <span 
class="k">DESCRIPTOR</span><span class="p">(</span><span 
class="n">bidtime</span><span class="p">),</span> <span 
class="nb">INTERVAL</span> <span class="s1">&#39;2&#39;</span> <span 
class="n">MINUTES</span><span class="p">,</span> <span 
class="nb">INTERVAL</span> <span [...]
+<span class="k">GROUP</span> <span class="k">BY</span> <span 
class="n">window_start</span><span class="p">,</span> <span 
class="n">window_end</span><span class="p">,</span> <span 
class="n">window_time</span><span class="p">;</span></code></pre></div>
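+
+<p>To make the "expanding step size" concrete, here is a minimal plain-Java 
sketch of how a single timestamp could be assigned to cumulating windows. It 
reflects our reading of the semantics (all windows share one start, and the end 
grows by one step until the maximum size is reached) and is not Flink's 
implementation: the class and method names are hypothetical, time zones and 
offsets are ignored, and the 6-minute maximum size is chosen only for 
illustration.</p>

```java
import java.util.ArrayList;
import java.util.List;

public class CumulateWindowSketch {

    /**
     * Returns the [start, end) pairs, in epoch milliseconds, of all cumulating
     * windows that contain the given timestamp.
     */
    static List<long[]> assignWindows(long timestamp, long step, long maxSize) {
        if (step <= 0 || maxSize <= 0 || maxSize % step != 0) {
            // Assumption of this sketch: the maximum size is a multiple of the step.
            throw new IllegalArgumentException("maxSize must be a positive multiple of step");
        }
        // All windows of one cycle start at the same instant, aligned to maxSize.
        long start = timestamp - Math.floorMod(timestamp, maxSize);
        List<long[]> windows = new ArrayList<>();
        for (long end = start + step; end <= start + maxSize; end += step) {
            if (end > timestamp) {
                windows.add(new long[] {start, end});
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        // A bid at minute 3, with a 2-minute step and a 6-minute maximum size.
        for (long[] window : assignWindows(3 * minute, 2 * minute, 6 * minute)) {
            System.out.println("[" + window[0] / minute + " min, " + window[1] / minute + " min)");
        }
        // prints:
        // [0 min, 4 min)
        // [0 min, 6 min)
    }
}
```

+<p>The bid falls outside the first window [0 min, 2 min) but inside both later 
ones, so it contributes to two partial results of the same cycle.</p>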
+
+<p>You can reference the window start and window end time of the table-valued 
window functions,
+making new types of constructs possible. Beyond regular windowed aggregations 
and windowed joins,
+you can, for example, now express windowed Top-K aggregations:</p>
+
+<div class="highlight"><pre><code class="language-sql"><span 
class="k">SELECT</span> <span class="n">window_time</span><span 
class="p">,</span> <span class="p">...</span>
+  <span class="k">FROM</span> <span class="p">(</span>
+    <span class="k">SELECT</span> <span class="o">*</span><span 
class="p">,</span> <span class="n">ROW_NUMBER</span><span class="p">()</span> 
<span class="n">OVER</span> <span class="p">(</span><span 
class="n">PARTITION</span> <span class="k">BY</span> <span 
class="n">window_start</span><span class="p">,</span> <span 
class="n">window_end</span> <span class="k">ORDER</span> <span 
class="k">BY</span> <span class="n">total_price</span> <span 
class="k">DESC</span><span class="p">)</span> 
+      <span class="k">as</span> <span class="n">rank</span> 
+    <span class="k">FROM</span> <span class="n">t</span>
+  <span class="p">)</span> <span class="k">WHERE</span> <span 
class="n">rank</span> <span class="o">&lt;=</span> <span 
class="mi">100</span><span class="p">;</span></code></pre></div>
+
+<h2 
id="improved-interoperability-between-datastream-api-and-table-apisql">Improved 
interoperability between DataStream API and Table API/SQL</h2>
+
+<p>This release radically simplifies mixing DataStream API and Table API 
programs.</p>
+
+<p>The Table API is a great way to develop applications, with its declarative 
nature and its
+many built-in functions. But sometimes, you need to <em>escape</em> to the 
DataStream API for its
+expressiveness, flexibility, and explicit control over the state.</p>
+
+<p>The new methods 
<code>StreamTableEnvironment.toDataStream()/.fromDataStream()</code> can model
+a <code>DataStream</code> from the DataStream API as a table source or sink. 
Types are automatically
+converted, and event-time and watermarks carry across. In addition, the 
<code>Row</code> class (representing
+row events from the Table API) has received a major overhaul (improving the 
behavior of
+<code>toString()</code>/<code>hashCode()</code>/<code>equals()</code> methods) 
and now supports accessing fields by name, with
+support for sparse representations.</p>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">Table</span> <span class="n">table</span><span 
class="o">=</span><span class="n">tableEnv</span><span class="o">.</span><span 
class="na">fromDataStream</span><span class="o">(</span>
+       <span class="n">dataStream</span><span class="o">,</span><span 
class="n">Schema</span><span class="o">.</span><span 
class="na">newBuilder</span><span class="o">()</span>
+       <span class="o">.</span><span class="na">columnByMetadata</span><span 
class="o">(</span><span class="s">&quot;rowtime&quot;</span><span 
class="o">,</span><span class="s">&quot;TIMESTAMP(3)&quot;</span><span 
class="o">)</span>
+       <span class="o">.</span><span class="na">watermark</span><span 
class="o">(</span><span class="s">&quot;rowtime&quot;</span><span 
class="o">,</span><span class="s">&quot;SOURCE_WATERMARK()&quot;</span><span 
class="o">)</span>
+       <span class="o">.</span><span class="na">build</span><span 
class="o">());</span>
+
+<span class="n">DataStream</span><span class="o">&lt;</span><span 
class="n">Row</span><span class="o">&gt;</span> <span 
class="n">dataStream</span><span class="o">=</span><span 
class="n">tableEnv</span><span class="o">.</span><span 
class="na">toDataStream</span><span class="o">(</span><span 
class="n">table</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">keyBy</span><span 
class="o">(</span><span class="n">r</span><span class="o">-&gt;</span><span 
class="n">r</span><span class="o">.</span><span class="na">getField</span><span 
class="o">(</span><span class="s">&quot;user&quot;</span><span 
class="o">))</span>
+       <span class="o">.</span><span class="na">window</span><span 
class="o">(...)</span></code></pre></div>
+
+<h2 id="sql-client-init-scripts-and-statement-sets">SQL Client: Init scripts 
and Statement Sets</h2>
+
+<p>The SQL Client is a convenient way to run and deploy SQL streaming and batch jobs directly,
+without writing any code, either from the command line or as part of a CI/CD workflow.</p>
+
+<p>This release vastly improves the functionality of the SQL Client. Almost all operations that
+are available to Java applications (when programmatically launching queries from the
+<code>TableEnvironment</code>) are now supported in the SQL Client and in SQL scripts.
+That means SQL users need much less glue code for their SQL deployments.</p>
+
+<p><strong>Easier Configuration and Code Sharing</strong></p>
+
+<p>The support of YAML files to configure the SQL Client will be discontinued. 
Instead, the client
+accepts one or more <em>initialization scripts</em> to configure a session 
before the main SQL script
+gets executed.</p>
+
+<p>These init scripts would typically be shared across teams/deployments and 
could be used for
+loading common catalogs, applying common configuration settings, or defining 
standard views.</p>
+
+<div class="highlight"><pre><code>./sql-client.sh -i init1.sql init2.sql -f 
sqljob.sql
+</code></pre></div>
+
+<p><strong>More config options</strong></p>
+
+<p>A larger set of recognized config options and improved <code>SET</code>/<code>RESET</code> commands make it easier to
+define and control the execution from within the SQL Client and SQL scripts.</p>
+
+<p><strong>Multi-query Support with Statement Sets</strong></p>
+
+<p>Multi-query execution lets you execute multiple SQL queries (or statements) 
as a single Flink job.
+This is particularly useful for streaming SQL queries that run 
indefinitely.</p>
+
+<p><em>Statement Sets</em> are the mechanism for grouping the queries that should be executed together.</p>
+
+<p>The following is an example of a SQL script that can be run via the SQL Client. It sets up and
+configures the environment and executes multiple queries. The script captures the queries together
+with all environment setup and configuration work, which makes it a self-contained deployment
+artifact.</p>
+
+<div class="highlight"><pre><code class="language-sql"><span class="c1">-- set 
up a catalog</span>
+<span class="k">CREATE</span> <span class="k">CATALOG</span> <span 
class="n">hive_catalog</span> <span class="k">WITH</span> <span 
class="p">(</span><span class="s1">&#39;type&#39;</span> <span 
class="o">=</span> <span class="s1">&#39;hive&#39;</span><span 
class="p">);</span>
+<span class="n">USE</span> <span class="k">CATALOG</span> <span 
class="n">hive_catalog</span><span class="p">;</span>
+
+<span class="c1">-- or use temporary objects</span>
+<span class="k">CREATE</span> <span class="k">TEMPORARY</span> <span 
class="k">TABLE</span> <span class="n">clicks</span> <span class="p">(</span>
+  <span class="n">user_id</span> <span class="nb">BIGINT</span><span 
class="p">,</span>
+  <span class="n">page_id</span> <span class="nb">BIGINT</span><span 
class="p">,</span>
+  <span class="n">viewtime</span> <span class="k">TIMESTAMP</span>
+<span class="p">)</span> <span class="k">WITH</span> <span class="p">(</span>
+  <span class="s1">&#39;connector&#39;</span> <span class="o">=</span> <span 
class="s1">&#39;kafka&#39;</span><span class="p">,</span>
+  <span class="s1">&#39;topic&#39;</span> <span class="o">=</span> <span 
class="s1">&#39;clicks&#39;</span><span class="p">,</span>
+  <span class="s1">&#39;properties.bootstrap.servers&#39;</span> <span 
class="o">=</span> <span class="s1">&#39;...&#39;</span><span class="p">,</span>
+  <span class="s1">&#39;format&#39;</span> <span class="o">=</span> <span 
class="s1">&#39;avro&#39;</span>
+<span class="p">);</span>
+
+<span class="c1">-- set the execution mode for jobs</span>
+<span class="k">SET</span> <span class="n">execution</span><span 
class="p">.</span><span class="n">runtime</span><span class="o">-</span><span 
class="k">mode</span><span class="o">=</span><span 
class="n">streaming</span><span class="p">;</span>
+
+<span class="c1">-- set the sync/async mode for INSERT INTOs</span>
+<span class="k">SET</span> <span class="k">table</span><span 
class="p">.</span><span class="n">dml</span><span class="o">-</span><span 
class="n">sync</span><span class="o">=</span><span class="k">false</span><span 
class="p">;</span>
+
+<span class="c1">-- set the job&#39;s parallelism</span>
+<span class="k">SET</span> <span class="n">parallelism</span><span 
class="p">.</span><span class="k">default</span><span class="o">=</span><span 
class="mi">10</span><span class="p">;</span>
+
+<span class="c1">-- set the job name</span>
+<span class="k">SET</span> <span class="n">pipeline</span><span 
class="p">.</span><span class="n">name</span> <span class="o">=</span> <span 
class="n">my_flink_job</span><span class="p">;</span>
+
+<span class="c1">-- restore state from the specific savepoint path</span>
+<span class="k">SET</span> <span class="n">execution</span><span 
class="p">.</span><span class="n">savepoint</span><span class="p">.</span><span 
class="n">path</span><span class="o">=/</span><span class="n">tmp</span><span 
class="o">/</span><span class="n">flink</span><span class="o">-</span><span 
class="n">savepoints</span><span class="o">/</span><span 
class="n">savepoint</span><span class="o">-</span><span 
class="n">bb0dab</span><span class="p">;</span>
+
+<span class="k">BEGIN</span> <span class="k">STATEMENT</span> <span 
class="k">SET</span><span class="p">;</span>
+
+<span class="k">INSERT</span> <span class="k">INTO</span> <span 
class="n">pageview_pv_sink</span>
+<span class="k">SELECT</span> <span class="n">page_id</span><span 
class="p">,</span> <span class="k">count</span><span class="p">(</span><span 
class="mi">1</span><span class="p">)</span> <span class="k">FROM</span> <span 
class="n">clicks</span> <span class="k">GROUP</span> <span class="k">BY</span> 
<span class="n">page_id</span><span class="p">;</span>
+
+<span class="k">INSERT</span> <span class="k">INTO</span> <span 
class="n">pageview_uv_sink</span>
+<span class="k">SELECT</span> <span class="n">page_id</span><span 
class="p">,</span> <span class="k">count</span><span class="p">(</span><span 
class="k">distinct</span> <span class="n">user_id</span><span 
class="p">)</span> <span class="k">FROM</span> <span class="n">clicks</span> 
<span class="k">GROUP</span> <span class="k">BY</span> <span 
class="n">page_id</span><span class="p">;</span>
+
+<span class="k">END</span><span class="p">;</span></code></pre></div>
+
+<h2 id="hive-query-syntax-compatibility">Hive query syntax compatibility</h2>
+
+<p>You can now write SQL queries against Flink using the Hive SQL syntax.
+In addition to Hive’s DDL dialect, Flink now also accepts the commonly-used 
Hive DML and DQL
+dialects.</p>
+
+<p>To use the Hive SQL dialect, set <code>table.sql-dialect</code> to 
<code>hive</code> and load the <code>HiveModule</code>.
+The latter is important because Hive’s built-in functions are required for 
proper syntax and
+semantics compatibility. The following example illustrates that:</p>
+
+<div class="highlight"><pre><code class="language-sql"><span 
class="k">CREATE</span> <span class="k">CATALOG</span> <span 
class="n">myhive</span> <span class="k">WITH</span> <span 
class="p">(</span><span class="s1">&#39;type&#39;</span> <span 
class="o">=</span> <span class="s1">&#39;hive&#39;</span><span 
class="p">);</span> <span class="c1">-- setup HiveCatalog</span>
+<span class="n">USE</span> <span class="k">CATALOG</span> <span 
class="n">myhive</span><span class="p">;</span>
+<span class="k">LOAD</span> <span class="n">MODULE</span> <span 
class="n">hive</span><span class="p">;</span> <span class="c1">-- setup 
HiveModule</span>
+<span class="n">USE</span> <span class="n">MODULES</span> <span 
class="n">hive</span><span class="p">,</span><span class="n">core</span><span 
class="p">;</span>
+<span class="k">SET</span> <span class="k">table</span><span 
class="p">.</span><span class="k">sql</span><span class="o">-</span><span 
class="n">dialect</span> <span class="o">=</span> <span 
class="n">hive</span><span class="p">;</span> <span class="c1">-- enable Hive 
dialect</span>
+<span class="k">SELECT</span> <span class="k">key</span><span 
class="p">,</span> <span class="n">value</span> <span class="k">FROM</span> 
<span class="n">src</span> <span class="k">CLUSTER</span> <span 
class="k">BY</span> <span class="k">key</span><span class="p">;</span> <span 
class="c1">-- run some Hive queries</span></code></pre></div>
+
+<p>Please note that the Hive dialect no longer supports Flink’s SQL syntax for 
DML and DQL statements.
+Switch back to the <code>default</code> dialect for Flink’s syntax.</p>
+
+<h2 id="improved-behavior-of-sql-time-functions">Improved behavior of SQL time 
functions</h2>
+
+<p>Working with time is a crucial element of any data processing. At the same time, handling different
+time zones, dates, and times is an <a href="https://xkcd.com/1883/">incredibly delicate task</a> when working with data.</p>
+
+<p>In Flink 1.13, we put much effort into simplifying the usage of time-related functions. We adjusted (made
+more specific) the return types of functions such as <code>PROCTIME()</code>, <code>CURRENT_TIMESTAMP</code>, and <code>NOW()</code>.</p>
+
+<p>Moreover, you can now also define an event time attribute on a 
<em>TIMESTAMP_LTZ</em> column to gracefully
+do window processing with the support of Daylight Saving Time.</p>
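+
+<p>As a rough illustration of why this matters (plain Python, not Flink code, and not a description of
+Flink's implementation): on the day Daylight Saving Time starts, a calendar day in a local time zone is
+shorter than 24 hours, so a window defined on local time covers less elapsed time. The zone
+<code>Europe/Berlin</code> and the date are arbitrary choices for this sketch, which assumes Python 3.9+
+with time zone data available.</p>

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")

# Local midnight at the start and end of the day on which clocks
# jump forward from 02:00 to 03:00 (28 March 2021 in this zone).
day_start = datetime(2021, 3, 28, tzinfo=berlin)
day_end = datetime(2021, 3, 29, tzinfo=berlin)

# Convert to UTC before subtracting: subtracting two datetimes that share
# the same tzinfo compares wall-clock values and would report 24 hours.
elapsed = day_end.astimezone(timezone.utc) - day_start.astimezone(timezone.utc)

print(elapsed)  # 23:00:00
```

+<p>A one-day window over local time on that date therefore spans 23 hours of elapsed time, which is the
+kind of case a time zone aware event time attribute has to handle.</p>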
+
+<p>Please see the release notes for a complete list of changes.</p>
+
+<hr />
+
+<h1 id="notable-pyflink-improvements">Notable PyFlink improvements</h1>
+
+<p>The general theme of this release in PyFlink is to bring the Python 
DataStream API and Table API
+closer to feature parity with the Java/Scala APIs.</p>
+
+<h3 id="stateful-operations-in-the-python-datastream-api">Stateful operations 
in the Python DataStream API</h3>
+
+<p>With Flink 1.13, Python programmers now also get to enjoy the full 
potential of Apache Flink’s
+stateful stream processing APIs. The rearchitected Python DataStream API, 
introduced in Flink 1.12,
+now has full stateful capabilities, allowing users to remember information 
from events in the state
+and act on it later.</p>
+
+<p>That stateful processing capability is the basis of many of the more 
sophisticated processing
+operations, which need to remember information across individual events (for 
example, Windowing
+Operations).</p>
+
+<p>This example shows a custom counting window implementation, using state:</p>
+
+<div class="highlight"><pre><code class="language-python"><span 
class="k">class</span> <span class="nc">CountWindowAverage</span><span 
class="p">(</span><span class="n">FlatMapFunction</span><span 
class="p">):</span>
+    <span class="k">def</span> <span class="nf">__init__</span><span 
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span 
class="n">window_size</span><span class="p">):</span>
+        <span class="bp">self</span><span class="o">.</span><span 
class="n">window_size</span> <span class="o">=</span> <span 
class="n">window_size</span>
+
+    <span class="k">def</span> <span class="nf">open</span><span 
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span 
class="n">runtime_context</span><span class="p">:</span> <span 
class="n">RuntimeContext</span><span class="p">):</span>
+        <span class="n">descriptor</span> <span class="o">=</span> <span 
class="n">ValueStateDescriptor</span><span class="p">(</span><span 
class="s">&quot;average&quot;</span><span class="p">,</span> <span 
class="n">Types</span><span class="o">.</span><span class="n">TUPLE</span><span 
class="p">([</span><span class="n">Types</span><span class="o">.</span><span 
class="n">LONG</span><span class="p">(),</span> <span 
class="n">Types</span><span class="o">.</span><span class="n">LONG</span>< [...]
+        <span class="bp">self</span><span class="o">.</span><span 
class="n">sum</span> <span class="o">=</span> <span 
class="n">runtime_context</span><span class="o">.</span><span 
class="n">get_state</span><span class="p">(</span><span 
class="n">descriptor</span><span class="p">)</span>
+
+    <span class="k">def</span> <span class="nf">flat_map</span><span 
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span 
class="n">value</span><span class="p">):</span>
+        <span class="n">current_sum</span> <span class="o">=</span> <span 
class="bp">self</span><span class="o">.</span><span class="n">sum</span><span 
class="o">.</span><span class="n">value</span><span class="p">()</span>
+        <span class="k">if</span> <span class="n">current_sum</span> <span 
class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
+            <span class="n">current_sum</span> <span class="o">=</span> <span 
class="p">(</span><span class="mi">0</span><span class="p">,</span> <span 
class="mi">0</span><span class="p">)</span>
+        <span class="c"># update the count</span>
+        <span class="n">current_sum</span> <span class="o">=</span> <span 
class="p">(</span><span class="n">current_sum</span><span 
class="p">[</span><span class="mi">0</span><span class="p">]</span> <span 
class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span 
class="n">current_sum</span><span class="p">[</span><span 
class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span 
class="n">value</span><span class="p">[</span><span class="mi">1</span><span c 
[...]
+        <span class="c"># if the count reaches window_size, emit the average 
and clear the state</span>
+        <span class="k">if</span> <span class="n">current_sum</span><span 
class="p">[</span><span class="mi">0</span><span class="p">]</span> <span 
class="o">&gt;=</span> <span class="bp">self</span><span 
class="o">.</span><span class="n">window_size</span><span class="p">:</span>
+            <span class="bp">self</span><span class="o">.</span><span 
class="n">sum</span><span class="o">.</span><span class="n">clear</span><span 
class="p">()</span>
+            <span class="k">yield</span> <span class="n">value</span><span 
class="p">[</span><span class="mi">0</span><span class="p">],</span> <span 
class="n">current_sum</span><span class="p">[</span><span 
class="mi">1</span><span class="p">]</span> <span class="o">//</span> <span 
class="n">current_sum</span><span class="p">[</span><span 
class="mi">0</span><span class="p">]</span>
+        <span class="k">else</span><span class="p">:</span>
+            <span class="bp">self</span><span class="o">.</span><span 
class="n">sum</span><span class="o">.</span><span class="n">update</span><span 
class="p">(</span><span class="n">current_sum</span><span class="p">)</span>
+
+<span class="n">ds</span> <span class="o">=</span> <span class="o">...</span>  
<span class="c"># type: DataStream</span>
+<span class="n">ds</span><span class="o">.</span><span 
class="n">key_by</span><span class="p">(</span><span class="k">lambda</span> 
<span class="n">row</span><span class="p">:</span> <span 
class="n">row</span><span class="p">[</span><span class="mi">0</span><span 
class="p">])</span> \
+  <span class="o">.</span><span class="n">flat_map</span><span 
class="p">(</span><span class="n">CountWindowAverage</span><span 
class="p">(</span><span class="mi">5</span><span 
class="p">))</span></code></pre></div>
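+
+<p>The logic of <code>CountWindowAverage</code> can be checked without a Flink cluster. The following
+plain-Python sketch is not PyFlink code: it replaces the keyed <code>ValueState</code> with a dictionary,
+and the names <code>count_window_average</code> and <code>state</code> are hypothetical, chosen only for
+illustration.</p>

```python
from typing import Dict, Iterable, Iterator, Tuple


def count_window_average(
    events: Iterable[Tuple[str, int]], window_size: int
) -> Iterator[Tuple[str, int]]:
    """Emit (key, integer average) once a key has seen `window_size` events."""
    # Per-key (count, sum); a stand-in for Flink's keyed ValueState.
    state: Dict[str, Tuple[int, int]] = {}
    for key, value in events:
        count, total = state.get(key, (0, 0))
        # Update the count and the running sum for this key.
        count, total = count + 1, total + value
        if count >= window_size:
            # Window is full: clear the state and emit the average.
            state.pop(key, None)
            yield key, total // count
        else:
            state[key] = (count, total)


events = [("a", 2), ("b", 10), ("a", 4), ("b", 20), ("a", 7)]
results = list(count_window_average(events, window_size=2))
print(results)  # [('a', 3), ('b', 15)]
```

+<p>The last event for key <code>a</code> stays in the state because its window is not yet full, which
+mirrors how the Flink function keeps the partial sum until enough events have arrived.</p>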
+
+<h3 id="user-defined-windows-in-the-pyflink-datastream-api">User-defined 
Windows in the PyFlink DataStream API</h3>
+
+<p>Flink 1.13 adds support for user-defined windows to the PyFlink DataStream 
API. Programs can now use
+windows beyond the standard window definitions.</p>
+
+<p>Because windows are at the heart of all programs that process unbounded 
streams (by splitting the
+stream into “buckets” of bounded size), this greatly increases the 
expressiveness of the API.</p>
+
+<h3 id="row-based-operation-in-the-pyflink-table-api">Row-based operation in 
the PyFlink Table API</h3>
+
+<p>The Python Table API now supports row-based operations, i.e., custom 
transformation functions on rows.
+These functions are an easy way to apply data transformations on tables beyond 
the built-in functions.</p>
+
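+<p>This is an example of using a <code>map()</code> operation in the Python Table API:</p>
+
+<div class="highlight"><pre><code class="language-python">from pyflink.common import Row
+from pyflink.table import DataTypes
+from pyflink.table.udf import udf
+
+# the function receives each input row and returns a new two-field row
+@udf(result_type=DataTypes.ROW(
+  [DataTypes.FIELD(&quot;c1&quot;, DataTypes.BIGINT()),
+   DataTypes.FIELD(&quot;c2&quot;, DataTypes.STRING())]))
+def increment_column(r: Row) -&gt; Row:
+  return Row(r[0] + 1, r[1])
+
+table = ...  # type: Table
+mapped_result = table.map(increment_column)</code></pre></div>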
+
+<p>In addition to <code>map()</code>, the API also supports 
<code>flat_map()</code>, <code>aggregate()</code>, 
<code>flat_aggregate()</code>,
+and other row-based operations. This brings the Python Table API a big step 
closer to feature
+parity with the Java Table API.</p>
+
+<h3 id="batch-execution-mode-for-pyflink-datastream-programs">Batch execution 
mode for PyFlink DataStream programs</h3>
+
+<p>The PyFlink DataStream API now also supports the batch execution mode for 
bounded streams,
+which was introduced for the Java DataStream API in Flink 1.12.</p>
+
+<p>The batch execution mode simplifies operations and improves the performance 
of programs on bounded streams,
+by exploiting the bounded stream nature to bypass state backends and 
checkpoints.</p>
+
+<h1 id="other-improvements">Other improvements</h1>
+
+<p><strong>Flink Documentation via Hugo</strong></p>
+
+<p>The Flink Documentation has been migrated from Jekyll to Hugo. If you find 
something missing, please let us know.
+We are also curious to hear if you like the new look &amp; feel.</p>
+
+<p><strong>Exception histories in the Web UI</strong></p>
+
+<p>The Flink Web UI now presents up to the <em>n</em> most recent exceptions that caused a job to fail.
+That helps to debug scenarios where a root failure caused subsequent failures. 
The root failure
+cause can be found in the exception history.</p>
+
+<p><strong>Better exception / failure-cause reporting for unsuccessful 
checkpoints</strong></p>
+
+<p>Flink now provides statistics for checkpoints that failed or were aborted 
to make it easier
+to determine the failure cause without having to analyze the logs.</p>
+
+<p>Prior versions of Flink reported metrics (e.g., size of persisted data, trigger time)
+only when a checkpoint succeeded.</p>
+
+<p><strong>Exactly-once JDBC sink</strong></p>
+
+<p>Starting with Flink 1.13, the JDBC sink can guarantee exactly-once delivery of results for XA-compliant databases
+by transactionally committing results on checkpoints. The target database must have (or be linked
+to) an XA Transaction Manager.</p>
+
+<p>The connector currently exists only for the <em>DataStream API</em>, and can be created through the
+<code>JdbcSink.exactlyOnceSink(...)</code> method (or by instantiating the <code>JdbcXaSinkFunction</code> directly).</p>
+
+<p><strong>PyFlink Table API supports User-Defined Aggregate Functions in 
Group Windows</strong></p>
+
+<p>Group Windows in PyFlink’s Table API now support both general Python 
User-defined Aggregate
+Functions (UDAFs) and Pandas UDAFs. Such functions are critical to many 
analysis- and ML training
+programs.</p>
+
+<p>Flink 1.13 improves upon previous releases, where these functions were only 
supported
+in unbounded Group-by aggregations.</p>
+
+<p><strong>Improved Sort-Merge Shuffle for Batch Execution</strong></p>
+
+<p>Flink 1.13 improves the memory stability and performance of the 
<em>sort-merge blocking shuffle</em>
+for batch-executed programs, initially introduced in Flink 1.12 via <a 
href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-148%3A+Introduce+Sort-Merge+Based+Blocking+Shuffle+to+Flink">FLIP-148</a>.</p>
+
+<p>Programs with higher parallelism (1000s) should no longer frequently 
trigger <em>OutOfMemoryError: Direct Memory</em>.
+The performance (especially on spinning disks) is improved through better I/O 
scheduling
+and broadcast optimizations.</p>
+
+<p><strong>HBase connector supports async lookup and lookup cache</strong></p>
+
+<p>The HBase Lookup Table Source now supports an <em>async lookup mode</em> 
and a lookup cache.
+This greatly benefits the performance of Table/SQL jobs with lookup joins 
against HBase, while
+reducing the I/O requests to HBase in the typical case.</p>
+
+<p>In prior versions, the HBase Lookup Source only communicated synchronously, 
resulting in lower
+pipeline utilization and throughput.</p>
+
+<h1 id="change-to-consider-when-upgrading-to-flink-113">Changes to consider when upgrading to Flink 1.13</h1>
+
+<ul>
+  <li><a 
href="https://issues.apache.org/jira/browse/FLINK-21709">FLINK-21709</a> - The 
old planner of the Table &amp;
+SQL API has been deprecated in Flink 1.13 and will be dropped in Flink 1.14.
+The <em>Blink</em> engine has been the default planner for some releases now 
and will be the only one going forward.
+That means that both the <code>BatchTableEnvironment</code> and SQL/DataSet 
interoperability are reaching
+the end of life. Please use the unified <code>TableEnvironment</code> for 
batch and stream processing going forward.</li>
+  <li><a 
href="https://issues.apache.org/jira/browse/FLINK-22352">FLINK-22352</a> - The 
community decided to deprecate
+the Apache Mesos support for Apache Flink. It is subject to removal in the 
future. Users are
+encouraged to switch to a different resource manager.</li>
+  <li><a 
href="https://issues.apache.org/jira/browse/FLINK-21935">FLINK-21935</a> - The 
<code>state.backend.async</code>
+option is deprecated. Snapshots are always asynchronous now (as they were by 
default before) and
+there is no option to configure a synchronous snapshot anymore.</li>
+  <li><a 
href="https://issues.apache.org/jira/browse/FLINK-17012">FLINK-17012</a> - The 
tasks’ <code>RUNNING</code> state was split
+into two states: <code>INITIALIZING</code> and <code>RUNNING</code>. A task is 
<code>INITIALIZING</code> while it loads the checkpointed state,
+and, in the case of unaligned checkpoints, until the checkpointed in-flight 
data has been recovered.
+This lets monitoring systems better determine when the tasks are really back 
to doing work by making
+the phase for state restoring explicit.</li>
+  <li><a 
href="https://issues.apache.org/jira/browse/FLINK-21698">FLINK-21698</a> - The 
<em>CAST</em> operation between the
+NUMERIC type and the TIMESTAMP type is problematic and therefore no longer 
supported: Statements like 
+<code>CAST(numeric AS TIMESTAMP(3))</code> will now fail. Please use 
<code>TO_TIMESTAMP(FROM_UNIXTIME(numeric))</code> instead.</li>
+  <li><a 
href="https://issues.apache.org/jira/browse/FLINK-22133">FLINK-22133</a> - The 
unified source API for connectors
+has a minor breaking change: The <code>SplitEnumerator.snapshotState()</code> 
method was adjusted to accept the
+<em>Checkpoint ID</em> of the checkpoint for which the snapshot is 
created.</li>
+</ul>
+
+<h1 id="resources">Resources</h1>
+
+<p>The binary distribution and source artifacts are now available on the 
updated <a href="/downloads.html">Downloads page</a>
+of the Flink website, and the most recent distribution of PyFlink is available 
on <a href="https://pypi.org/project/apache-flink/">PyPI</a>.</p>
+
+<p>Please review the <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/release-notes/flink-1.13.html">release notes</a>
+carefully if you plan to upgrade your setup to Flink 1.13. This version is 
API-compatible with
+previous 1.x releases for APIs annotated with the <code>@Public</code> 
annotation.</p>
+
+<p>You can also check the complete <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;version=12349287">release changelog</a>
+and <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.13/">updated documentation</a> for a detailed list of changes and new features.</p>
+
+<h1 id="list-of-contributors">List of Contributors</h1>
+
+<p>The Apache Flink community would like to thank each one of the contributors 
that have
+made this release possible:</p>
+
+<p>acqua.csq, AkisAya, Alexander Fedulov, Aljoscha Krettek, Ammar Al-Batool, 
Andrey Zagrebin, anlen321,
+Anton Kalashnikov, appleyuchi, Arvid Heise, Austin Cawley-Edwards, austin ce, 
azagrebin, blublinsky,
+Brian Zhou, bytesmithing, caozhen1937, chen qin, Chesnay Schepler, Congxian 
Qiu, Cristian,
+cxiiiiiii, Danny Chan, Danny Cranmer, David Anderson, Dawid Wysakowicz, 
dbgp2021, Dian Fu,
+DinoZhang, dixingxing, Dong Lin, Dylan Forciea, est08zw, Etienne Chauchot, 
fanrui03, Flora Tao,
+FLRNKS, fornaix, fuyli, George, Giacomo Gamba, GitHub, godfrey he, GuoWei Ma, 
Gyula Fora,
+hackergin, hameizi, Haoyuan Ge, Harshvardhan Chauhan, Haseeb Asif, hehuiyuan, 
huangxiao, HuangXiao,
+huangxingbo, HuangXingBo, humengyu2012, huzekang, Hwanju Kim, Ingo Bürk, I. 
Raleigh, Ivan, iyupeng,
+Jack, Jane, Jark Wu, Jerry Wang, Jiangjie (Becket) Qin, JiangXin, Jiayi Liao, 
JieFang.He, Jie Wang,
+jinfeng, Jingsong Lee, JingsongLi, Jing Zhang, Joao Boto, JohnTeslaa, Jun Qin, 
kanata163, kevin.cyj,
+KevinyhZou, Kezhu Wang, klion26, Kostas Kloudas, kougazhang, Kurt Young, 
laughing, legendtkl,
+leiqiang, Leonard Xu, liaojiayi, Lijie Wang, liming.1018, lincoln lee, 
lincoln-lil, liushouwei,
+liuyufei, LM Kang, lometheus, luyb, Lyn Zhang, Maciej Obuchowski, Maciek 
Próchniak, mans2singh,
+Marek Sabo, Matthias Pohl, meijie, Mika Naylor, Miklos Gergely, Mohit Paliwal, 
Moritz Manner,
+morsapaes, Mulan, Nico Kruber, openopen2, paul8263, Paul Lam, Peidian li, 
pengkangjing, Peter Huang,
+Piotr Nowojski, Qinghui Xu, Qingsheng Ren, Raghav Kumar Gautam, Rainie Li, 
Ricky Burnett, Rion
+Williams, Robert Metzger, Roc Marshal, Roman, Roman Khachatryan, Ruguo,
+Ruguo Yu, Rui Li, Sebastian Liu, Seth Wiesman, sharkdtu, sharkdtu(涂小刚), 
Shengkai, shizhengchao,
+shouweikun, Shuo Cheng, simenliuxing, SteNicholas, Stephan Ewen, Suo Lu, 
sv3ndk, Svend Vanderveken,
+taox, Terry Wang, Thelgis Kotsos, Thesharing, Thomas Weise, Till Rohrmann, 
Timo Walther, Ting Sun,
+totoro, totorooo, TsReaper, Tzu-Li (Gordon) Tai, V1ncentzzZ, vthinkxie, 
wangfeifan, wangpeibin,
+wangyang0918, wangyemao-github, Wei Zhong, Wenlong Lyu, wineandcheeze, wjc, 
xiaoHoly, Xintong Song,
+xixingya, xmarker, Xue Wang, Yadong Xie, yangsanity, Yangze Guo, Yao Zhang, 
Yuan Mei, yulei0824, Yu
+Li, Yun Gao, Yun Tang, yuruguo, yushujun, Yuval Itzchakov, yuzhao.cyz, zck, 
zhangjunfan,
+zhangzhengqi3, zhao_wei_nan, zhaown, zhaoxing, Zhenghua Gao, Zhenqiu Huang, 
zhisheng, zhongqishang,
+zhushang, zhuxiaoshang, Zhu Zhu, zjuwangg, zoucao, zoudan, 左元, 星, 肖佳文, 龙三</p>
+
+
+      </article>
+    </div>
+
+    <div class="row">
+      <div id="disqus_thread"></div>
+      <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE 
* * */
+        var disqus_shortname = 'stratosphere-eu'; // required: replace example 
with your forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 
'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+             (document.getElementsByTagName('head')[0] || 
document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+      </script>
+    </div>
+  </div>
+</div>
+      </div>
+    </div>
+
+    <hr />
+
+    <div class="row">
+      <div class="footer text-center col-sm-12">
+        <p>Copyright © 2014-2021 <a href="http://apache.org">The Apache Software Foundation</a>. All Rights Reserved.</p>
+        <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache 
feather logo are either registered trademarks or trademarks of The Apache 
Software Foundation.</p>
+        <p><a href="/privacy-policy.html">Privacy Policy</a> &middot; <a 
href="/blog/feed.xml">RSS feed</a></p>
+      </div>
+    </div>
+    </div><!-- /.container -->
+
+    <!-- Include all compiled plugins (below), or include individual files as 
needed -->
+    <script src="/js/jquery.matchHeight-min.js"></script>
+    <script src="/js/bootstrap.min.js"></script>
+    <script src="/js/codetabs.js"></script>
+    <script src="/js/stickysidebar.js"></script>
+
+    <!-- Google Analytics -->
+    <script>
+      
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+      
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+      
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+      ga('create', 'UA-52545728-1', 'auto');
+      ga('send', 'pageview');
+    </script>
+  </body>
+</html>
diff --git a/img/blog/2021-05-03-release-1.13.0/7.png 
b/img/blog/2021-05-03-release-1.13.0/7.png
new file mode 100644
index 0000000..7801328
Binary files /dev/null and b/img/blog/2021-05-03-release-1.13.0/7.png differ
diff --git a/img/blog/2021-05-03-release-1.13.0/bottleneck.png 
b/img/blog/2021-05-03-release-1.13.0/bottleneck.png
new file mode 100644
index 0000000..7aa2c98
Binary files /dev/null and b/img/blog/2021-05-03-release-1.13.0/bottleneck.png 
differ
