This is an automated email from the ASF dual-hosted git repository.

nkruber pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit 6a50d296127bf0fdc2cae7863cfc8b801343414f
Author: Nico Kruber <nico.kru...@gmail.com>
AuthorDate: Tue Apr 14 17:32:53 2020 +0200

    Rebuild website
---
 content/blog/feed.xml                              | 387 ++++++++++----
 content/blog/index.html                            |  38 +-
 content/blog/page10/index.html                     |  43 +-
 content/blog/page11/index.html                     |  25 +
 content/blog/page2/index.html                      |  41 +-
 content/blog/page3/index.html                      |  39 +-
 content/blog/page4/index.html                      |  38 +-
 content/blog/page5/index.html                      |  40 +-
 content/blog/page6/index.html                      |  40 +-
 content/blog/page7/index.html                      |  40 +-
 content/blog/page8/index.html                      |  40 +-
 content/blog/page9/index.html                      |  43 +-
 ...-15-flink-serialization-performance-results.svg |   1 +
 content/index.html                                 |   6 +-
 .../04/15/flink-serialization-tuning-vol-1.html    | 555 +++++++++++++++++++++
 content/zh/index.html                              |   6 +-
 16 files changed, 1138 insertions(+), 244 deletions(-)

diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 02f6ac5..126b62b 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -7,6 +7,307 @@
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" 
type="application/rss+xml" />
 
 <item>
+<title>Flink Serialization Tuning Vol. 1: Choosing your Serializer — if you 
can</title>
+<description>&lt;p&gt;Almost every Flink job has to exchange data between its 
operators, and since these records may not only be sent to another instance in 
the same JVM but also to a separate process, records need to be serialized 
to bytes first. Similarly, Flink’s off-heap state-backend is based on a local 
embedded RocksDB instance which is implemented in native C++ code and thus also 
needs transformation into bytes on every state access. Wire and state 
serialization alone can easily [...]
+
+&lt;p&gt;Since serialization is so crucial to your Flink job, we would like to 
highlight Flink’s serialization stack in a series of blog posts starting with 
looking at the different ways Flink can serialize your data types.&lt;/p&gt;
+
+&lt;div class=&quot;page-toc&quot;&gt;
+&lt;ul id=&quot;markdown-toc&quot;&gt;
+  &lt;li&gt;&lt;a href=&quot;#recap-flink-serialization&quot; 
id=&quot;markdown-toc-recap-flink-serialization&quot;&gt;Recap: Flink 
Serialization&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#choice-of-serializer&quot; 
id=&quot;markdown-toc-choice-of-serializer&quot;&gt;Choice of 
Serializer&lt;/a&gt;    &lt;ul&gt;
+      &lt;li&gt;&lt;a href=&quot;#pojoserializer&quot; 
id=&quot;markdown-toc-pojoserializer&quot;&gt;PojoSerializer&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#tuple-data-types&quot; 
id=&quot;markdown-toc-tuple-data-types&quot;&gt;Tuple Data 
Types&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#row-data-types&quot; 
id=&quot;markdown-toc-row-data-types&quot;&gt;Row Data 
Types&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#avro&quot; 
id=&quot;markdown-toc-avro&quot;&gt;Avro&lt;/a&gt;        &lt;ul&gt;
+          &lt;li&gt;&lt;a href=&quot;#avro-specific&quot; 
id=&quot;markdown-toc-avro-specific&quot;&gt;Avro Specific&lt;/a&gt;&lt;/li&gt;
+          &lt;li&gt;&lt;a href=&quot;#avro-generic&quot; 
id=&quot;markdown-toc-avro-generic&quot;&gt;Avro Generic&lt;/a&gt;&lt;/li&gt;
+          &lt;li&gt;&lt;a href=&quot;#avro-reflect&quot; 
id=&quot;markdown-toc-avro-reflect&quot;&gt;Avro Reflect&lt;/a&gt;&lt;/li&gt;
+        &lt;/ul&gt;
+      &lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#kryo&quot; 
id=&quot;markdown-toc-kryo&quot;&gt;Kryo&lt;/a&gt;        &lt;ul&gt;
+          &lt;li&gt;&lt;a href=&quot;#disabling-kryo&quot; 
id=&quot;markdown-toc-disabling-kryo&quot;&gt;Disabling 
Kryo&lt;/a&gt;&lt;/li&gt;
+        &lt;/ul&gt;
+      &lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#apache-thrift-via-kryo&quot; 
id=&quot;markdown-toc-apache-thrift-via-kryo&quot;&gt;Apache Thrift (via 
Kryo)&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#protobuf-via-kryo&quot; 
id=&quot;markdown-toc-protobuf-via-kryo&quot;&gt;Protobuf (via 
Kryo)&lt;/a&gt;&lt;/li&gt;
+    &lt;/ul&gt;
+  &lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#state-schema-evolution&quot; 
id=&quot;markdown-toc-state-schema-evolution&quot;&gt;State Schema 
Evolution&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#performance-comparison&quot; 
id=&quot;markdown-toc-performance-comparison&quot;&gt;Performance 
Comparison&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; 
id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;/div&gt;
+
+&lt;h1 id=&quot;recap-flink-serialization&quot;&gt;Recap: Flink 
Serialization&lt;/h1&gt;
+
+&lt;p&gt;Flink handles &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html&quot;&gt;data
 types and serialization&lt;/a&gt; with its own type descriptors, generic type 
extraction, and type serialization framework. We recommend reading through the 
&lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html&quot;&gt;documentation&lt;/a&gt;
 first in order to be able to follow the arguments w [...]
+&lt;code&gt;stream.keyBy(&quot;ruleId&quot;)&lt;/code&gt; or 
+&lt;code&gt;dataSet.join(another).where(&quot;name&quot;).equalTo(&quot;personName&quot;)&lt;/code&gt;.
 It also allows optimizations in the serialization format as well as reducing 
unnecessary de/serializations (mainly in certain Batch operations as well as in 
the SQL/Table APIs).&lt;/p&gt;
+
+&lt;h1 id=&quot;choice-of-serializer&quot;&gt;Choice of Serializer&lt;/h1&gt;
+
+&lt;p&gt;Apache Flink’s out-of-the-box serialization can be roughly divided 
into the following groups:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;
+    &lt;p&gt;&lt;strong&gt;Flink-provided special serializers&lt;/strong&gt; 
for basic types (Java primitives and their boxed form), arrays, composite types 
(tuples, Scala case classes, Rows), and a few auxiliary types (Option, Either, 
Lists, Maps, …),&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;&lt;strong&gt;POJOs&lt;/strong&gt;; a public, standalone class 
with a public no-argument constructor and all non-static, non-transient fields 
in the class hierarchy either public or with a public getter- and a 
setter-method; see &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html#rules-for-pojo-types&quot;&gt;POJO
 Rules&lt;/a&gt;,&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;&lt;strong&gt;Generic types&lt;/strong&gt;; user-defined data 
types that are not recognized as a POJO and then serialized via &lt;a 
href=&quot;https://github.com/EsotericSoftware/kryo&quot;&gt;Kryo&lt;/a&gt;.&lt;/p&gt;
+  &lt;/li&gt;
+&lt;/ul&gt;
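The POJO rules referenced above can be illustrated with a plain Java class. This is a minimal sketch (class and field names are made up for illustration; it compiles without any Flink dependency):

```java
// A class satisfying Flink's POJO rules: public/standalone, with a public
// no-argument constructor, and every non-static, non-transient field either
// public or exposed via a public getter and setter.
class Person {
    public int id;            // public field: fine as-is

    private String name;      // private field: needs a public getter and setter

    public Person() {}        // public no-argument constructor is required

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```

A type like this would be handled by the PojoSerializer; removing the setter for `name`, for example, would demote the whole type to a Kryo-serialized generic type.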
+
+&lt;p&gt;Alternatively, you can also register &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html&quot;&gt;custom
 serializers&lt;/a&gt; for user-defined data types. This includes writing your 
own serializers or integrating other serialization systems like &lt;a 
href=&quot;https://developers.google.com/protocol-buffers/&quot;&gt;Google 
Protobuf&lt;/a&gt; or &lt;a 
href=&quot;https://thrift.apache.org/&quot;&gt;Apache Thrift&lt;/a&gt; vi [...]
+
+&lt;h2 id=&quot;pojoserializer&quot;&gt;PojoSerializer&lt;/h2&gt;
+
+&lt;p&gt;As outlined above, if your data type is not covered by a specialized 
serializer but follows the &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html#rules-for-pojo-types&quot;&gt;POJO
 Rules&lt;/a&gt;, it will be serialized with the &lt;a 
href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java&quot;&gt;PojoSerializer&lt;/a&gt;
 which [...]
+
+&lt;blockquote&gt;
+  &lt;p&gt;15:45:51,460 INFO  
org.apache.flink.api.java.typeutils.TypeExtractor             - Class … cannot 
be used as a POJO type because not all fields are valid POJO fields, and must 
be processed as GenericType. Please read the Flink documentation on “Data Types 
&amp;amp; Serialization” for details of the effect on performance.&lt;/p&gt;
+&lt;/blockquote&gt;
+
+&lt;p&gt;This means that the PojoSerializer will not be used, but instead 
Flink will fall back to Kryo for serialization (see below). We will have a more 
detailed look into a few (more) situations that can lead to unexpected Kryo 
fallbacks in the second part of this blog post series.&lt;/p&gt;
+
+&lt;h2 id=&quot;tuple-data-types&quot;&gt;Tuple Data Types&lt;/h2&gt;
+
+&lt;p&gt;Flink comes with a predefined set of tuple types which all have a 
fixed length and contain a set of strongly-typed fields of potentially 
different types. There are implementations for &lt;code&gt;Tuple0&lt;/code&gt;, 
&lt;code&gt;Tuple1&amp;lt;T0&amp;gt;&lt;/code&gt;, …, 
&lt;code&gt;Tuple25&amp;lt;T0, T1, ..., T24&amp;gt;&lt;/code&gt; and they may 
serve as easy-to-use wrappers that spare the creation of POJOs for each and 
every combination of objects you need to pass between comp [...]
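As a rough sketch of what a fixed-arity, strongly typed tuple gives you, hand-rolled here so it compiles without Flink (the real classes live in `org.apache.flink.api.java.tuple`):

```java
// Minimal stand-in for Flink's Tuple2<T0, T1>: a fixed number of fields, each
// with its own type parameter, so ad-hoc pairs can be passed between operators
// without writing a POJO for every combination.
class Pair<T0, T1> {
    public T0 f0;   // Flink's tuples likewise expose their fields as f0, f1, ...
    public T1 f1;

    Pair(T0 f0, T1 f1) {
        this.f0 = f0;
        this.f1 = f1;
    }
}
```

With the real `Tuple2`, keying a stream by the first field is then possible positionally, without reflecting over user classes.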
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+ Since &lt;code&gt;Tuple0&lt;/code&gt; does not contain any data, it uses a 
dedicated serializer 
implementation: &lt;a 
href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/Tuple0Serializer.java&quot;&gt;Tuple0Serializer&lt;/a&gt;.&lt;/p&gt;
+&lt;/div&gt;
+
+&lt;h2 id=&quot;row-data-types&quot;&gt;Row Data Types&lt;/h2&gt;
+
+&lt;p&gt;Row types are mainly used by the Table and SQL APIs of Flink. A 
&lt;code&gt;Row&lt;/code&gt; groups an arbitrary number of objects together 
similar to the tuples above. These fields are not strongly typed and may all be 
of different types. Because field types are missing, Flink’s type extraction 
cannot automatically extract type information and users of a 
&lt;code&gt;Row&lt;/code&gt; need to manually tell Flink about the row’s field 
types. The &lt;a href=&quot;https://github.com [...]
+
+&lt;p&gt;Row type information can be provided in two ways:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;you can have your source or operator implement 
&lt;code&gt;ResultTypeQueryable&amp;lt;Row&amp;gt;&lt;/code&gt;:&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span 
class=&quot;nc&quot;&gt;RowSource&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class= [...]
+  &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
+
+  &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;TypeInformation&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;getProducedType&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
+    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;ROW&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;INT&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt [...]
+  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
+&lt;span 
class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;you can provide the types when building the job graph by using 
&lt;code&gt;SingleOutputStreamOperator#returns()&lt;/code&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Row&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;sourceStream&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;
+    &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;RowSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;())&lt;/span&gt;
+        &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;returns&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;ROW&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Types&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;INT&lt;/span&gt;&lt [...]
+
+&lt;div class=&quot;alert alert-warning&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-warning&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon 
glyphicon-warning-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; 
Warning&lt;/span&gt;
+If you fail to provide the type information for a 
&lt;code&gt;Row&lt;/code&gt;, Flink identifies that 
&lt;code&gt;Row&lt;/code&gt; is not a valid POJO type according to the rules 
above and falls back to Kryo serialization (see below) which you will also see 
in the logs as:&lt;/p&gt;
+
+  &lt;p&gt;&lt;code&gt;13:10:11,148 INFO  
org.apache.flink.api.java.typeutils.TypeExtractor             - Class class 
org.apache.flink.types.Row cannot be used as a POJO type because not all fields 
are valid POJO fields, and must be processed as GenericType. Please read the 
Flink documentation on &quot;Data Types &amp;amp; Serialization&quot; for 
details of the effect on performance.&lt;/code&gt;&lt;/p&gt;
+&lt;/div&gt;
+
+&lt;h2 id=&quot;avro&quot;&gt;Avro&lt;/h2&gt;
+
+&lt;p&gt;Flink offers built-in support for the &lt;a 
href=&quot;http://avro.apache.org/&quot;&gt;Apache Avro&lt;/a&gt; serialization 
framework (currently using version 1.8.2) by adding the 
&lt;code&gt;org.apache.flink:flink-avro&lt;/code&gt; dependency into your job. 
Flink’s &lt;a 
href=&quot;https://github.com/apache/flink/blob/release-1.10.0/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSerializer.java&quot;&gt;AvroSerializer&lt;/a&gt;
 can then use A [...]
+
+&lt;h3 id=&quot;avro-specific&quot;&gt;Avro Specific&lt;/h3&gt;
+
+&lt;p&gt;Avro specific records will be automatically detected by checking that 
the given type’s type hierarchy contains the 
&lt;code&gt;SpecificRecordBase&lt;/code&gt; class. You can either specify your 
concrete Avro type, or—if you want to be more generic and allow different types 
in your operator—use the &lt;code&gt;SpecificRecordBase&lt;/code&gt; type (or a 
subtype) in your user functions, in 
&lt;code&gt;ResultTypeQueryable#getProducedType()&lt;/code&gt;, or in 
&lt;code&gt;SingleOutpu [...]
+
+&lt;div class=&quot;alert alert-warning&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-warning&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon 
glyphicon-warning-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; 
Warning&lt;/span&gt; If you specify the Flink type as 
&lt;code&gt;SpecificRecord&lt;/code&gt; and not 
&lt;code&gt;SpecificRecordBase&lt;/code&gt;, Flink will not see this as an Avro 
type. Instead, it will use Kryo to de/serialize any objects which may be 
considerably slower.&lt;/p&gt;
+&lt;/div&gt;
+
+&lt;h3 id=&quot;avro-generic&quot;&gt;Avro Generic&lt;/h3&gt;
+
+&lt;p&gt;Avro’s &lt;code&gt;GenericRecord&lt;/code&gt; types cannot, 
unfortunately, be used automatically since they require the user to &lt;a 
href=&quot;https://avro.apache.org/docs/1.8.2/gettingstartedjava.html#Serializing+and+deserializing+without+code+generation&quot;&gt;specify
 a schema&lt;/a&gt; (either manually or by retrieving it from some schema 
registry). With that schema, you can provide the right type information by 
either of the following options just like for the Row Types  [...]
+
+&lt;ul&gt;
+  &lt;li&gt;implement 
&lt;code&gt;ResultTypeQueryable&amp;lt;GenericRecord&amp;gt;&lt;/code&gt;:&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span 
class=&quot;nc&quot;&gt;AvroGenericSource&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;implements&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;SourceFunction&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;spa [...]
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;GenericRecordAvroTypeInfo&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;producedType&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+
+  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;AvroGenericSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Schema&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
+    &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;producedType&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;GenericRecordAvroTypeInfo&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;
+  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
+  
+  &lt;span class=&quot;nd&quot;&gt;@Override&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;TypeInformation&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;GenericRecord&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;getProducedType&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;()&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
+    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;producedType&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
+&lt;span 
class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+&lt;ul&gt;
+  &lt;li&gt;provide type information when building the job graph by using 
&lt;code&gt;SingleOutputStreamOperator#returns()&lt;/code&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;GenericRecord&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;sourceStream&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;
+    &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;AvroGenericSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;())&lt;/span&gt;
+        &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;returns&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;GenericRecordAvroTypeInfo&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;));&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+&lt;p&gt;Without this type information, Flink will fall back to Kryo for 
serialization which would serialize the schema into every record, over and over 
again. As a result, the serialized form will be bigger and more costly to 
create.&lt;/p&gt;
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+ Since Avro’s &lt;code&gt;Schema&lt;/code&gt; class is not serializable, it 
cannot be sent around as is. You can work around this by converting it to a 
String and parsing it back when needed. If you only do this once on 
initialization, there is practically no difference to sending it 
directly.&lt;/p&gt;
+&lt;/div&gt;
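The workaround described in the note can be sketched as follows. The parse step is passed in as a function so the sketch compiles without the Avro jar; with Avro on the classpath it would be `s -> new org.apache.avro.Schema.Parser().parse(s)`:

```java
import java.util.function.Function;

// Ship the schema as a String (Avro's Schema itself is not serializable) and
// parse it back lazily, once, on first access.
class LazySchema<S> {
    private final String schemaString;            // serializable representation
    private final Function<String, S> parser;     // e.g. Avro's Schema.Parser
    private S parsed;                             // cached after the first parse

    LazySchema(String schemaString, Function<String, S> parser) {
        this.schemaString = schemaString;
        this.parser = parser;
    }

    synchronized S get() {
        if (parsed == null) {
            parsed = parser.apply(schemaString);  // done only once
        }
        return parsed;
    }
}
```

Since the parse happens a single time on initialization, the overhead is practically the same as shipping the schema object directly.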
+
+&lt;h3 id=&quot;avro-reflect&quot;&gt;Avro Reflect&lt;/h3&gt;
+
+&lt;p&gt;The third way of using Avro is to exchange Flink’s PojoSerializer 
(for POJOs according to the rules above) for Avro’s reflection-based 
serializer. This can be enabled by calling&lt;/p&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;enableForceAvro&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;h2 id=&quot;kryo&quot;&gt;Kryo&lt;/h2&gt;
+
+&lt;p&gt;Any class or object which does not fall into the categories above and 
is not covered by a Flink-provided special serializer is de/serialized with a 
fallback to &lt;a 
href=&quot;https://github.com/EsotericSoftware/kryo&quot;&gt;Kryo&lt;/a&gt; 
(currently version 2.24.0) which is a powerful and generic serialization 
framework in Java. Flink calls such a type a &lt;em&gt;generic type&lt;/em&gt; 
and you may stumble upon &lt;code&gt;GenericTypeInfo&lt;/code&gt; when 
debugging code. If you  [...]
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;registerKryoType&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;MyCustomType&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt [...]
+&lt;p&gt;Registering types adds them to an internal map of classes to tags so 
that, during serialization, Kryo does not have to add the fully qualified class 
names as a prefix into the serialized form. Instead, Kryo uses these (integer) 
tags to identify the underlying classes and reduce serialization 
overhead.&lt;/p&gt;
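The size effect of registration can be approximated with a toy calculation. This is only the intuition behind `registerKryoType()`, not Kryo's actual wire format:

```java
import java.nio.charset.StandardCharsets;

// Toy comparison: an unregistered type's serialized form carries its fully
// qualified class name, a registered type only a small varint tag.
class TagOverhead {
    static int unregisteredPrefixBytes(Class<?> clazz) {
        return clazz.getName().getBytes(StandardCharsets.UTF_8).length;
    }

    static int registeredPrefixBytes(int tag) {
        int bytes = 1;                       // varint: 7 payload bits per byte
        while ((tag >>>= 7) != 0) {
            bytes++;
        }
        return bytes;
    }
}
```

For a typical user class with a long package name, the per-record saving of a one- or two-byte tag over a several-dozen-byte class name adds up quickly on the hot path.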
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+Flink will store Kryo serializer mappings from type registrations in its 
checkpoints and savepoints and will retain them across job (re)starts.&lt;/p&gt;
+&lt;/div&gt;
+
+&lt;h3 id=&quot;disabling-kryo&quot;&gt;Disabling Kryo&lt;/h3&gt;
+
+&lt;p&gt;If desired, you can disable the Kryo fallback, i.e. the ability to 
serialize generic types, by calling&lt;/p&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;disableGenericTypes&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;p&gt;This is mostly useful for finding out where these fallbacks are 
applied and replacing them with better serializers. If your job has any generic 
types with this configuration, it will fail with&lt;/p&gt;
+
+&lt;blockquote&gt;
+  &lt;p&gt;Exception in thread &quot;main&quot; java.lang.UnsupportedOperationException: 
Generic types have been disabled in the ExecutionConfig and type … is treated 
as a generic type.&lt;/p&gt;
+&lt;/blockquote&gt;
+
+&lt;p&gt;If you cannot immediately see from the type where it is being used, 
this log message also gives you a stacktrace that can be used to set 
breakpoints and find out more details in your IDE.&lt;/p&gt;
+
+&lt;h2 id=&quot;apache-thrift-via-kryo&quot;&gt;Apache Thrift (via 
Kryo)&lt;/h2&gt;
+
+&lt;p&gt;In addition to the variants above, Flink also allows you to &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program&quot;&gt;register
 other type serialization frameworks&lt;/a&gt; with Kryo. After adding the 
appropriate dependencies from the &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-
 [...]
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;addDefaultKryoSerializer&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;MyCustomType&lt;/span&gt;&lt;span class=&quot;o&quot; 
[...]
+
+&lt;p&gt;This only works if generic types are not disabled and 
&lt;code&gt;MyCustomType&lt;/code&gt; is a Thrift-generated data type. If the 
data type is not generated by Thrift, Flink will fail at runtime with an 
exception like this:&lt;/p&gt;
+
+&lt;blockquote&gt;
+  &lt;p&gt;java.lang.ClassCastException: class MyCustomType cannot be cast to 
class org.apache.thrift.TBase (MyCustomType and org.apache.thrift.TBase are in 
unnamed module of loader 'app')&lt;/p&gt;
+&lt;/blockquote&gt;
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+Please note that &lt;code&gt;TBaseSerializer&lt;/code&gt; can be registered as 
a default Kryo serializer as above (and as specified in &lt;a 
href=&quot;https://github.com/twitter/chill/blob/v0.7.6/chill-thrift/src/main/java/com/twitter/chill/thrift/TBaseSerializer.java&quot;&gt;its
 documentation&lt;/a&gt;) or via 
&lt;code&gt;registerTypeWithKryoSerializer&lt;/code&gt;. In practice, we found 
both ways to work. We also saw no difference between registering Thrift classes 
in addition to the [...]
+&lt;/div&gt;
+
+&lt;h2 id=&quot;protobuf-via-kryo&quot;&gt;Protobuf (via Kryo)&lt;/h2&gt;
+
+&lt;p&gt;In a way similar to Apache Thrift, &lt;a 
href=&quot;https://developers.google.com/protocol-buffers/&quot;&gt;Google 
Protobuf&lt;/a&gt; may be &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program&quot;&gt;registered
 as a custom serializer&lt;/a&gt; after adding the right dependencies 
(&lt;code&gt;com.twitter:chill-protobuf&lt;/code&gt; and 
&lt;code&gt;com.google.protobuf:proto [...]
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getConfig&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;().&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;registerTypeWithKryoSerializer&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;MyCustomType&lt;/span&gt;&lt;span class=&quot;o [...]
+&lt;p&gt;This will work as long as generic types have not been disabled (doing 
so would disable Kryo entirely). If &lt;code&gt;MyCustomType&lt;/code&gt; is not a 
Protobuf-generated class, your Flink job will fail at runtime with the 
following exception:&lt;/p&gt;
+
+&lt;blockquote&gt;
+  &lt;p&gt;java.lang.ClassCastException: class 
&lt;code&gt;MyCustomType&lt;/code&gt; cannot be cast to class 
com.google.protobuf.Message (&lt;code&gt;MyCustomType&lt;/code&gt; and 
com.google.protobuf.Message are in unnamed module of loader 'app')&lt;/p&gt;
+&lt;/blockquote&gt;
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+Please note that &lt;code&gt;ProtobufSerializer&lt;/code&gt; can be registered 
as a default Kryo serializer (as specified in &lt;a 
href=&quot;https://github.com/twitter/chill/blob/v0.7.6/chill-protobuf/src/main/java/com/twitter/chill/protobuf/ProtobufSerializer.java&quot;&gt;its
 documentation&lt;/a&gt;) or via 
&lt;code&gt;registerTypeWithKryoSerializer&lt;/code&gt; (as presented here). In 
practice, we found both ways to work. We also saw no difference between 
registering your Protobuf  [...]
+&lt;/div&gt;
+
+&lt;h1 id=&quot;state-schema-evolution&quot;&gt;State Schema 
Evolution&lt;/h1&gt;
+
+&lt;p&gt;Before taking a closer look at the performance of each of the 
serializers described above, we would like to emphasize that performance is not 
everything that counts inside a real-world Flink job. Types for storing state, 
for example, should be able to evolve their schema (add/remove/change fields) 
throughout the lifetime of the job without losing previous state. This is what 
Flink calls &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/sche
 [...]
+
+&lt;h1 id=&quot;performance-comparison&quot;&gt;Performance 
Comparison&lt;/h1&gt;
+
+&lt;p&gt;With so many options for serialization, it is actually not easy to 
make the right choice. We already saw some technical advantages and 
disadvantages of each of them outlined above. Since serializers are at the core 
of your Flink jobs and usually also sit on the hot path (per record 
invocations), let us actually take a deeper look into their performance with 
the help of the Flink benchmarks project at &lt;a 
href=&quot;https://github.com/dataArtisans/flink-benchmarks&quot;&gt;http [...]
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span 
class=&quot;nc&quot;&gt;MyPojo&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;operationNames&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;MyOperation&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;operations&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;otherId1&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;otherId2&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;otherId3&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;Object&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;someObject&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
+&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span 
class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span 
class=&quot;nc&quot;&gt;MyOperation&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
+  &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+  &lt;span class=&quot;kd&quot;&gt;protected&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;;&lt;/span&gt;
+&lt;span 
class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;p&gt;This is mapped to tuples, rows, Avro specific records, Thrift and 
Protobuf representations appropriately and sent through a simple Flink job at 
parallelism 4 where the data type is used during network communication like 
this:&lt;/p&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;setParallelism&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;PojoSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;RECORDS_PER_INVOCATION&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt [...]
+    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;rebalance&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;()&lt;/span&gt;
+    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;addSink&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;DiscardingSink&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&amp;gt;());&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+&lt;p&gt;After running this through the &lt;a 
href=&quot;http://openjdk.java.net/projects/code-tools/jmh/&quot;&gt;JMH&lt;/a&gt;
 micro-benchmarks defined in &lt;a 
href=&quot;https://github.com/dataArtisans/flink-benchmarks/blob/master/src/main/java/org/apache/flink/benchmark/full/SerializationFrameworkAllBenchmarks.java&quot;&gt;SerializationFrameworkAllBenchmarks.java&lt;/a&gt;,
 I obtained the following performance results for Flink 1.10 on my machine (in 
number of operations per milli [...]
+&lt;br /&gt;&lt;/p&gt;
+
+&lt;center&gt;
+&lt;img 
src=&quot;/img/blog/2020-04-15-flink-serialization-performance-results.svg&quot;
 width=&quot;800px&quot; alt=&quot;Flink serialization performance comparison 
results&quot; /&gt;
+&lt;/center&gt;
+&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
+
+&lt;p&gt;A few takeaways from these numbers:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;
+    &lt;p&gt;The default fallback from POJO to Kryo reduces performance by 
75%.&lt;br /&gt;
+Registering types with Kryo improves this significantly, but still yields 64% 
fewer operations than the POJO serializer.&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Avro GenericRecord and SpecificRecord are roughly serialized at 
the same speed.&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Avro Reflect serialization is even slower than Kryo default 
(-45%).&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Tuples are the fastest, closely followed by Rows. Both leverage 
fast specialized serialization code based on direct access without Java 
reflection.&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Using a (nested) Tuple instead of a POJO may speed up your job by 
42% (but is less flexible!).
+ Adding code generation to the PojoSerializer (&lt;a 
href=&quot;https://jira.apache.org/jira/browse/FLINK-3599&quot;&gt;FLINK-3599&lt;/a&gt;)
 may close that gap (or at least move it closer to the RowSerializer). If 
you feel like giving the implementation a go, please reach out to the Flink 
community and we will see whether we can make that happen.&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;If you cannot use POJOs, try to define your data type with one of 
the serialization frameworks that generate specific code for it: Protobuf, 
Avro, Thrift (in that order, performance-wise).&lt;/p&gt;
+  &lt;/li&gt;
+&lt;/ul&gt;
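As a rough sketch of the Tuple takeaway above (an illustration, not code from the benchmarks), the `MyPojo` fields could be modeled as a nested `Tuple8`, trading field names and flexibility for reflection-free, positional serialization:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple8;

// MyPojo as a Tuple8, with MyOperation as a nested Tuple2. Flink's
// TupleSerializer writes these fields by position, without Java reflection.
Tuple8<Integer, String, String[], Tuple2<Integer, String>[],
        Integer, Integer, Integer, Object> pojoAsTuple =
    Tuple8.of(1, "name", new String[] {"op"},
              new Tuple2[] {Tuple2.of(2, "opName")},
              3, 4, 5, null);
```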
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt; As with all 
benchmarks, please bear in mind that these numbers only give a hint on Flink’s 
serializer performance in a specific scenario. They may be different with your 
data types but the rough classification is probably the same. If you want to be 
sure, please verify the [...]
+&lt;/div&gt;
+
+&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
+
+&lt;p&gt;In the sections above, we looked at how Flink performs serialization 
for different sorts of data types and elaborated on the technical advantages and 
disadvantages of each. For data types used in Flink state, you probably want to 
use either POJO or Avro types, which are currently the only ones 
supporting state schema evolution out of the box and thus allow your stateful 
application to evolve over time. POJOs are usually faster at de/serialization, 
while Avro may support more flexible schema  [...]
+
+&lt;p&gt;The fastest de/serialization is achieved with Flink’s internal tuple 
and row serializers which can access these types’ fields directly without going 
via reflection. With roughly 30% decreased throughput as compared to tuples, 
Protobuf and POJO types do not perform too badly on their own and are more 
flexible and maintainable. Avro (specific and generic) records as well as 
Thrift data types further reduce performance by 20% and 30%, respectively. You 
definitely want to avoid Kryo [...]
+
+&lt;p&gt;The next article in this series will use this finding as a starting 
point to look into a few common pitfalls and obstacles of avoiding Kryo, how to 
get the most out of the PojoSerializer, and a few more tuning techniques with 
respect to serialization. Stay tuned for more.&lt;/p&gt;
+</description>
+<pubDate>Wed, 15 Apr 2020 10:00:00 +0200</pubDate>
+<link>https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html</link>
+<guid 
isPermaLink="true">/news/2020/04/15/flink-serialization-tuning-vol-1.html</guid>
+</item>
+
+<item>
 <title>PyFlink: Introducing Python Support for UDFs in Flink&#39;s Table 
API</title>
 <description>&lt;p&gt;Flink 1.9 introduced the Python Table API, allowing 
developers and data engineers to write Python Table API jobs for Table 
transformations and analysis, such as Python ETL or aggregate jobs. However, 
Python users faced some limitations when it came to support for Python UDFs in 
Flink 1.9, preventing them from extending the system’s built-in 
functionality.&lt;/p&gt;
 
@@ -17002,91 +17303,5 @@ internally, fault tolerance, and performance 
measurements!&lt;/p&gt;
 <guid isPermaLink="true">/news/2015/02/04/january-in-flink.html</guid>
 </item>
 
-<item>
-<title>Apache Flink 0.8.0 available</title>
-<description>&lt;p&gt;We are pleased to announce the availability of Flink 
0.8.0. This release includes new user-facing features as well as performance 
and bug fixes, extends the support for filesystems and introduces the Scala API 
and flexible windowing semantics for Flink Streaming. A total of 33 people have 
contributed to this release, a big thanks to all of them!&lt;/p&gt;
-
-&lt;p&gt;&lt;a 
href=&quot;http://www.apache.org/dyn/closer.cgi/flink/flink-0.8.0/flink-0.8.0-bin-hadoop2.tgz&quot;&gt;Download
 Flink 0.8.0&lt;/a&gt;&lt;/p&gt;
-
-&lt;p&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;amp;version=12328699&quot;&gt;See
 the release changelog&lt;/a&gt;&lt;/p&gt;
-
-&lt;h2 id=&quot;overview-of-major-new-features&quot;&gt;Overview of major new 
features&lt;/h2&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;
-    &lt;p&gt;&lt;strong&gt;Extended filesystem support&lt;/strong&gt;: The 
former &lt;code&gt;DistributedFileSystem&lt;/code&gt; interface has been 
generalized to &lt;code&gt;HadoopFileSystem&lt;/code&gt; now supporting all sub 
classes of &lt;code&gt;org.apache.hadoop.fs.FileSystem&lt;/code&gt;. This 
allows users to use all file systems supported by Hadoop with Apache Flink.
-&lt;a 
href=&quot;http://ci.apache.org/projects/flink/flink-docs-release-0.8/example_connectors.html&quot;&gt;See
 connecting to other systems&lt;/a&gt;&lt;/p&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;&lt;strong&gt;Streaming Scala API&lt;/strong&gt;: As an 
alternative to the existing Java API Streaming is now also programmable in 
Scala. The Java and Scala APIs have now the same syntax and transformations and 
will be kept from now on in sync in every future release.&lt;/p&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;&lt;strong&gt;Streaming windowing semantics&lt;/strong&gt;: The 
new windowing api offers an expressive way to define custom logic for 
triggering the execution of a stream window and removing elements. The new 
features include out-of-the-box support for windows based in logical or 
physical time and data-driven properties on the events themselves among others. 
&lt;a 
href=&quot;http://ci.apache.org/projects/flink/flink-docs-release-0.8/streaming_guide.html#window-operators&quot
 [...]
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;&lt;strong&gt;Mutable and immutable objects in 
runtime&lt;/strong&gt; All Flink versions before 0.8.0 were always passing the 
same objects to functions written by users. This is a common performance 
optimization, also used in other systems such as Hadoop.
- However, this is error-prone for new users because one has to carefully check 
that references to the object aren’t kept in the user function. Starting from 
0.8.0, Flink allows to configure a mode which is disabling that 
mechanism.&lt;/p&gt;
-  &lt;/li&gt;
-  &lt;li&gt;&lt;strong&gt;Performance and usability 
improvements&lt;/strong&gt;: The new Apache Flink 0.8.0 release brings several 
new features which will significantly improve the performance and the usability 
of the system. Amongst others, these features include:
-    &lt;ul&gt;
-      &lt;li&gt;Improved input split assignment which maximizes computation 
locality&lt;/li&gt;
-      &lt;li&gt;Smart broadcasting mechanism which minimizes network 
I/O&lt;/li&gt;
-      &lt;li&gt;Custom partitioners which let the user control how the data is 
partitioned within the cluster. This helps to prevent data skewness and allows 
to implement highly efficient algorithms.&lt;/li&gt;
-      &lt;li&gt;coGroup operator now supports group sorting for its 
inputs&lt;/li&gt;
-    &lt;/ul&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;&lt;strong&gt;Kryo is the new fallback serializer&lt;/strong&gt;: 
Apache Flink has a sophisticated type analysis and serialization framework that 
is able to handle commonly used types very efficiently.
- In addition to that, there is a fallback serializer for types which are not 
supported. Older versions of Flink used the reflective &lt;a 
href=&quot;http://avro.apache.org/&quot;&gt;Avro&lt;/a&gt; serializer for that 
purpose. With this release, Flink is using the powerful &lt;a 
href=&quot;https://github.com/EsotericSoftware/kryo&quot;&gt;Kryo&lt;/a&gt; and 
twitter-chill library for support of types such as Java Collections and Scala 
specifc types.&lt;/p&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;&lt;strong&gt;Hadoop 2.2.0+ is now the default Hadoop 
dependency&lt;/strong&gt;: With Flink 0.8.0 we made the “hadoop2” build profile 
the default build for Flink. This means that all users using Hadoop 1 (0.2X or 
1.2.X versions) have to specify  version “0.8.0-hadoop1” in their pom 
files.&lt;/p&gt;
-  &lt;/li&gt;
-  &lt;li&gt;&lt;strong&gt;HBase module updated&lt;/strong&gt; The HBase 
version has been updated to 0.98.6.1. Also, Hbase is now available to the 
Hadoop1 and Hadoop2 profile of Flink.&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2 id=&quot;contributors&quot;&gt;Contributors&lt;/h2&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;Marton Balassi&lt;/li&gt;
-  &lt;li&gt;Daniel Bali&lt;/li&gt;
-  &lt;li&gt;Carsten Brandt&lt;/li&gt;
-  &lt;li&gt;Moritz Borgmann&lt;/li&gt;
-  &lt;li&gt;Stefan Bunk&lt;/li&gt;
-  &lt;li&gt;Paris Carbone&lt;/li&gt;
-  &lt;li&gt;Ufuk Celebi&lt;/li&gt;
-  &lt;li&gt;Nils Engelbach&lt;/li&gt;
-  &lt;li&gt;Stephan Ewen&lt;/li&gt;
-  &lt;li&gt;Gyula Fora&lt;/li&gt;
-  &lt;li&gt;Gabor Hermann&lt;/li&gt;
-  &lt;li&gt;Fabian Hueske&lt;/li&gt;
-  &lt;li&gt;Vasiliki Kalavri&lt;/li&gt;
-  &lt;li&gt;Johannes Kirschnick&lt;/li&gt;
-  &lt;li&gt;Aljoscha Krettek&lt;/li&gt;
-  &lt;li&gt;Suneel Marthi&lt;/li&gt;
-  &lt;li&gt;Robert Metzger&lt;/li&gt;
-  &lt;li&gt;Felix Neutatz&lt;/li&gt;
-  &lt;li&gt;Chiwan Park&lt;/li&gt;
-  &lt;li&gt;Flavio Pompermaier&lt;/li&gt;
-  &lt;li&gt;Mingliang Qi&lt;/li&gt;
-  &lt;li&gt;Shiva Teja Reddy&lt;/li&gt;
-  &lt;li&gt;Till Rohrmann&lt;/li&gt;
-  &lt;li&gt;Henry Saputra&lt;/li&gt;
-  &lt;li&gt;Kousuke Saruta&lt;/li&gt;
-  &lt;li&gt;Chesney Schepler&lt;/li&gt;
-  &lt;li&gt;Erich Schubert&lt;/li&gt;
-  &lt;li&gt;Peter Szabo&lt;/li&gt;
-  &lt;li&gt;Jonas Traub&lt;/li&gt;
-  &lt;li&gt;Kostas Tzoumas&lt;/li&gt;
-  &lt;li&gt;Timo Walther&lt;/li&gt;
-  &lt;li&gt;Daniel Warneke&lt;/li&gt;
-  &lt;li&gt;Chen Xu&lt;/li&gt;
-&lt;/ul&gt;
-</description>
-<pubDate>Wed, 21 Jan 2015 11:00:00 +0100</pubDate>
-<link>https://flink.apache.org/news/2015/01/21/release-0.8.html</link>
-<guid isPermaLink="true">/news/2015/01/21/release-0.8.html</guid>
-</item>
-
 </channel>
 </rss>
diff --git a/content/blog/index.html b/content/blog/index.html
index 586a499..aa24919 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -195,6 +195,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></h2>
+
+      <p>15 Apr 2020
+       Nico Kruber </p>
+
+      <p>Serialization is a crucial element of your Flink job. This article is 
the first in a series of posts highlighting Flink’s serialization stack; it 
looks at the different ways Flink can serialize your data types.</p>
+
+      <p><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: Introducing Python 
Support for UDFs in Flink's Table API</a></h2>
 
       <p>09 Apr 2020
@@ -318,21 +331,6 @@ This release marks a big milestone: Stateful Functions 2.0 
is not only an API up
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2020/01/30/release-1.9.2.html">Apache Flink 1.9.2 Released</a></h2>
-
-      <p>30 Jan 2020
-       Hequn Cheng (<a href="https://twitter.com/HequnC";>@HequnC</a>)</p>
-
-      <p><p>The Apache Flink community released the second bugfix version of 
the Apache Flink 1.9 series.</p>
-
-</p>
-
-      <p><a href="/news/2020/01/30/release-1.9.2.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -365,6 +363,16 @@ This release marks a big milestone: Stateful Functions 2.0 
is not only an API up
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html
index d08d083..21caa67 100644
--- a/content/blog/page10/index.html
+++ b/content/blog/page10/index.html
@@ -195,6 +195,24 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2015/08/24/introducing-flink-gelly.html">Introducing Gelly: Graph 
Processing with Apache Flink</a></h2>
+
+      <p>24 Aug 2015
+      </p>
+
+      <p><p>This blog post introduces <strong>Gelly</strong>, Apache Flink’s 
<em>graph-processing API and library</em>. Flink’s native support
+for iterations makes it a suitable platform for large-scale graph analytics.
+By leveraging delta iterations, Gelly is able to map various graph processing 
models such as
+vertex-centric or gather-sum-apply to Flink dataflows.</p>
+
+</p>
+
+      <p><a href="/news/2015/08/24/introducing-flink-gelly.html">Continue 
reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2015/06/24/announcing-apache-flink-0.9.0-release.html">Announcing 
Apache Flink 0.9.0</a></h2>
 
       <p>24 Jun 2015
@@ -336,21 +354,6 @@ and offers a new API including definition of flexible 
windows.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2015/01/21/release-0.8.html">Apache Flink 0.8.0 available</a></h2>
-
-      <p>21 Jan 2015
-      </p>
-
-      <p><p>We are pleased to announce the availability of Flink 0.8.0. This 
release includes new user-facing features as well as performance and bug fixes, 
extends the support for filesystems and introduces the Scala API and flexible 
windowing semantics for Flink Streaming. A total of 33 people have contributed 
to this release, a big thanks to all of them!</p>
-
-</p>
-
-      <p><a href="/news/2015/01/21/release-0.8.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -383,6 +386,16 @@ and offers a new API including definition of flexible 
windows.</p>
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html
index ff672cd..ed7f3aa 100644
--- a/content/blog/page11/index.html
+++ b/content/blog/page11/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2015/01/21/release-0.8.html">Apache Flink 0.8.0 available</a></h2>
+
+      <p>21 Jan 2015
+      </p>
+
+      <p><p>We are pleased to announce the availability of Flink 0.8.0. This 
release includes new user-facing features as well as performance and bug fixes, 
extends the support for filesystems and introduces the Scala API and flexible 
windowing semantics for Flink Streaming. A total of 33 people have contributed 
to this release, a big thanks to all of them!</p>
+
+</p>
+
+      <p><a href="/news/2015/01/21/release-0.8.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2015/01/06/december-in-flink.html">December 2014 in the Flink 
community</a></h2>
 
       <p>06 Jan 2015
@@ -319,6 +334,16 @@ academic and open source project that Flink originates 
from.</p>
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html
index 79dd8aa..42535b0 100644
--- a/content/blog/page2/index.html
+++ b/content/blog/page2/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2020/01/30/release-1.9.2.html">Apache Flink 1.9.2 Released</a></h2>
+
+      <p>30 Jan 2020
+       Hequn Cheng (<a href="https://twitter.com/HequnC";>@HequnC</a>)</p>
+
+      <p><p>The Apache Flink community released the second bugfix version of 
the Apache Flink 1.9 series.</p>
+
+</p>
+
+      <p><a href="/news/2020/01/30/release-1.9.2.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html">State
 Unlocked: Interacting with State in Apache Flink</a></h2>
 
       <p>29 Jan 2020
@@ -317,22 +332,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2019/08/22/release-1.9.0.html">Apache Flink 1.9.0 Release 
Announcement</a></h2>
-
-      <p>22 Aug 2019
-      </p>
-
-      <p><p>The Apache Flink community is proud to announce the release of 
Apache Flink
-1.9.0.</p>
-
-</p>
-
-      <p><a href="/news/2019/08/22/release-1.9.0.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -365,6 +364,16 @@
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html
index da9d3fc..9abb37e 100644
--- a/content/blog/page3/index.html
+++ b/content/blog/page3/index.html
@@ -195,6 +195,22 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2019/08/22/release-1.9.0.html">Apache Flink 1.9.0 Release 
Announcement</a></h2>
+
+      <p>22 Aug 2019
+      </p>
+
+      <p><p>The Apache Flink community is proud to announce the release of 
Apache Flink
+1.9.0.</p>
+
+</p>
+
+      <p><a href="/news/2019/08/22/release-1.9.0.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/2019/07/23/flink-network-stack-2.html">Flink Network Stack Vol. 2: 
Monitoring, Metrics, and that Backpressure Thing</a></h2>
 
       <p>23 Jul 2019
@@ -321,19 +337,6 @@ for more details.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/features/2019/03/11/prometheus-monitoring.html">Flink and Prometheus: 
Cloud-native monitoring of streaming applications</a></h2>
-
-      <p>11 Mar 2019
-       Maximilian Bode, TNG Technology Consulting (<a 
href="https://twitter.com/mxpbode";>@mxpbode</a>)</p>
-
-      <p>This blog post describes how developers can leverage Apache Flink's 
built-in metrics system together with Prometheus to observe and monitor 
streaming applications in an effective way.</p>
-
-      <p><a href="/features/2019/03/11/prometheus-monitoring.html">Continue 
reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -366,6 +369,16 @@ for more details.</p>
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html
index 4663315..4e0e20d 100644
--- a/content/blog/page4/index.html
+++ b/content/blog/page4/index.html
@@ -195,6 +195,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/features/2019/03/11/prometheus-monitoring.html">Flink and Prometheus: 
Cloud-native monitoring of streaming applications</a></h2>
+
+      <p>11 Mar 2019
+       Maximilian Bode, TNG Technology Consulting (<a 
href="https://twitter.com/mxpbode";>@mxpbode</a>)</p>
+
+      <p>This blog post describes how developers can leverage Apache Flink's 
built-in metrics system together with Prometheus to observe and monitor 
streaming applications in an effective way.</p>
+
+      <p><a href="/features/2019/03/11/prometheus-monitoring.html">Continue 
reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2019/03/06/ffsf-preview.html">What 
to expect from Flink Forward San Francisco 2019</a></h2>
 
       <p>06 Mar 2019
@@ -325,21 +338,6 @@ Please check the <a 
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2018/10/29/release-1.6.2.html">Apache Flink 1.6.2 Released</a></h2>
-
-      <p>29 Oct 2018
-      </p>
-
-      <p><p>The Apache Flink community released the second bugfix version of 
the Apache Flink 1.6 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/10/29/release-1.6.2.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -372,6 +370,16 @@ Please check the <a 
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html
index 4d43c0f..fb48d96 100644
--- a/content/blog/page5/index.html
+++ b/content/blog/page5/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2018/10/29/release-1.6.2.html">Apache Flink 1.6.2 Released</a></h2>
+
+      <p>29 Oct 2018
+      </p>
+
+      <p><p>The Apache Flink community released the second bugfix version of 
the Apache Flink 1.6 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/10/29/release-1.6.2.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2018/10/29/release-1.5.5.html">Apache Flink 1.5.5 Released</a></h2>
 
       <p>29 Oct 2018
@@ -329,21 +344,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2018/03/08/release-1.4.2.html">Apache Flink 1.4.2 Released</a></h2>
-
-      <p>08 Mar 2018
-      </p>
-
-      <p><p>The Apache Flink community released the second bugfix version of 
the Apache Flink 1.4 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/03/08/release-1.4.2.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -376,6 +376,16 @@
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html
index e812215..0a294fe 100644
--- a/content/blog/page6/index.html
+++ b/content/blog/page6/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2018/03/08/release-1.4.2.html">Apache Flink 1.4.2 Released</a></h2>
+
+      <p>08 Mar 2018
+      </p>
+
+      <p><p>The Apache Flink community released the second bugfix version of 
the Apache Flink 1.4 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/03/08/release-1.4.2.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/features/2018/03/01/end-to-end-exactly-once-apache-flink.html">An 
Overview of End-to-End Exactly-Once Processing in Apache Flink (with Apache 
Kafka, too!)</a></h2>
 
       <p>01 Mar 2018
@@ -326,21 +341,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what 
the Flink community
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2017/06/01/release-1.3.0.html">Apache Flink 1.3.0 Release 
Announcement</a></h2>
-
-      <p>01 Jun 2017 by Robert Metzger (<a 
href="https://twitter.com/";>@rmetzger_</a>)
-      </p>
-
-      <p><p>The Apache Flink community is pleased to announce the 1.3.0 
release. Over the past 4 months, the Flink community has been working hard to 
resolve more than 680 issues. See the <a 
href="/blog/release_1.3.0-changelog.html">complete changelog</a> for more 
detail.</p>
-
-</p>
-
-      <p><a href="/news/2017/06/01/release-1.3.0.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -373,6 +373,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what 
the Flink community
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html
index 8b31cba..8f00f06 100644
--- a/content/blog/page7/index.html
+++ b/content/blog/page7/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2017/06/01/release-1.3.0.html">Apache Flink 1.3.0 Release 
Announcement</a></h2>
+
+      <p>01 Jun 2017 by Robert Metzger (<a 
href="https://twitter.com/";>@rmetzger_</a>)
+      </p>
+
+      <p><p>The Apache Flink community is pleased to announce the 1.3.0 
release. Over the past 4 months, the Flink community has been working hard to 
resolve more than 680 issues. See the <a 
href="/blog/release_1.3.0-changelog.html">complete changelog</a> for more 
detail.</p>
+
+</p>
+
+      <p><a href="/news/2017/06/01/release-1.3.0.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2017/05/16/official-docker-image.html">Introducing Docker Images 
for Apache Flink</a></h2>
 
       <p>16 May 2017 by Patrick Lucas (Data Artisans) and Ismaël Mejía 
(Talend) (<a href="https://twitter.com/";>@iemejia</a>)
@@ -322,21 +337,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2016/09/05/release-1.1.2.html">Apache Flink 1.1.2 Released</a></h2>
-
-      <p>05 Sep 2016
-      </p>
-
-      <p><p>The Apache Flink community released another bugfix version of the 
Apache Flink 1.1. series.</p>
-
-</p>
-
-      <p><a href="/news/2016/09/05/release-1.1.2.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -369,6 +369,16 @@
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html
index e5e719a..251627f 100644
--- a/content/blog/page8/index.html
+++ b/content/blog/page8/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2016/09/05/release-1.1.2.html">Apache Flink 1.1.2 Released</a></h2>
+
+      <p>05 Sep 2016
+      </p>
+
+      <p><p>The Apache Flink community released another bugfix version of the Apache Flink 1.1 series.</p>
+
+</p>
+
+      <p><a href="/news/2016/09/05/release-1.1.2.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2016/08/24/ff16-keynotes-panels.html">Flink Forward 2016: 
Announcing Schedule, Keynotes, and Panel Discussion</a></h2>
 
       <p>24 Aug 2016
@@ -326,21 +341,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2016/03/08/release-1.0.0.html">Announcing Apache Flink 
1.0.0</a></h2>
-
-      <p>08 Mar 2016
-      </p>
-
-      <p><p>The Apache Flink community is pleased to announce the availability 
of the 1.0.0 release. The community put significant effort into improving and 
extending Apache Flink since the last release, focusing on improving the 
experience of writing and executing data stream processing pipelines in 
production.</p>
-
-</p>
-
-      <p><a href="/news/2016/03/08/release-1.0.0.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -373,6 +373,16 @@
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html
index 4b348a7..7029cab 100644
--- a/content/blog/page9/index.html
+++ b/content/blog/page9/index.html
@@ -195,6 +195,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2016/03/08/release-1.0.0.html">Announcing Apache Flink 
1.0.0</a></h2>
+
+      <p>08 Mar 2016
+      </p>
+
+      <p><p>The Apache Flink community is pleased to announce the availability 
of the 1.0.0 release. The community put significant effort into improving and 
extending Apache Flink since the last release, focusing on improving the 
experience of writing and executing data stream processing pipelines in 
production.</p>
+
+</p>
+
+      <p><a href="/news/2016/03/08/release-1.0.0.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2016/02/11/release-0.10.2.html">Flink 0.10.2 Released</a></h2>
 
       <p>11 Feb 2016
@@ -327,24 +342,6 @@ Apache Flink started.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2015/08/24/introducing-flink-gelly.html">Introducing Gelly: Graph 
Processing with Apache Flink</a></h2>
-
-      <p>24 Aug 2015
-      </p>
-
-      <p><p>This blog post introduces <strong>Gelly</strong>, Apache Flink’s 
<em>graph-processing API and library</em>. Flink’s native support
-for iterations makes it a suitable platform for large-scale graph analytics.
-By leveraging delta iterations, Gelly is able to map various graph processing 
models such as
-vertex-centric or gather-sum-apply to Flink dataflows.</p>
-
-</p>
-
-      <p><a href="/news/2015/08/24/introducing-flink-gelly.html">Continue 
reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -377,6 +374,16 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p>
 
     <ul id="markdown-toc">
       
+      <li><a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></li>
 
       
diff --git 
a/content/img/blog/2020-04-15-flink-serialization-performance-results.svg 
b/content/img/blog/2020-04-15-flink-serialization-performance-results.svg
new file mode 100644
index 0000000..314ae7d
--- /dev/null
+++ b/content/img/blog/2020-04-15-flink-serialization-performance-results.svg
@@ -0,0 +1 @@
+<svg version="1.1" viewBox="0.0 0.0 1020.0 587.0" fill="none" stroke="none" 
stroke-linecap="square" stroke-miterlimit="10" width="1020" height="587" 
xmlns:xlink="http://www.w3.org/1999/xlink"; 
xmlns="http://www.w3.org/2000/svg";><path fill="#ffffff" d="M0 0L1020.0 0L1020.0 
587.0L0 587.0L0 0Z" fill-rule="nonzero"/><path stroke="#666666" 
stroke-width="1.0" stroke-linecap="butt" d="M379.5 72.5L379.5 500.5" 
fill-rule="nonzero"/><path stroke="#666666" stroke-width="1.0" 
stroke-linecap="butt" d= [...]
\ No newline at end of file
diff --git a/content/index.html b/content/index.html
index 94571bf..a374861 100644
--- a/content/index.html
+++ b/content/index.html
@@ -567,6 +567,9 @@
 
   <dl>
       
+        <dt> <a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></dt>
+        <dd>Serialization is a crucial element of your Flink job. This article is the first in a series of posts highlighting Flink’s serialization stack, and looks at the different ways Flink can serialize your data types.</dd>
+      
         <dt> <a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></dt>
         <dd>Flink 1.10 extends its support for Python by adding Python UDFs in 
PyFlink. This post explains how UDFs work in PyFlink and gives some practical 
examples of how to use UDFs in PyFlink.</dd>
       
@@ -583,9 +586,6 @@ This release marks a big milestone: Stateful Functions 2.0 
is not only an API up
         <dd><p>In this blog post, you will learn our motivation behind the 
Flink-Hive integration, and how Flink 1.10 can help modernize your data 
warehouse.</p>
 
 </dd>
-      
-        <dt> <a href="/news/2020/03/24/demo-fraud-detection-2.html">Advanced 
Flink Application Patterns Vol.2: Dynamic Updates of Application Logic</a></dt>
-        <dd>In this series of blog posts you will learn about powerful Flink 
patterns for building streaming applications.</dd>
     
   </dl>
 
diff --git a/content/news/2020/04/15/flink-serialization-tuning-vol-1.html 
b/content/news/2020/04/15/flink-serialization-tuning-vol-1.html
new file mode 100644
index 0000000..92e3e36
--- /dev/null
+++ b/content/news/2020/04/15/flink-serialization-tuning-vol-1.html
@@ -0,0 +1,555 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <title>Apache Flink: Flink Serialization Tuning Vol. 1: Choosing your 
Serializer — if you can</title>
+    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
+    <link rel="icon" href="/favicon.ico" type="image/x-icon">
+
+    <!-- Bootstrap -->
+    <link rel="stylesheet" 
href="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/css/bootstrap.min.css";>
+    <link rel="stylesheet" href="/css/flink.css">
+    <link rel="stylesheet" href="/css/syntax.css">
+
+    <!-- Blog RSS feed -->
+    <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" 
title="Apache Flink Blog: RSS feed" />
+
+    <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+    <!-- We need to load Jquery in the header for custom google analytics 
event tracking-->
+    <script src="/js/jquery.min.js"></script>
+
+    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media 
queries -->
+    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+    <!--[if lt IE 9]>
+      <script 
src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js";></script>
+      <script 
src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js";></script>
+    <![endif]-->
+  </head>
+  <body>  
+    
+
+    <!-- Main content. -->
+    <div class="container">
+    <div class="row">
+
+      
+     <div id="sidebar" class="col-sm-3">
+        
+
+<!-- Top navbar. -->
+    <nav class="navbar navbar-default">
+        <!-- The logo. -->
+        <div class="navbar-header">
+          <button type="button" class="navbar-toggle collapsed" 
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </button>
+          <div class="navbar-logo">
+            <a href="/">
+              <img alt="Apache Flink" src="/img/flink-header-logo.svg" 
width="147px" height="73px">
+            </a>
+          </div>
+        </div><!-- /.navbar-header -->
+
+        <!-- The navigation links. -->
+        <div class="collapse navbar-collapse" 
id="bs-example-navbar-collapse-1">
+          <ul class="nav navbar-nav navbar-main">
+
+            <!-- First menu section explains visitors what Flink is -->
+
+            <!-- What is Stream Processing? -->
+            <!--
+            <li><a href="/streamprocessing1.html">What is Stream 
Processing?</a></li>
+            -->
+
+            <!-- What is Flink? -->
+            <li><a href="/flink-architecture.html">What is Apache 
Flink?</a></li>
+
+            
+            <ul class="nav navbar-nav navbar-subnav">
+              <li >
+                  <a href="/flink-architecture.html">Architecture</a>
+              </li>
+              <li >
+                  <a href="/flink-applications.html">Applications</a>
+              </li>
+              <li >
+                  <a href="/flink-operations.html">Operations</a>
+              </li>
+            </ul>
+            
+
+            <!-- What is Stateful Functions? -->
+
+            <li><a href="/stateful-functions.html">What is Stateful 
Functions?</a></li>
+
+            <!-- Use cases -->
+            <li><a href="/usecases.html">Use Cases</a></li>
+
+            <!-- Powered by -->
+            <li><a href="/poweredby.html">Powered By</a></li>
+
+
+            &nbsp;
+            <!-- Second menu section aims to support Flink users -->
+
+            <!-- Downloads -->
+            <li><a href="/downloads.html">Downloads</a></li>
+
+            <!-- Getting Started -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" 
href="#">Getting Started<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/getting-started/index.html";
 target="_blank">With Flink <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/getting-started/project-setup.html";
 target="_blank">With Flink Stateful Functions <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+              </ul>
+            </li>
+
+            <!-- Documentation -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" 
href="#">Documentation<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10"; 
target="_blank">Flink 1.10 (Latest stable release) <small><span 
class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-master"; 
target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0"; 
target="_blank">Flink Stateful Functions 2.0 (Latest stable release) 
<small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-master"; 
target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span 
class="glyphicon glyphicon-new-window"></span></small></a></li>
+              </ul>
+            </li>
+
+            <!-- getting help -->
+            <li><a href="/gettinghelp.html">Getting Help</a></li>
+
+            <!-- Blog -->
+            <li class="active"><a href="/blog/"><b>Flink Blog</b></a></li>
+
+
+            <!-- Flink-packages -->
+            <li>
+              <a href="https://flink-packages.org"; 
target="_blank">flink-packages.org <small><span class="glyphicon 
glyphicon-new-window"></span></small></a>
+            </li>
+            &nbsp;
+
+            <!-- Third menu section aim to support community and contributors 
-->
+
+            <!-- Community -->
+            <li><a href="/community.html">Community &amp; Project Info</a></li>
+
+            <!-- Roadmap -->
+            <li><a href="/roadmap.html">Roadmap</a></li>
+
+            <!-- Contribute -->
+            <li><a href="/contributing/how-to-contribute.html">How to 
Contribute</a></li>
+            
+
+            <!-- GitHub -->
+            <li>
+              <a href="https://github.com/apache/flink"; target="_blank">Flink 
on GitHub <small><span class="glyphicon 
glyphicon-new-window"></span></small></a>
+            </li>
+
+            &nbsp;
+
+            <!-- Language Switcher -->
+            <li>
+              
+                
+                  <!-- link to the Chinese home page when current is blog page 
-->
+                  <a href="/zh">中文版</a>
+                
+              
+            </li>
+
+          </ul>
+
+          <ul class="nav navbar-nav navbar-bottom">
+          <hr />
+
+            <!-- Twitter -->
+            <li><a href="https://twitter.com/apacheflink"; 
target="_blank">@ApacheFlink <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <!-- Visualizer -->
+            <li class=" hidden-md hidden-sm"><a href="/visualizer/" 
target="_blank">Plan Visualizer <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+          <hr />
+
+            <li><a href="https://apache.org"; target="_blank">Apache Software 
Foundation <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <li>
+              <style>
+                .smalllinks:link {
+                  display: inline-block !important; background: none; 
padding-top: 0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
+                }
+              </style>
+
+              <a class="smalllinks" href="https://www.apache.org/licenses/"; 
target="_blank">License</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" href="https://www.apache.org/security/"; 
target="_blank">Security</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" 
href="https://www.apache.org/foundation/sponsorship.html"; 
target="_blank">Donate</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" 
href="https://www.apache.org/foundation/thanks.html"; target="_blank">Thanks</a> 
<small><span class="glyphicon glyphicon-new-window"></span></small>
+            </li>
+
+          </ul>
+        </div><!-- /.navbar-collapse -->
+    </nav>
+
+      </div>
+      <div class="col-sm-9">
+      <div class="row-fluid">
+  <div class="col-sm-12">
+    <div class="row">
+      <h1>Flink Serialization Tuning Vol. 1: Choosing your Serializer — if you 
can</h1>
+      <p><i></i></p>
+
+      <article>
+        <p>15 Apr 2020 Nico Kruber </p>
+
+<p>Almost every Flink job has to exchange data between its operators and since 
these records may not only be sent to another instance in the same JVM but 
instead to a separate process, records need to be serialized to bytes first. 
Similarly, Flink’s off-heap state-backend is based on a local embedded RocksDB 
instance which is implemented in native C++ code and thus also needs 
transformation into bytes on every state access. Wire and state serialization 
alone can easily cost a lot of your [...]
+
+<p>Since serialization is so crucial to your Flink job, we would like to highlight Flink’s serialization stack in a series of blog posts, starting with a look at the different ways Flink can serialize your data types.</p>
+
+<div class="page-toc">
+<ul id="markdown-toc">
+  <li><a href="#recap-flink-serialization" 
id="markdown-toc-recap-flink-serialization">Recap: Flink Serialization</a></li>
+  <li><a href="#choice-of-serializer" 
id="markdown-toc-choice-of-serializer">Choice of Serializer</a>    <ul>
+      <li><a href="#pojoserializer" 
id="markdown-toc-pojoserializer">PojoSerializer</a></li>
+      <li><a href="#tuple-data-types" id="markdown-toc-tuple-data-types">Tuple 
Data Types</a></li>
+      <li><a href="#row-data-types" id="markdown-toc-row-data-types">Row Data 
Types</a></li>
+      <li><a href="#avro" id="markdown-toc-avro">Avro</a>        <ul>
+          <li><a href="#avro-specific" id="markdown-toc-avro-specific">Avro 
Specific</a></li>
+          <li><a href="#avro-generic" id="markdown-toc-avro-generic">Avro 
Generic</a></li>
+          <li><a href="#avro-reflect" id="markdown-toc-avro-reflect">Avro 
Reflect</a></li>
+        </ul>
+      </li>
+      <li><a href="#kryo" id="markdown-toc-kryo">Kryo</a>        <ul>
+          <li><a href="#disabling-kryo" 
id="markdown-toc-disabling-kryo">Disabling Kryo</a></li>
+        </ul>
+      </li>
+      <li><a href="#apache-thrift-via-kryo" 
id="markdown-toc-apache-thrift-via-kryo">Apache Thrift (via Kryo)</a></li>
+      <li><a href="#protobuf-via-kryo" 
id="markdown-toc-protobuf-via-kryo">Protobuf (via Kryo)</a></li>
+    </ul>
+  </li>
+  <li><a href="#state-schema-evolution" 
id="markdown-toc-state-schema-evolution">State Schema Evolution</a></li>
+  <li><a href="#performance-comparison" 
id="markdown-toc-performance-comparison">Performance Comparison</a></li>
+  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
+</ul>
+
+</div>
+
+<h1 id="recap-flink-serialization">Recap: Flink Serialization</h1>
+
+<p>Flink handles <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html";>data
 types and serialization</a> with its own type descriptors, generic type 
extraction, and type serialization framework. We recommend reading through the 
<a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html";>documentation</a>
 first in order to be able to follow the arguments we present below. In 
essence, Flink tries to infer  [...]
+<code>stream.keyBy("ruleId")</code> or 
+<code>dataSet.join(another).where("name").equalTo("personName")</code>. It 
also allows optimizations in the serialization format as well as reducing 
unnecessary de/serializations (mainly in certain Batch operations as well as in 
the SQL/Table APIs).</p>
+
+<h1 id="choice-of-serializer">Choice of Serializer</h1>
+
+<p>Apache Flink’s out-of-the-box serialization can be roughly divided into the 
following groups:</p>
+
+<ul>
+  <li>
+    <p><strong>Flink-provided special serializers</strong> for basic types 
(Java primitives and their boxed form), arrays, composite types (tuples, Scala 
case classes, Rows), and a few auxiliary types (Option, Either, Lists, Maps, 
…),</p>
+  </li>
+  <li>
+    <p><strong>POJOs</strong>; a public, standalone class with a public no-argument constructor, where all non-static, non-transient fields in the class hierarchy are either public or accessible through public getter and setter methods; see <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html#rules-for-pojo-types";>POJO Rules</a>,</p>
+  </li>
+  <li>
+    <p><strong>Generic types</strong>; user-defined data types that are not 
recognized as a POJO and then serialized via <a 
href="https://github.com/EsotericSoftware/kryo";>Kryo</a>.</p>
+  </li>
+</ul>
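To make the POJO rules above more concrete, here is a minimal sketch of a class that Flink’s type extraction would accept as a POJO (the class and field names are made up for illustration):

```java
// Minimal sketch of a Flink-compatible POJO (hypothetical class):
// a public standalone class with a public no-argument constructor,
// where every non-static, non-transient field is either public or
// exposed via a public getter/setter pair.
public class Person {

    public int id;        // public field: directly accessible

    private String name;  // private field: needs public getter and setter

    public Person() {}    // required public no-argument constructor

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```

If any of these rules is violated, for example because the no-argument constructor is missing, Flink will not treat the class as a POJO and will fall back to Kryo instead.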
+
+<p>Alternatively, you can also register <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html";>custom
 serializers</a> for user-defined data types. This includes writing your own 
serializers or integrating other serialization systems like <a 
href="https://developers.google.com/protocol-buffers/";>Google Protobuf</a> or 
<a href="https://thrift.apache.org/";>Apache Thrift</a> via <a 
href="https://github.com/EsotericSoftware/kryo";>Kryo</a>. Overall,  [...]
+
+<h2 id="pojoserializer">PojoSerializer</h2>
+
+<p>As outlined above, if your data type is not covered by a specialized 
serializer but follows the <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/types_serialization.html#rules-for-pojo-types";>POJO
 Rules</a>, it will be serialized with the <a 
href="https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/PojoSerializer.java";>PojoSerializer</a>
 which uses Java reflection to access an object’s fields [...]
+
+<blockquote>
+  <p>15:45:51,460 INFO  org.apache.flink.api.java.typeutils.TypeExtractor      
       - Class … cannot be used as a POJO type because not all fields are valid 
POJO fields, and must be processed as GenericType. Please read the Flink 
documentation on “Data Types &amp; Serialization” for details of the effect on 
performance.</p>
+</blockquote>
+
+<p>This means that the PojoSerializer will not be used; instead, Flink will fall back to Kryo for serialization (see below). We will take a more detailed look at a few (more) situations that can lead to unexpected Kryo fallbacks in the second part of this blog post series.</p>
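As a hedged illustration, a (hypothetical) class like the following violates the rules above because it lacks a public no-argument constructor, so Flink would process it as a GenericType and serialize it with Kryo:

```java
// Hypothetical counter-example: NOT a valid Flink POJO.
// There is no public no-argument constructor, so Flink's type
// extraction cannot treat it as a POJO and falls back to Kryo.
public class Event {

    private final long timestamp;   // final field, set only in constructor

    public Event(long timestamp) {  // the only constructor takes an argument
        this.timestamp = timestamp;
    }

    public long getTimestamp() {
        return timestamp;
    }
}
```

Watching the TaskManager logs for the `TypeExtractor` warning quoted above is the easiest way to catch such classes before they silently cost you serialization performance.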
+
+<h2 id="tuple-data-types">Tuple Data Types</h2>
+
+<p>Flink comes with a predefined set of tuple types which all have a fixed 
length and contain a set of strongly-typed fields of potentially different 
types. There are implementations for <code>Tuple0</code>, 
<code>Tuple1&lt;T0&gt;</code>, …, <code>Tuple25&lt;T0, T1, ..., T24&gt;</code> 
and they may serve as easy-to-use wrappers that spare the creation of POJOs for 
each and every combination of objects you need to pass between computations. 
With the exception of <code>Tuple0</code>, these [...]
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+ Since <code>Tuple0</code> does not contain any data, it uses a dedicated serializer implementation: <a href="https://github.com/apache/flink/blob/release-1.10.0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/Tuple0Serializer.java";>Tuple0Serializer</a>.</p>
+</div>
+
+<h2 id="row-data-types">Row Data Types</h2>
+
+<p>Row types are mainly used by the Table and SQL APIs of Flink. A 
<code>Row</code> groups an arbitrary number of objects together similar to the 
tuples above. These fields are not strongly typed and may all be of different 
types. Because field types are missing, Flink’s type extraction cannot 
automatically extract type information and users of a <code>Row</code> need to 
manually tell Flink about the row’s field types. The <a 
href="https://github.com/apache/flink/blob/release-1.10.0/flin [...]
+
+<p>Row type information can be provided in two ways:</p>
+
+<ul>
+  <li>you can have your source or operator implement 
<code>ResultTypeQueryable&lt;Row&gt;</code>:</li>
+</ul>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="kd">public</span> <span class="kd">static</span> <span 
class="kd">class</span> <span class="nc">RowSource</span> <span 
class="kd">implements</span> <span class="n">SourceFunction</span><span 
class="o">&lt;</span><span class="n">Row</span><span class="o">&gt;,</span> 
<span class="n">ResultTypeQueryable</span><span class="o">&lt;</span><span 
class="n">Row</span><span class="o">&gt;</span> <span class="o">{</span>
+  <span class="c1">// ...</span>
+
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="n">TypeInformation</span><span 
class="o">&lt;</span><span class="n">Row</span><span class="o">&gt;</span> 
<span class="nf">getProducedType</span><span class="o">()</span> <span 
class="o">{</span>
+    <span class="k">return</span> <span class="n">Types</span><span 
class="o">.</span><span class="na">ROW</span><span class="o">(</span><span 
class="n">Types</span><span class="o">.</span><span class="na">INT</span><span 
class="o">,</span> <span class="n">Types</span><span class="o">.</span><span 
class="na">STRING</span><span class="o">,</span> <span 
class="n">Types</span><span class="o">.</span><span 
class="na">OBJECT_ARRAY</span><span class="o">(</span><span 
class="n">Types</span><spa [...]
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
+<ul>
+  <li>you can provide the types when building the job graph by using 
<code>SingleOutputStreamOperator#returns()</code></li>
+</ul>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">DataStream</span><span class="o">&lt;</span><span 
class="n">Row</span><span class="o">&gt;</span> <span 
class="n">sourceStream</span> <span class="o">=</span>
+    <span class="n">env</span><span class="o">.</span><span 
class="na">addSource</span><span class="o">(</span><span class="k">new</span> 
<span class="nf">RowSource</span><span class="o">())</span>
+        <span class="o">.</span><span class="na">returns</span><span 
class="o">(</span><span class="n">Types</span><span class="o">.</span><span 
class="na">ROW</span><span class="o">(</span><span class="n">Types</span><span 
class="o">.</span><span class="na">INT</span><span class="o">,</span> <span 
class="n">Types</span><span class="o">.</span><span 
class="na">STRING</span><span class="o">,</span> <span 
class="n">Types</span><span class="o">.</span><span 
class="na">OBJECT_ARRAY</span><sp [...]
+
+<div class="alert alert-warning">
+  <p><span class="label label-warning" style="display: inline-block"><span 
class="glyphicon glyphicon-warning-sign" aria-hidden="true"></span> 
Warning</span>
+If you fail to provide the type information for a <code>Row</code>, Flink identifies that <code>Row</code> is not a valid POJO type according to the rules above and falls back to Kryo serialization (see below), which you will also see in the logs as:</p>
+
+  <p><code>13:10:11,148 INFO  
org.apache.flink.api.java.typeutils.TypeExtractor             - Class class 
org.apache.flink.types.Row cannot be used as a POJO type because not all fields 
are valid POJO fields, and must be processed as GenericType. Please read the 
Flink documentation on "Data Types &amp; Serialization" for details of the 
effect on performance.</code></p>
+</div>
+
+<h2 id="avro">Avro</h2>
+
+<p>Flink offers built-in support for the <a 
href="http://avro.apache.org/";>Apache Avro</a> serialization framework 
(currently using version 1.8.2) by adding the 
<code>org.apache.flink:flink-avro</code> dependency into your job. Flink’s <a 
href="https://github.com/apache/flink/blob/release-1.10.0/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSerializer.java";>AvroSerializer</a>
 can then use Avro’s specific, generic, and reflective data serialization and 
[...]
+
+<h3 id="avro-specific">Avro Specific</h3>
+
+<p>Avro specific records will be automatically detected by checking that the 
given type’s type hierarchy contains the <code>SpecificRecordBase</code> class. 
You can either specify your concrete Avro type, or—if you want to be more 
generic and allow different types in your operator—use the 
<code>SpecificRecordBase</code> type (or a subtype) in your user functions, in 
<code>ResultTypeQueryable#getProducedType()</code>, or in 
<code>SingleOutputStreamOperator#returns()</code>. Since specific [...]
+
+<div class="alert alert-warning">
+  <p><span class="label label-warning" style="display: inline-block"><span 
class="glyphicon glyphicon-warning-sign" aria-hidden="true"></span> 
Warning</span> If you specify the Flink type as <code>SpecificRecord</code> and 
not <code>SpecificRecordBase</code>, Flink will not see this as an Avro type. 
Instead, it will use Kryo to de/serialize any objects which may be considerably 
slower.</p>
+</div>
+
+<h3 id="avro-generic">Avro Generic</h3>
+
+<p>Avro’s <code>GenericRecord</code> types cannot, unfortunately, be used 
automatically since they require the user to <a 
href="https://avro.apache.org/docs/1.8.2/gettingstartedjava.html#Serializing+and+deserializing+without+code+generation";>specify
 a schema</a> (either manually or by retrieving it from some schema registry). 
With that schema, you can provide the right type information through either of 
the following options, just like for the Row types above:</p>
+
+<ul>
+  <li>implement <code>ResultTypeQueryable&lt;GenericRecord&gt;</code>:</li>
+</ul>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="kd">public</span> <span class="kd">static</span> <span 
class="kd">class</span> <span class="nc">AvroGenericSource</span> <span 
class="kd">implements</span> <span class="n">SourceFunction</span><span 
class="o">&lt;</span><span class="n">GenericRecord</span><span 
class="o">&gt;,</span> <span class="n">ResultTypeQueryable</span><span 
class="o">&lt;</span><span class="n">GenericRecord</span><span 
class="o">&gt;</span> <span [...]
+  <span class="kd">private</span> <span class="kd">final</span> <span 
class="n">GenericRecordAvroTypeInfo</span> <span 
class="n">producedType</span><span class="o">;</span>
+
+  <span class="kd">public</span> <span 
class="nf">AvroGenericSource</span><span class="o">(</span><span 
class="n">Schema</span> <span class="n">schema</span><span class="o">)</span> 
<span class="o">{</span>
+    <span class="k">this</span><span class="o">.</span><span 
class="na">producedType</span> <span class="o">=</span> <span 
class="k">new</span> <span class="nf">GenericRecordAvroTypeInfo</span><span 
class="o">(</span><span class="n">schema</span><span class="o">);</span>
+  <span class="o">}</span>
+  
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="n">TypeInformation</span><span 
class="o">&lt;</span><span class="n">GenericRecord</span><span 
class="o">&gt;</span> <span class="nf">getProducedType</span><span 
class="o">()</span> <span class="o">{</span>
+    <span class="k">return</span> <span class="n">producedType</span><span 
class="o">;</span>
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+<ul>
+  <li>provide type information when building the job graph by using 
<code>SingleOutputStreamOperator#returns()</code></li>
+</ul>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">DataStream</span><span class="o">&lt;</span><span 
class="n">GenericRecord</span><span class="o">&gt;</span> <span 
class="n">sourceStream</span> <span class="o">=</span>
+    <span class="n">env</span><span class="o">.</span><span 
class="na">addSource</span><span class="o">(</span><span class="k">new</span> 
<span class="nf">AvroGenericSource</span><span class="o">())</span>
+        <span class="o">.</span><span class="na">returns</span><span 
class="o">(</span><span class="k">new</span> <span 
class="nf">GenericRecordAvroTypeInfo</span><span class="o">(</span><span 
class="n">schema</span><span class="o">));</span></code></pre></div>
+<p>Without this type information, Flink will fall back to Kryo for 
serialization, which would serialize the schema into every record, over and 
over again. As a result, the serialized form will be bigger and more costly to 
create.</p>
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+ Since Avro’s <code>Schema</code> class is not serializable, it cannot be 
sent around as-is. You can work around this by converting it to a String and 
parsing it back when needed. If you only do this once on initialization, there 
is practically no difference compared to sending it directly.</p>
+</div>
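A minimal sketch of this workaround (hypothetical class and method names, with a plain `String` standing in for the parsed `Schema`; in a real job, the parse step would be `new org.apache.avro.Schema.Parser().parse(schemaJson)`):

```java
import java.io.Serializable;

// Sketch: ship the schema as its JSON string (which is serializable) and
// re-parse the non-serializable Schema object lazily, once, after the user
// function has been deserialized on the task manager.
class SchemaHolder implements Serializable {
    private final String schemaJson;        // travels with the user function
    private transient Object parsedSchema;  // rebuilt on first access

    SchemaHolder(String schemaJson) {
        this.schemaJson = schemaJson;
    }

    Object schema() {
        if (parsedSchema == null) {
            // In a real job: new org.apache.avro.Schema.Parser().parse(schemaJson)
            parsedSchema = parse(schemaJson);
        }
        return parsedSchema;
    }

    private static Object parse(String json) {
        return json; // placeholder for Schema.Parser#parse
    }
}
```

Since the parse happens once per task rather than once per record, the overhead is negligible.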
+
+<h3 id="avro-reflect">Avro Reflect</h3>
+
+<p>The third way of using Avro is to exchange Flink’s PojoSerializer (for 
POJOs according to the rules above) for Avro’s reflection-based serializer. 
This can be enabled by calling</p>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">env</span><span class="o">.</span><span 
class="na">getConfig</span><span class="o">().</span><span 
class="na">enableForceAvro</span><span class="o">();</span></code></pre></div>
+
+<h2 id="kryo">Kryo</h2>
+
+<p>Any class or object which does not fall into the categories above and is 
not covered by a Flink-provided special serializer is de/serialized with a 
fallback to <a href="https://github.com/EsotericSoftware/kryo";>Kryo</a> 
(currently version 2.24.0), a powerful and generic serialization framework for 
Java. Flink calls such a type a <em>generic type</em> and you may stumble upon 
<code>GenericTypeInfo</code> when debugging code. If you are using Kryo 
serialization, make sure to register  [...]
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">env</span><span class="o">.</span><span 
class="na">getConfig</span><span class="o">().</span><span 
class="na">registerKryoType</span><span class="o">(</span><span 
class="n">MyCustomType</span><span class="o">.</span><span 
class="na">class</span><span class="o">);</span></code></pre></div>
+<p>Registering types adds them to an internal map of classes to tags so that, 
during serialization, Kryo does not have to add the fully qualified class names 
as a prefix into the serialized form. Instead, Kryo uses these (integer) tags 
to identify the underlying classes and reduce serialization overhead.</p>
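The effect can be illustrated with a conceptual sketch (this is not Kryo's actual implementation; `TagRegistrySketch` and its methods are made up for this example) comparing the bytes spent identifying a class in each serialized record:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Conceptual sketch only: registration maps a class to a small integer tag;
// unregistered classes must be identified by their fully qualified name.
class TagRegistrySketch {
    private final Map<Class<?>, Integer> tags = new HashMap<>();
    private int nextTag = 0;

    void register(Class<?> clazz) {
        if (!tags.containsKey(clazz)) {
            tags.put(clazz, nextTag++);
        }
    }

    /** Bytes spent identifying the class inside each serialized record. */
    int classIdOverhead(Class<?> clazz) {
        if (tags.containsKey(clazz)) {
            return Integer.BYTES; // small fixed-size tag (Kryo actually uses varints)
        }
        // Fallback: the fully qualified class name is written out.
        return clazz.getName().getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For a deeply nested type hierarchy, this per-record saving adds up quickly, which is why registration is worthwhile whenever Kryo is on the hot path.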
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+Flink will store Kryo serializer mappings from type registrations in its 
checkpoints and savepoints and will retain them across job (re)starts.</p>
+</div>
+
+<h3 id="disabling-kryo">Disabling Kryo</h3>
+
+<p>If desired, you can disable the Kryo fallback, i.e. the ability to 
serialize generic types, by calling</p>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">env</span><span class="o">.</span><span 
class="na">getConfig</span><span class="o">().</span><span 
class="na">disableGenericTypes</span><span 
class="o">();</span></code></pre></div>
+
+<p>This is mostly useful for finding out where these fallbacks are applied and 
replacing them with better serializers. If your job has any generic types with 
this configuration, it will fail with</p>
+
+<blockquote>
+  <p>Exception in thread “main” java.lang.UnsupportedOperationException: 
Generic types have been disabled in the ExecutionConfig and type … is treated 
as a generic type.</p>
+</blockquote>
+
+<p>If you cannot immediately see from the type where it is being used, the 
exception also provides a stacktrace that can be used to set breakpoints and 
find out more details in your IDE.</p>
+
+<h2 id="apache-thrift-via-kryo">Apache Thrift (via Kryo)</h2>
+
+<p>In addition to the variants above, Flink also allows you to <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program";>register
 other type serialization frameworks</a> with Kryo. After adding the 
appropriate dependencies from the <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program";>documentation</a
 [...]
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">env</span><span class="o">.</span><span 
class="na">getConfig</span><span class="o">().</span><span 
class="na">addDefaultKryoSerializer</span><span class="o">(</span><span 
class="n">MyCustomType</span><span class="o">.</span><span 
class="na">class</span><span class="o">,</span> <span 
class="n">TBaseSerializer</span><span class="o">.</span><span 
class="na">class</span><span class="o">);</span></code></pre></div>
+
+<p>This only works if generic types are not disabled and 
<code>MyCustomType</code> is a Thrift-generated data type. If the data type is 
not generated by Thrift, Flink will fail at runtime with an exception like 
this:</p>
+
+<blockquote>
+  <p>java.lang.ClassCastException: class MyCustomType cannot be cast to class 
org.apache.thrift.TBase (MyCustomType and org.apache.thrift.TBase are in 
unnamed module of loader ‘app’)</p>
+</blockquote>
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+Please note that <code>TBaseSerializer</code> can be registered as a default 
Kryo serializer as above (and as specified in <a 
href="https://github.com/twitter/chill/blob/v0.7.6/chill-thrift/src/main/java/com/twitter/chill/thrift/TBaseSerializer.java";>its
 documentation</a>) or via <code>registerTypeWithKryoSerializer</code>. In 
practice, we found both ways to work. We also saw no difference when 
registering the Thrift classes in addition to the call above. Both may be different 
in your sce [...]
+</div>
+
+<h2 id="protobuf-via-kryo">Protobuf (via Kryo)</h2>
+
+<p>In a way similar to Apache Thrift, <a 
href="https://developers.google.com/protocol-buffers/";>Google Protobuf</a> may 
be <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/custom_serializers.html#register-a-custom-serializer-for-your-flink-program";>registered
 as a custom serializer</a> after adding the right dependencies 
(<code>com.twitter:chill-protobuf</code> and 
<code>com.google.protobuf:protobuf-java</code>):</p>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">env</span><span class="o">.</span><span 
class="na">getConfig</span><span class="o">().</span><span 
class="na">registerTypeWithKryoSerializer</span><span class="o">(</span><span 
class="n">MyCustomType</span><span class="o">.</span><span 
class="na">class</span><span class="o">,</span> <span 
class="n">ProtobufSerializer</span><span class="o">.</span><span 
class="na">class</span><span class="o">);</span></code></pre></div>
+<p>This will work as long as generic types have not been disabled (this would 
disable Kryo for good). If <code>MyCustomType</code> is not a 
Protobuf-generated class, your Flink job will fail at runtime with the 
following exception:</p>
+
+<blockquote>
+  <p>java.lang.ClassCastException: class <code>MyCustomType</code> cannot be 
cast to class com.google.protobuf.Message (<code>MyCustomType</code> and 
com.google.protobuf.Message are in unnamed module of loader ‘app’)</p>
+</blockquote>
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+Please note that <code>ProtobufSerializer</code> can be registered as a 
default Kryo serializer (as specified in <a 
href="https://github.com/twitter/chill/blob/v0.7.6/chill-protobuf/src/main/java/com/twitter/chill/protobuf/ProtobufSerializer.java";>its
 documentation</a>) or via <code>registerTypeWithKryoSerializer</code> (as 
presented here). In practice, we found both ways to work. We also saw no 
difference when registering your Protobuf classes in addition to the call 
above. Both ma [...]
+</div>
+
+<h1 id="state-schema-evolution">State Schema Evolution</h1>
+
+<p>Before taking a closer look at the performance of each of the serializers 
described above, we would like to emphasize that performance is not everything 
that counts inside a real-world Flink job. Types for storing state, for 
example, should be able to evolve their schema (add/remove/change fields) 
throughout the lifetime of the job without losing previous state. This is what 
Flink calls <a 
href="https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/schema_evolution.h
 [...]
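As a hedged illustration of what POJO schema evolution allows (the `MyStateV2` class and its fields are hypothetical): a field can be added to a state type between job versions, and when restoring from a savepoint taken with the old version, Flink's PojoSerializer initializes the new field to its Java default (<code>null</code> for reference types):

```java
// Version 1 (as originally deployed):
// class MyState {
//     public int id;
//     public String name;
//     public MyState() {}
// }

// Version 2 adds a field. When restoring state written by v1,
// the new field is initialized to its Java default (null here).
class MyStateV2 {
    public int id;            // existed in v1
    public String name;       // existed in v1
    public String newField;   // added in v2; null when restored from v1 state
    public MyStateV2() {}     // POJO rule: public no-arg constructor
}
```

Kryo-serialized generic types, by contrast, offer no such guarantee: changing the class typically makes old state unreadable.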
+
+<h1 id="performance-comparison">Performance Comparison</h1>
+
+<p>With so many options for serialization, it is actually not easy to make the 
right choice. We already saw some technical advantages and disadvantages of 
each of them outlined above. Since serializers are at the core of your Flink 
jobs and usually also sit on the hot path (per record invocations), let us 
actually take a deeper look into their performance with the help of the Flink 
benchmarks project at <a 
href="https://github.com/dataArtisans/flink-benchmarks";>https://github.com/dataArt
 [...]
+
+<div class="highlight"><pre><code class="language-java"><span 
class="kd">public</span> <span class="kd">class</span> <span 
class="nc">MyPojo</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="kt">int</span> <span 
class="n">id</span><span class="o">;</span>
+  <span class="kd">private</span> <span class="n">String</span> <span 
class="n">name</span><span class="o">;</span>
+  <span class="kd">private</span> <span class="n">String</span><span 
class="o">[]</span> <span class="n">operationNames</span><span 
class="o">;</span>
+  <span class="kd">private</span> <span class="n">MyOperation</span><span 
class="o">[]</span> <span class="n">operations</span><span class="o">;</span>
+  <span class="kd">private</span> <span class="kt">int</span> <span 
class="n">otherId1</span><span class="o">;</span>
+  <span class="kd">private</span> <span class="kt">int</span> <span 
class="n">otherId2</span><span class="o">;</span>
+  <span class="kd">private</span> <span class="kt">int</span> <span 
class="n">otherId3</span><span class="o">;</span>
+  <span class="kd">private</span> <span class="n">Object</span> <span 
class="n">someObject</span><span class="o">;</span>
+<span class="o">}</span>
+<span class="kd">public</span> <span class="kd">class</span> <span 
class="nc">MyOperation</span> <span class="o">{</span>
+  <span class="kt">int</span> <span class="n">id</span><span class="o">;</span>
+  <span class="kd">protected</span> <span class="n">String</span> <span 
class="n">name</span><span class="o">;</span>
+<span class="o">}</span></code></pre></div>
+
+<p>This is mapped to tuples, rows, Avro specific records, Thrift and Protobuf 
representations appropriately and sent through a simple Flink job at 
parallelism 4 where the data type is used during network communication like 
this:</p>
+
+<div class="highlight"><pre><code class="language-java"><span 
class="n">env</span><span class="o">.</span><span 
class="na">setParallelism</span><span class="o">(</span><span 
class="mi">4</span><span class="o">);</span>
+<span class="n">env</span><span class="o">.</span><span 
class="na">addSource</span><span class="o">(</span><span class="k">new</span> 
<span class="nf">PojoSource</span><span class="o">(</span><span 
class="n">RECORDS_PER_INVOCATION</span><span class="o">,</span> <span 
class="mi">10</span><span class="o">))</span>
+    <span class="o">.</span><span class="na">rebalance</span><span 
class="o">()</span>
+    <span class="o">.</span><span class="na">addSink</span><span 
class="o">(</span><span class="k">new</span> <span 
class="n">DiscardingSink</span><span 
class="o">&lt;&gt;());</span></code></pre></div>
+<p>After running this through the <a 
href="http://openjdk.java.net/projects/code-tools/jmh/";>jmh</a> 
micro-benchmarks defined in <a 
href="https://github.com/dataArtisans/flink-benchmarks/blob/master/src/main/java/org/apache/flink/benchmark/full/SerializationFrameworkAllBenchmarks.java";>SerializationFrameworkAllBenchmarks.java</a>,
 I retrieved the following performance results for Flink 1.10 on my machine (in 
number of operations per millisecond):
+<br /></p>
+
+<center>
+<img src="/img/blog/2020-04-15-flink-serialization-performance-results.svg" 
width="800px" alt="Serialization performance comparison of Flink data types 
(operations per millisecond)" />
+</center>
+<p><br /></p>
+
+<p>A few takeaways from these numbers:</p>
+
+<ul>
+  <li>
+    <p>The default fallback from POJO to Kryo reduces performance by 75%.<br />
+Registering types with Kryo significantly improves performance, but still 
yields 64% fewer operations than using a POJO.</p>
+  </li>
+  <li>
+    <p>Avro GenericRecord and SpecificRecord are serialized at roughly the 
same speed.</p>
+  </li>
+  <li>
+    <p>Avro Reflect serialization is even slower than Kryo default (-45%).</p>
+  </li>
+  <li>
+    <p>Tuples are the fastest, closely followed by Rows. Both leverage fast 
specialized serialization code based on direct access without Java 
reflection.</p>
+  </li>
+  <li>
+    <p>Using a (nested) Tuple instead of a POJO may speed up your job by 42% 
(but is less flexible!).
+ Having code-generation for the PojoSerializer (<a 
href="https://jira.apache.org/jira/browse/FLINK-3599";>FLINK-3599</a>) may 
actually close that gap (or at least move closer to the RowSerializer). If you 
feel like giving the implementation a go, please give the Flink community a 
note and we will see whether we can make that happen.</p>
+  </li>
+  <li>
+    <p>If you cannot use POJOs, try to define your data type with one of the 
serialization frameworks that generate specific code for it: Protobuf, Avro, 
Thrift (in that order, performance-wise).</p>
+  </li>
+</ul>
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span> As 
with all benchmarks, please bear in mind that these numbers only give a hint on 
Flink’s serializer performance in a specific scenario. They may be different 
with your data types but the rough classification is probably the same. If you 
want to be sure, please verify the results with your data types. You should be 
able to copy from <code>S [...]
+</div>
+
+<h1 id="conclusion">Conclusion</h1>
+
+<p>In the sections above, we looked at how Flink performs serialization for 
different sorts of data types and elaborated on their technical advantages and 
disadvantages. For data types used in Flink state, you probably want to 
leverage either POJO or Avro types which, currently, are the only ones 
supporting state evolution out of the box and allow your stateful application 
to develop over time. POJOs are usually faster in de/serialization, while 
Avro may support more flexible schema evolut [...]
+
+<p>The fastest de/serialization is achieved with Flink’s internal tuple and 
row serializers which can access these types’ fields directly without going via 
reflection. With roughly 30% decreased throughput as compared to tuples, 
Protobuf and POJO types do not perform too badly on their own and are more 
flexible and maintainable. Avro (specific and generic) records as well as 
Thrift data types further reduce performance by 20% and 30%, respectively. You 
definitely want to avoid Kryo as th [...]
+
+<p>The next article in this series will use this finding as a starting point 
to look into a few common pitfalls and obstacles of avoiding Kryo, how to get 
the most out of the PojoSerializer, and a few more tuning techniques with 
respect to serialization. Stay tuned for more.</p>
+
+      </article>
+    </div>
+
+    <div class="row">
+      <div id="disqus_thread"></div>
+      <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE 
* * */
+        var disqus_shortname = 'stratosphere-eu'; // required: replace example 
with your forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 
'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+             (document.getElementsByTagName('head')[0] || 
document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+      </script>
+    </div>
+  </div>
+</div>
+      </div>
+    </div>
+
+    <hr />
+
+    <div class="row">
+      <div class="footer text-center col-sm-12">
+        <p>Copyright © 2014-2019 <a href="http://apache.org";>The Apache 
Software Foundation</a>. All Rights Reserved.</p>
+        <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache 
feather logo are either registered trademarks or trademarks of The Apache 
Software Foundation.</p>
+        <p><a href="/privacy-policy.html">Privacy Policy</a> &middot; <a 
href="/blog/feed.xml">RSS feed</a></p>
+      </div>
+    </div>
+    </div><!-- /.container -->
+
+    <!-- Include all compiled plugins (below), or include individual files as 
needed -->
+    <script 
src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js";></script>
+    <script 
src="https://cdnjs.cloudflare.com/ajax/libs/jquery.matchHeight/0.7.0/jquery.matchHeight-min.js";></script>
+    <script src="/js/codetabs.js"></script>
+    <script src="/js/stickysidebar.js"></script>
+
+    <!-- Google Analytics -->
+    <script>
+      
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+      
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+      
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+      ga('create', 'UA-52545728-1', 'auto');
+      ga('send', 'pageview');
+    </script>
+  </body>
+</html>
diff --git a/content/zh/index.html b/content/zh/index.html
index 2ba5e74..c67dadd 100644
--- a/content/zh/index.html
+++ b/content/zh/index.html
@@ -564,6 +564,9 @@
 
   <dl>
       
+        <dt> <a 
href="/news/2020/04/15/flink-serialization-tuning-vol-1.html">Flink 
Serialization Tuning Vol. 1: Choosing your Serializer — if you can</a></dt>
+        <dd>Serialization is a crucial element of your Flink job. This article 
is the first in a series of posts that will highlight Flink’s serialization 
stack, and looks at the different ways Flink can serialize your data types.</dd>
+      
         <dt> <a href="/2020/04/09/pyflink-udf-support-flink.html">PyFlink: 
Introducing Python Support for UDFs in Flink's Table API</a></dt>
         <dd>Flink 1.10 extends its support for Python by adding Python UDFs in 
PyFlink. This post explains how UDFs work in PyFlink and gives some practical 
examples of how to use UDFs in PyFlink.</dd>
       
@@ -580,9 +583,6 @@ This release marks a big milestone: Stateful Functions 2.0 
is not only an API up
         <dd><p>In this blog post, you will learn our motivation behind the 
Flink-Hive integration, and how Flink 1.10 can help modernize your data 
warehouse.</p>
 
 </dd>
-      
-        <dt> <a href="/news/2020/03/24/demo-fraud-detection-2.html">Advanced 
Flink Application Patterns Vol.2: Dynamic Updates of Application Logic</a></dt>
-        <dd>In this series of blog posts you will learn about powerful Flink 
patterns for building streaming applications.</dd>
     
   </dl>
 
