This is an automated email from the ASF dual-hosted git repository. chesnay pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/flink-web.git
The following commit(s) were added to refs/heads/asf-site by this push: new 104f831 Rebuild website 104f831 is described below commit 104f831ad72a80653f20f84842e4e71ddbd4db06 Author: Chesnay Schepler <ches...@apache.org> AuthorDate: Thu Jan 20 09:42:14 2022 +0100 Rebuild website --- content/2022/01/20/pravega-connector-101.html | 504 ++++++++++++++++++++++++++ content/blog/feed.xml | 354 +++++++++++++----- content/blog/index.html | 36 +- content/blog/page10/index.html | 38 +- content/blog/page11/index.html | 40 +- content/blog/page12/index.html | 38 +- content/blog/page13/index.html | 37 +- content/blog/page14/index.html | 39 +- content/blog/page15/index.html | 38 +- content/blog/page16/index.html | 37 +- content/blog/page17/index.html | 39 +- content/blog/page18/index.html | 25 ++ content/blog/page2/index.html | 36 +- content/blog/page3/index.html | 38 +- content/blog/page4/index.html | 38 +- content/blog/page5/index.html | 36 +- content/blog/page6/index.html | 36 +- content/blog/page7/index.html | 36 +- content/blog/page8/index.html | 38 +- content/blog/page9/index.html | 38 +- content/index.html | 8 +- content/zh/index.html | 8 +- 22 files changed, 1195 insertions(+), 342 deletions(-) diff --git a/content/2022/01/20/pravega-connector-101.html b/content/2022/01/20/pravega-connector-101.html new file mode 100644 index 0000000..54ca238 --- /dev/null +++ b/content/2022/01/20/pravega-connector-101.html @@ -0,0 +1,504 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <title>Apache Flink: Pravega Flink Connector 101</title> + <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"> + <link rel="icon" href="/favicon.ico" type="image/x-icon"> + + <!-- Bootstrap --> + <link rel="stylesheet" href="/css/bootstrap.min.css"> + <link rel="stylesheet" href="/css/flink.css"> + <link rel="stylesheet" href="/css/syntax.css"> + + <!-- Blog RSS feed --> + <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" title="Apache Flink Blog: RSS feed" /> + + <!-- jQuery (necessary for Bootstrap's JavaScript plugins) --> + <!-- We need to load Jquery in the header for custom google analytics event tracking--> + <script src="/js/jquery.min.js"></script> + + <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> + <!-- WARNING: Respond.js doesn't work if you view the page via file:// --> + <!--[if lt IE 9]> + <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> + <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> + <![endif]--> + </head> + <body> + + + <!-- Main content. --> + <div class="container"> + <div class="row"> + + + <div id="sidebar" class="col-sm-3"> + + +<!-- Top navbar. --> + <nav class="navbar navbar-default"> + <!-- The logo. --> + <div class="navbar-header"> + <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1"> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <div class="navbar-logo"> + <a href="/"> + <img alt="Apache Flink" src="/img/flink-header-logo.svg" width="147px" height="73px"> + </a> + </div> + </div><!-- /.navbar-header --> + + <!-- The navigation links. --> + <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1"> + <ul class="nav navbar-nav navbar-main"> + + <!-- First menu section explains visitors what Flink is --> + + <!-- What is Stream Processing? --> + <!-- + <li><a href="/streamprocessing1.html">What is Stream Processing?</a></li> + --> + + <!-- What is Flink? --> + <li><a href="/flink-architecture.html">What is Apache Flink?</a></li> + + + + <!-- What is Stateful Functions? --> + + <li><a href="/stateful-functions.html">What is Stateful Functions?</a></li> + + <!-- Use cases --> + <li><a href="/usecases.html">Use Cases</a></li> + + <!-- Powered by --> + <li><a href="/poweredby.html">Powered By</a></li> + + + + <!-- Second menu section aims to support Flink users --> + + <!-- Downloads --> + <li><a href="/downloads.html">Downloads</a></li> + + <!-- Getting Started --> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Getting Started<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://nightlies.apache.org/flink/flink-docs-release-1.14//docs/try-flink/local_installation/" target="_blank">With Flink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1/getting-started/project-setup.html" target="_blank">With Flink Stateful Functions <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="/training.html">Training Course</a></li> + </ul> + </li> + + <!-- Documentation --> + <li class="dropdown"> + <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a> + <ul class="dropdown-menu"> + <li><a href="https://nightlies.apache.org/flink/flink-docs-release-1.14" target="_blank">Flink 1.14 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://nightlies.apache.org/flink/flink-docs-master" target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1" target="_blank">Flink Stateful Functions 3.1 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + <li><a href="https://nightlies.apache.org/flink/flink-statefun-docs-master" target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + </ul> + </li> + + <!-- getting help --> + <li><a href="/gettinghelp.html">Getting Help</a></li> + + <!-- Blog --> + <li><a href="/blog/"><b>Flink Blog</b></a></li> + + + <!-- Flink-packages --> + <li> + <a href="https://flink-packages.org" target="_blank">flink-packages.org <small><span class="glyphicon glyphicon-new-window"></span></small></a> + </li> + + + <!-- Third menu section aim to support community and contributors --> + + <!-- Community --> + <li><a href="/community.html">Community & Project Info</a></li> + + <!-- Roadmap --> + <li><a href="/roadmap.html">Roadmap</a></li> + + <!-- Contribute --> + <li><a href="/contributing/how-to-contribute.html">How to Contribute</a></li> + + + <!-- GitHub --> + <li> + <a href="https://github.com/apache/flink" target="_blank">Flink on GitHub <small><span class="glyphicon glyphicon-new-window"></span></small></a> + </li> + + + + <!-- Language Switcher --> + <li> + + + <a href="/zh/2022/01/20/pravega-connector-101.html">中文版</a> + + + </li> + + </ul> + + <style> + .smalllinks:link { + display: inline-block !important; background: none; padding-top: 0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px; + } + </style> + + <ul class="nav navbar-nav navbar-bottom"> + <hr /> + + <!-- Twitter --> + <li><a href="https://twitter.com/apacheflink" target="_blank">@ApacheFlink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + + <!-- Visualizer --> + <li class=" hidden-md hidden-sm"><a href="/visualizer/" target="_blank">Plan Visualizer <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + + <li > + <a href="/security.html">Flink Security</a> + </li> + + <hr /> + + <li><a href="https://apache.org" target="_blank">Apache Software Foundation <small><span class="glyphicon glyphicon-new-window"></span></small></a></li> + + <li> + + <a class="smalllinks" href="https://www.apache.org/licenses/" target="_blank">License</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + + <a class="smalllinks" href="https://www.apache.org/security/" target="_blank">Security</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + + <a class="smalllinks" href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Donate</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + + <a class="smalllinks" href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a> <small><span class="glyphicon glyphicon-new-window"></span></small> + </li> + + </ul> + </div><!-- /.navbar-collapse --> + </nav> + + </div> + <div class="col-sm-9"> + <div class="row-fluid"> + <div class="col-sm-12"> + <div class="row"> + <h1>Pravega Flink Connector 101</h1> + <p><i></i></p> + + <article> + <p>20 Jan 2022 Yumin Zhou (Brian) (<a href="https://twitter.com/crazy__zhou">@crazy__zhou</a>)</p> + +<p><a href="https://cncf.pravega.io/">Pravega</a>, which is now a CNCF sandbox project, is a cloud-native storage system based on abstractions for both batch and streaming data consumption. Pravega streams (a new storage abstraction) are durable, consistent, and elastic, while natively supporting long-term data retention. In comparison, <a href="https://flink.apache.org/">Apache Flink</a> is a popular real-time computing engine that provides unified batch and stream processing. Flink pro [...] + +<p>That’s also the main reason why Pravega has chosen to use Flink as the first integrated execution engine among the various distributed computing engines on the market. With the help of Flink, users can use flexible APIs for windowing, complex event processing (CEP), or table abstractions to process streaming data easily and enrich the data being stored. Since its inception in 2016, Pravega has established communication with Flink PMC members and developed the connector together.</p> + +<p>In 2017, the Pravega Flink connector module started to move out of the Pravega main repository and has been maintained in a new separate <a href="https://github.com/pravega/flink-connectors">repository</a> since then. During years of development, many features have been implemented, including:</p> + +<ul> + <li>exactly-once processing guarantees for both Reader and Writer, supporting end-to-end exactly-once processing pipelines</li> + <li>seamless integration with Flink’s checkpoints and savepoints</li> + <li>parallel Readers and Writers supporting high throughput and low latency processing</li> + <li>support for Batch, Streaming, and Table API to access Pravega Streams</li> +</ul> + +<p>These key features make streaming pipeline applications easier to develop without worrying about performance and correctness which are the common pain points for many streaming use cases.</p> + +<p>In this blog post, we will discuss how to use this connector to read and write Pravega streams with the Flink DataStream API.</p> + +<div class="page-toc"> +<ul id="markdown-toc"> + <li><a href="#basic-usages" id="markdown-toc-basic-usages">Basic usages</a> <ul> + <li><a href="#dependency" id="markdown-toc-dependency">Dependency</a></li> + <li><a href="#api-introduction" id="markdown-toc-api-introduction">API introduction</a> <ul> + <li><a href="#configurations" id="markdown-toc-configurations">Configurations</a></li> + <li><a href="#serializationdeserialization" id="markdown-toc-serializationdeserialization">Serialization/Deserialization</a></li> + <li><a href="#flinkpravegareader" id="markdown-toc-flinkpravegareader"><code>FlinkPravegaReader</code></a></li> + <li><a href="#flinkpravegawriter" id="markdown-toc-flinkpravegawriter"><code>FlinkPravegaWriter</code></a></li> + </ul> + </li> + </ul> + </li> + <li><a href="#internals-of-reader-and-writer" id="markdown-toc-internals-of-reader-and-writer">Internals of reader and writer</a> <ul> + <li><a href="#checkpoint-integration" id="markdown-toc-checkpoint-integration">Checkpoint integration</a></li> + <li><a href="#end-to-end-exactly-once-semantics" id="markdown-toc-end-to-end-exactly-once-semantics">End-to-end exactly-once semantics</a></li> + </ul> + </li> + <li><a href="#summary" id="markdown-toc-summary">Summary</a></li> + <li><a href="#future-plans" id="markdown-toc-future-plans">Future plans</a></li> +</ul> + +</div> + +<h1 id="basic-usages">Basic usages</h1> + +<h2 id="dependency">Dependency</h2> +<p>To use this connector in your application, add the dependency to your project:</p> + +<div class="highlight"><pre><code class="language-xml"><span class="nt"><dependency></span> + <span class="nt"><groupId></span>io.pravega<span class="nt"></groupId></span> + <span class="nt"><artifactId></span>pravega-connectors-flink-1.13_2.12<span class="nt"></artifactId></span> + <span class="nt"><version></span>0.10.1<span class="nt"></version></span> +<span class="nt"></dependency></span></code></pre></div> + +<p>In the above example,</p> + +<p><code>1.13</code> is the Flink major version which is put in the middle of the artifact name. The Pravega Flink connector maintains compatibility for the <em>three</em> most recent major versions of Flink.</p> + +<p><code>0.10.1</code> is the version that aligns with the Pravega version.</p> + +<p>You can find the latest release with a support matrix on the <a href="https://github.com/pravega/flink-connectors/releases">GitHub Releases page</a>.</p> + +<h2 id="api-introduction">API introduction</h2> + +<h3 id="configurations">Configurations</h3> + +<p>The connector provides a common top-level object <code>PravegaConfig</code> for Pravega connection configurations. The config object automatically configures itself from <em>environment variables</em>, <em>system properties</em> and <em>program arguments</em>.</p> + +<p>The basic controller URI and the default scope can be set like this:</p> + +<table> + <thead> + <tr> + <th>Setting</th> + <th>Environment Variable /<br />System Property /<br />Program Argument</th> + <th>Default Value</th> + </tr> + </thead> + <tbody> + <tr> + <td>Controller URI</td> + <td><code>PRAVEGA_CONTROLLER_URI</code><br /><code>pravega.controller.uri</code><br /><code>--controller</code></td> + <td><code>tcp://localhost:9090</code></td> + </tr> + <tr> + <td>Default Scope</td> + <td><code>PRAVEGA_SCOPE</code><br /><code>pravega.scope</code><br /><code>--scope</code></td> + <td>-</td> + </tr> + </tbody> +</table> + +<p>The recommended way to create an instance of <code>PravegaConfig</code> is the following:</p> + +<div class="highlight"><pre><code class="language-java"><span class="c1">// From default environment</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromDefaults</span><span class="o">();</span> + +<span class="c1">// From program arguments</span> +<span class="n">ParameterTool</span> <span class="n">params</span> <span class="o">=</span> <span class="n">ParameterTool</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">);</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromParams</span><span class="o">(</span><span class="n">params</span><span class="o">);</span> + +<span class="c1">// From user specification</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromDefaults</span><span class="o">()</span> + <span class="o">.</span><span class="na">withControllerURI</span><span class="o">(</span><span class="s">"tcp://..."</span><span class="o">)</span> + <span class="o">.</span><span class="na">withDefaultScope</span><span class="o">(</span><span class="s">"SCOPE-NAME"</span><span class="o">)</span> + <span class="o">.</span><span class="na">withCredentials</span><span class="o">(</span><span class="n">credentials</span><span class="o">)</span> + <span class="o">.</span><span class="na">withHostnameValidation</span><span class="o">(</span><span class="kc">false</span><span class="o">);</span></code></pre></div> + +<h3 id="serializationdeserialization">Serialization/Deserialization</h3> + +<p>Pravega has defined <a href="http://pravega.io/docs/latest/javadoc/clients/io/pravega/client/stream/Serializer.html"><code>io.pravega.client.stream.Serializer</code></a> for the serialization/deserialization, while Flink has also defined standard interfaces for the purpose.</p> + +<ul> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/SerializationSchema.html"><code>org.apache.flink.api.common.serialization.SerializationSchema</code></a></li> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/DeserializationSchema.html"><code>org.apache.flink.api.common.serialization.DeserializationSchema</code></a></li> +</ul> + +<p>For interoperability with other pravega client applications, we have built-in adapters <code>PravegaSerializationSchema</code> and <code>PravegaDeserializationSchema</code> to support processing Pravega stream data produced by a non-Flink application.</p> + +<p>Here is the adapter for Pravega Java serializer:</p> + +<div class="highlight"><pre><code class="language-java"><span class="kn">import</span> <span class="nn">io.pravega.client.stream.impl.JavaSerializer</span><span class="o">;</span> +<span class="o">...</span> +<span class="n">DeserializationSchema</span><span class="o"><</span><span class="n">MyEvent</span><span class="o">></span> <span class="n">adapter</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PravegaDeserializationSchema</span><span class="o"><>(</span> + <span class="n">MyEvent</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="k">new</span> <span class="n">JavaSerializer</span><span class="o"><</span><span class="n">MyEvent</span><span class="o">>());</span></code></pre></div> + +<h3 id="flinkpravegareader"><code>FlinkPravegaReader</code></h3> + +<p><code>FlinkPravegaReader</code> is a Flink <code>SourceFunction</code> implementation which supports parallel reads from one or more Pravega streams. Internally, it initiates a Pravega reader group and creates Pravega <code>EventStreamReader</code> instances to read the data from the stream(s). It provides a builder-style API to construct, and can allow streamcuts to mark the start and end of the read.</p> + +<p>You can use it like this:</p> + +<div class="highlight"><pre><code class="language-java"><span class="kd">final</span> <span class="n">StreamExecutionEnvironment</span> <span class="n">env</span> <span class="o">=</span> <span class="n">StreamExecutionEnvironment</span><span class="o">.</span><span class="na">getExecutionEnvironment</span><span class="o">();</span> + +<span class="c1">// Enable Flink checkpoint to make state fault tolerant</span> +<span class="n">env</span><span class="o">.</span><span class="na">enableCheckpointing</span><span class="o">(</span><span class="mi">60000</span><span class="o">);</span> + +<span class="c1">// Define the Pravega configuration</span> +<span class="n">ParameterTool</span> <span class="n">params</span> <span class="o">=</span> <span class="n">ParameterTool</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">);</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromParams</span><span class="o">(</span><span class="n">params</span><span class="o">);</span> + +<span class="c1">// Define the event deserializer</span> +<span class="n">DeserializationSchema</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">deserializer</span> <span class="o">=</span> <span class="o">...</span> + +<span class="c1">// Define the data stream</span> +<span class="n">FlinkPravegaReader</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">pravegaSource</span> <span class="o">=</span> <span class="n">FlinkPravegaReader</span><span class="o">.<</span><span class="n">MyClass</span><span class="o">></span><span class="n">builder</span><span class="o">()</span> + <span class="o">.</span><span class="na">forStream</span><span class="o">(...)</span> + <span class="o">.</span><span class="na">withPravegaConfig</span><span class="o">(</span><span class="n">config</span><span class="o">)</span> + <span class="o">.</span><span class="na">withDeserializationSchema</span><span class="o">(</span><span class="n">deserializer</span><span class="o">)</span> + <span class="o">.</span><span class="na">build</span><span class="o">();</span> +<span class="n">DataStream</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">stream</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="na">addSource</span><span class="o">(</span><span class="n">pravegaSource</span><span class="o">)</span> + <span class="o">.</span><span class="na">setParallelism</span><span class="o">(</span><span class="mi">4</span><span class="o">)</span> + <span class="o">.</span><span class="na">uid</span><span class="o">(</span><span class="s">"pravega-source"</span><span class="o">);</span></code></pre></div> + +<h3 id="flinkpravegawriter"><code>FlinkPravegaWriter</code></h3> + +<p><code>FlinkPravegaWriter</code> is a Flink <code>SinkFunction</code> implementation which supports parallel writes to Pravega streams.</p> + +<p>It supports three writer modes that relate to guarantees about the persistence of events emitted by the sink to a Pravega Stream:</p> + +<ol> + <li><strong>Best-effort</strong> - Any write failures will be ignored and there could be data loss.</li> + <li><strong>At-least-once</strong>(default) - All events are persisted in Pravega. Duplicate events are possible, due to retries or in case of failure and subsequent recovery.</li> + <li><strong>Exactly-once</strong> - All events are persisted in Pravega using a transactional approach integrated with the Flink checkpointing feature.</li> +</ol> + +<p>Internally, it will initiate several Pravega <code>EventStreamWriter</code> or <code>TransactionalEventStreamWriter</code> (depends on the writer mode) instances to write data to the stream. It provides a builder-style API to construct.</p> + +<p>A basic usage looks like this:</p> + +<div class="highlight"><pre><code class="language-java"><span class="n">StreamExecutionEnvironment</span> <span class="n">env</span> <span class="o">=</span> <span class="n">StreamExecutionEnvironment</span><span class="o">.</span><span class="na">getExecutionEnvironment</span><span class="o">();</span> + +<span class="c1">// Define the Pravega configuration</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromParams</span><span class="o">(</span><span class="n">params</span><span class="o">);</span> + +<span class="c1">// Define the event serializer</span> +<span class="n">SerializationSchema</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">serializer</span> <span class="o">=</span> <span class="o">...</span> + +<span class="c1">// Define the event router for selecting the Routing Key</span> +<span class="n">PravegaEventRouter</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">router</span> <span class="o">=</span> <span class="o">...</span> + +<span class="c1">// Define the sink function</span> +<span class="n">FlinkPravegaWriter</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">pravegaSink</span> <span class="o">=</span> <span class="n">FlinkPravegaWriter</span><span class="o">.<</span><span class="n">MyClass</span><span class="o">></span><span class="n">builder</span><span class="o">()</span> + <span class="o">.</span><span class="na">forStream</span><span class="o">(...)</span> + <span class="o">.</span><span class="na">withPravegaConfig</span><span class="o">(</span><span class="n">config</span><span class="o">)</span> + <span class="o">.</span><span class="na">withSerializationSchema</span><span class="o">(</span><span class="n">serializer</span><span class="o">)</span> + <span class="o">.</span><span class="na">withEventRouter</span><span class="o">(</span><span class="n">router</span><span class="o">)</span> + <span class="o">.</span><span class="na">withWriterMode</span><span class="o">(</span><span class="n">EXACTLY_ONCE</span><span class="o">)</span> + <span class="o">.</span><span class="na">build</span><span class="o">();</span> + +<span class="n">DataStream</span><span class="o"><</span><span class="n">MyClass</span><span class="o">></span> <span class="n">stream</span> <span class="o">=</span> <span class="o">...</span> +<span class="n">stream</span><span class="o">.</span><span class="na">addSink</span><span class="o">(</span><span class="n">pravegaSink</span><span class="o">)</span> + <span class="o">.</span><span class="na">setParallelism</span><span class="o">(</span><span class="mi">4</span><span class="o">)</span> + <span class="o">.</span><span class="na">uid</span><span class="o">(</span><span class="s">"pravega-sink"</span><span class="o">);</span></code></pre></div> + +<p>You can see some more examples <a href="https://github.com/pravega/pravega-samples">here</a>.</p> + +<h1 id="internals-of-reader-and-writer">Internals of reader and writer</h1> + +<h2 id="checkpoint-integration">Checkpoint integration</h2> + +<p>Flink has periodic checkpoints based on the Chandy-Lamport algorithm to make state in Flink fault-tolerant. By allowing state and the corresponding stream positions to be recovered, the application is given the same semantics as a failure-free execution.</p> + +<p>Pravega also has its own Checkpoint concept which is to create a consistent “point in time” persistence of the state of each Reader in the Reader Group, by using a specialized Event (<em>Checkpoint Event</em>) to signal each Reader to preserve its state. Once a Checkpoint has been completed, the application can use the Checkpoint to reset all the Readers in the Reader Group to the known consistent state represented by the Checkpoint.</p> + +<p>This means that our end-to-end recovery story is not like other messaging systems such as Kafka, which uses a more coupled method and persists its offset in the Flink task state and lets Flink do the coordination. Flink delegates the Pravega source recovery completely to the Pravega server and uses only a lightweight hook to connect. We collaborated with the Flink community and added a new interface <code>ExternallyInducedSource</code> (<a href="https://issues.apache.org/jira/browse/F [...] + +<p>The checkpoint mechanism works as a two-step process:</p> + +<ul> + <li> + <p>The <a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html">master hook</a> handler from the JobManager initiates the <a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html#triggerCheckpoint-long-long-java.util.concurrent.Executor-"><code>triggerCheckpoint</code></a> request to the <code>ReaderCheckpointHook</code> [...] + </li> + <li> + <p>A <code>Checkpoint</code> event will be sent by Pravega as part of the data stream flow and, upon receiving the event, the <code>FlinkPravegaReader</code> will initiate a <a href="https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/checkpoint/ExternallyInducedSource.java#L73"><code>triggerCheckpoint</code></a> request to effectively let Flink continue and complete the checkpoint process.</p> + </li> +</ul> + +<h2 id="end-to-end-exactly-once-semantics">End-to-end exactly-once semantics</h2> + +<p>In the early years of big data processing, results from real-time stream processing were always considered inaccurate/approximate/speculative. However, this correctness is extremely important for some use cases and in some industries such as finance.</p> + +<p>This constraint stems mainly from two issues:</p> + +<ul> + <li>unordered data source in event time</li> + <li>end-to-end exactly-once semantics guarantee</li> +</ul> + +<p>During recent years of development, watermarking has been introduced as a tradeoff between correctness and latency, which is now considered a good solution for unordered data sources in event time.</p> + +<p>The guarantee of end-to-end exactly-once semantics is more tricky. When we say “exactly-once semantics”, what we mean is that each incoming event affects the final results exactly once. Even in the event of a machine or software failure, there is no duplicate data and no data that goes unprocessed. This is quite difficult because of the demands of message acknowledgment and recovery during such fast processing and is also why some early distributed streaming engines like Storm(without [...] + +<p>Flink is one of the first streaming systems that was able to provide exactly-once semantics due to its delicate <a href="https://www.ververica.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink">checkpoint mechanism</a>. But to make it work end-to-end, the final stage needs to apply the semantic to external message system sinks that support commits and rollbacks.</p> + +<p>To work around this problem, Pravega introduced <a href="https://cncf.pravega.io/docs/latest/transactions/">transactional writes</a>. A Pravega transaction allows an application to prepare a set of events that can be written “all at once” to a Stream. This allows an application to “commit” a bunch of events atomically. When writes are idempotent, it is possible to implement end-to-end exactly-once pipelines together with Flink.</p> + +<p>To build such an end-to-end solution requires coordination between Flink and the Pravega sink, which is still challenging. A common approach for coordinating commits and rollbacks in a distributed system is the two-phase commit protocol. We used this protocol and, together with the Flink community, implemented the sink function in a two-phase commit way coordinated with Flink checkpoints.</p> + +<p>The Flink community then extracted the common logic from the two-phase commit protocol and provided a general interface <code>TwoPhaseCommitSinkFunction</code> (<a href="https://issues.apache.org/jira/browse/FLINK-7210">FLINK-7210</a>) to make it possible to build end-to-end exactly-once applications with other message systems that have transaction support. This includes Apache Kafka versions 0.11 and above. There is an official Flink <a href="https://flink.apache.org/features/2018/03 [...] + +<h1 id="summary">Summary</h1> +<p>The Pravega Flink connector enables Pravega to connect to Flink and allows Pravega to act as a key data store in a streaming pipeline. Both projects share a common design philosophy and can integrate well with each other. Pravega has its own concept of checkpointing and has implemented transactional writes to support end-to-end exactly-once guarantees.</p> + +<h1 id="future-plans">Future plans</h1> + +<p><code>FlinkPravegaInputFormat</code> and <code>FlinkPravegaOutputFormat</code> are now provided to support batch reads and writes in Flink, but these are under the legacy DataSet API. Since Flink is now making efforts to unify batch and streaming, it is improving its APIs and providing new interfaces for the <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface">source</a> and <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-143 [...] + +<p>We will also put more effort into SQL / Table API support in order to provide a better user experience since it is simpler to understand and even more powerful to use in some cases.</p> + +<p><strong>Note:</strong> the original blog post can be found <a href="https://cncf.pravega.io/blog/2021/11/01/pravega-flink-connector-101/">here</a>.</p> + + </article> + </div> + + <div class="row"> + <div id="disqus_thread"></div> + <script type="text/javascript"> + /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ + var disqus_shortname = 'stratosphere-eu'; // required: replace example with your forum shortname + + /* * * DON'T EDIT BELOW THIS LINE * * */ + (function() { + var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; + dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; + (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); + })(); + </script> + </div> + </div> +</div> + </div> + </div> + + <hr /> + + <div class="row"> + <div class="footer text-center col-sm-12"> + <p>Copyright © 2014-2021 <a href="http://apache.org">The Apache Software Foundation</a>. All Rights Reserved.</p> + <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.</p> + <p><a href="/privacy-policy.html">Privacy Policy</a> · <a href="/blog/feed.xml">RSS feed</a></p> + </div> + </div> + </div><!-- /.container --> + + <!-- Include all compiled plugins (below), or include individual files as needed --> + <script src="/js/jquery.matchHeight-min.js"></script> + <script src="/js/bootstrap.min.js"></script> + <script src="/js/codetabs.js"></script> + <script src="/js/stickysidebar.js"></script> + + <!-- Google Analytics --> + <script> + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); + + ga('create', 'UA-52545728-1', 'auto'); + ga('send', 'pageview'); + </script> + </body> +</html> diff --git a/content/blog/feed.xml b/content/blog/feed.xml index 24fd2a2..de45911 100644 --- a/content/blog/feed.xml +++ b/content/blog/feed.xml @@ -7,6 +7,263 @@ <atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" /> <item> +<title>Pravega Flink Connector 101</title> +<description><p><a href="https://cncf.pravega.io/">Pravega</a>, which is now a CNCF sandbox project, is a cloud-native storage system based on abstractions for both batch and streaming data consumption. Pravega streams (a new storage abstraction) are durable, consistent, and elastic, while natively supporting long-term data retention. In comparison, <a href="https://flink.apache.org/">Apache Flink</a> is a popular real-time computing engi [...] + +<p>That’s also the main reason why Pravega has chosen to use Flink as the first integrated execution engine among the various distributed computing engines on the market. With the help of Flink, users can use flexible APIs for windowing, complex event processing (CEP), or table abstractions to process streaming data easily and enrich the data being stored. Since its inception in 2016, Pravega has established communication with Flink PMC members and developed the connector together. [...] + +<p>In 2017, the Pravega Flink connector module started to move out of the Pravega main repository and has been maintained in a new separate <a href="https://github.com/pravega/flink-connectors">repository</a> since then. During years of development, many features have been implemented, including:</p> + +<ul> + <li>exactly-once processing guarantees for both Reader and Writer, supporting end-to-end exactly-once processing pipelines</li> + <li>seamless integration with Flink’s checkpoints and savepoints</li> + <li>parallel Readers and Writers supporting high throughput and low latency processing</li> + <li>support for Batch, Streaming, and Table API to access Pravega Streams</li> +</ul> + +<p>These key features make streaming pipeline applications easier to develop without worrying about performance and correctness which are the common pain points for many streaming use cases.</p> + +<p>In this blog post, we will discuss how to use this connector to read and write Pravega streams with the Flink DataStream API.</p> + +<div class="page-toc"> +<ul id="markdown-toc"> + <li><a href="#basic-usages" id="markdown-toc-basic-usages">Basic usages</a> <ul> + <li><a href="#dependency" id="markdown-toc-dependency">Dependency</a></li> + <li><a href="#api-introduction" id="markdown-toc-api-introduction">API introduction</a> <ul> + <li><a href="#configurations" id="markdown-toc-configurations">Configurations</a></li> + <li><a href="#serializationdeserialization" id="markdown-toc-serializationdeserialization">Serialization/Deserialization</a></li> + <li><a href="#flinkpravegareader" id="markdown-toc-flinkpravegareader"><code>FlinkPravegaReader</code></a></li> + <li><a href="#flinkpravegawriter" id="markdown-toc-flinkpravegawriter"><code>FlinkPravegaWriter</code></a></li> + </ul> + </li> + </ul> + </li> + <li><a href="#internals-of-reader-and-writer" id="markdown-toc-internals-of-reader-and-writer">Internals of reader and writer</a> <ul> + <li><a href="#checkpoint-integration" id="markdown-toc-checkpoint-integration">Checkpoint integration</a></li> + <li><a href="#end-to-end-exactly-once-semantics" id="markdown-toc-end-to-end-exactly-once-semantics">End-to-end exactly-once semantics</a></li> + </ul> + </li> + <li><a href="#summary" id="markdown-toc-summary">Summary</a></li> + <li><a href="#future-plans" id="markdown-toc-future-plans">Future plans</a></li> +</ul> + +</div> + +<h1 id="basic-usages">Basic usages</h1> + +<h2 id="dependency">Dependency</h2> +<p>To use this connector in your application, add the dependency to your project:</p> + +<div class="highlight"><pre><code class="language-xml"><span class="nt">&lt;dependency&gt;</span> + <span class="nt">&lt;groupId&gt;</span>io.pravega<span class="nt">&lt;/groupId&gt;</span> + <span class="nt">&lt;artifactId&gt;</span>pravega-connectors-flink-1.13_2.12<span class="nt">&lt;/artifactId&gt;</span> + <span class="nt">&lt;version&gt;</span>0.10.1<span class="nt">&lt;/version&gt;</span> +<span class="nt">&lt;/dependency&gt;</span></code></pre></div> + +<p>In the above example,</p> + +<p><code>1.13</code> is the Flink major version which is put in the middle of the artifact name. The Pravega Flink connector maintains compatibility for the <em>three</em> most recent major versions of Flink.</p> + +<p><code>0.10.1</code> is the version that aligns with the Pravega version.</p> + +<p>You can find the latest release with a support matrix on the <a href="https://github.com/pravega/flink-connectors/releases">GitHub Releases page</a>.</p> + +<h2 id="api-introduction">API introduction</h2> + +<h3 id="configurations">Configurations</h3> + +<p>The connector provides a common top-level object <code>PravegaConfig</code> for Pravega connection configurations. The config object automatically configures itself from <em>environment variables</em>, <em>system properties</em> and <em>program arguments</em>.</p> + +<p>The basic controller URI and the default scope can be set like this:</p> + +<table> + <thead> + <tr> + <th>Setting</th> + <th>Environment Variable /<br />System Property /<br />Program Argument</th> + <th>Default Value</th> + </tr> + </thead> + <tbody> + <tr> + <td>Controller URI</td> + <td><code>PRAVEGA_CONTROLLER_URI</code><br /><code>pravega.controller.uri</code><br /><code>--controller</code></td> + <td><code>tcp://localhost:9090</code></td> + </tr> + <tr> + <td>Default Scope</td> + <td><code>PRAVEGA_SCOPE</code><br /><code>pravega.scope</code><br /><code>--scope</code></td> + <td>-</td> + </tr> + </tbody> +</table> + +<p>The recommended way to create an instance of <code>PravegaConfig</code> is the following:</p> + +<div class="highlight"><pre><code class="language-java"><span class="c1">// From default environment</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromDefaults</span><span class="o">();</span> + +<span class="c1">// From program arguments</span> +<span class="n">ParameterTool</span> <span class="n">params</span> <span class="o">=</span> <span class="n">ParameterTool</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">);</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromParams</span><span class="o">(</span><span class="n">params</span><span class="o">);</span> + +<span class="c1">// From user specification</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromDefaults</span><span class="o">()</span> + <span class="o">.</span><span class="na">withControllerURI</span><span class="o">(</span><span class="s">&quot;tcp://...&quot;</span><span class="o">)</span> + <span class="o">.</span><span class="na">withDefaultScope</span><span class="o">(</span><span class="s">&quot;SCOPE-NAME&quot;</span><span class="o">)</span> + <span class="o">.</span><span class="na">withCredentials</span><span class="o">(</span><span class="n">credentials</span><span class="o">)</span> + <span class="o">.</span><span class="na">withHostnameValidation</span><span class="o">(</span><span class="kc">false</span><span class="o">);</span></code></pre></div> + +<h3 id="serializationdeserialization">Serialization/Deserialization</h3> + +<p>Pravega has defined <a href="http://pravega.io/docs/latest/javadoc/clients/io/pravega/client/stream/Serializer.html"><code>io.pravega.client.stream.Serializer</code></a> for the serialization/deserialization, while Flink has also defined standard interfaces for the purpose.</p> + +<ul> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/SerializationSchema.html"><code>org.apache.flink.api.common.serialization.SerializationSchema</code></a></li> + <li><a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/api/common/serialization/DeserializationSchema.html"><code>org.apache.flink.api.common.serialization.DeserializationSchema</code></a></li> +</ul> + +<p>For interoperability with other pravega client applications, we have built-in adapters <code>PravegaSerializationSchema</code> and <code>PravegaDeserializationSchema</code> to support processing Pravega stream data produced by a non-Flink application.</p> + +<p>Here is the adapter for Pravega Java serializer:</p> + +<div class="highlight"><pre><code class="language-java"><span class="kn">import</span> <span class="nn">io.pravega.client.stream.impl.JavaSerializer</span><span class="o">;</span> +<span class="o">...</span> +<span class="n">DeserializationSchema</span><span class="o">&lt;</span><span class="n">MyEvent</span><span class="o">&gt;</span> <span class="n">adapter</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PravegaDeserializationSchema</span><span class="o">&lt;& [...] + <span class="n">MyEvent</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="k">new</span> <span class="n">JavaSerializer</span><span class="o">&lt;</span><span class="n">MyEvent</span><span class="o">&gt;());</span></code></pre& [...] + +<h3 id="flinkpravegareader"><code>FlinkPravegaReader</code></h3> + +<p><code>FlinkPravegaReader</code> is a Flink <code>SourceFunction</code> implementation which supports parallel reads from one or more Pravega streams. Internally, it initiates a Pravega reader group and creates Pravega <code>EventStreamReader</code> instances to read the data from the stream(s). It provides a builder-style API to construct, and can allow streamcuts to mark the start and end of the read.</p> + +<p>You can use it like this:</p> + +<div class="highlight"><pre><code class="language-java"><span class="kd">final</span> <span class="n">StreamExecutionEnvironment</span> <span class="n">env</span> <span class="o">=</span> <span class="n">StreamExecutionEnvironment</span><span class="o">.</span><span class="na">getExecutionEnvironment</ [...] + +<span class="c1">// Enable Flink checkpoint to make state fault tolerant</span> +<span class="n">env</span><span class="o">.</span><span class="na">enableCheckpointing</span><span class="o">(</span><span class="mi">60000</span><span class="o">);</span> + +<span class="c1">// Define the Pravega configuration</span> +<span class="n">ParameterTool</span> <span class="n">params</span> <span class="o">=</span> <span class="n">ParameterTool</span><span class="o">.</span><span class="na">fromArgs</span><span class="o">(</span><span class="n">args</span><span class="o">);</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromParams</span><span class="o">(</span><span class="n">params</span><span class="o">);</span> + +<span class="c1">// Define the event deserializer</span> +<span class="n">DeserializationSchema</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">deserializer</span> <span class="o">=</span> <span class="o">...</span> + +<span class="c1">// Define the data stream</span> +<span class="n">FlinkPravegaReader</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">pravegaSource</span> <span class="o">=</span> <span class="n">FlinkPravegaReader</span><span class="o">.&lt;</span><span class="n">MyClass</spa [...] + <span class="o">.</span><span class="na">forStream</span><span class="o">(...)</span> + <span class="o">.</span><span class="na">withPravegaConfig</span><span class="o">(</span><span class="n">config</span><span class="o">)</span> + <span class="o">.</span><span class="na">withDeserializationSchema</span><span class="o">(</span><span class="n">deserializer</span><span class="o">)</span> + <span class="o">.</span><span class="na">build</span><span class="o">();</span> +<span class="n">DataStream</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">stream</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="na">addSource</span><span class="o"&g [...] + <span class="o">.</span><span class="na">setParallelism</span><span class="o">(</span><span class="mi">4</span><span class="o">)</span> + <span class="o">.</span><span class="na">uid</span><span class="o">(</span><span class="s">&quot;pravega-source&quot;</span><span class="o">);</span></code></pre></div> + +<h3 id="flinkpravegawriter"><code>FlinkPravegaWriter</code></h3> + +<p><code>FlinkPravegaWriter</code> is a Flink <code>SinkFunction</code> implementation which supports parallel writes to Pravega streams.</p> + +<p>It supports three writer modes that relate to guarantees about the persistence of events emitted by the sink to a Pravega Stream:</p> + +<ol> + <li><strong>Best-effort</strong> - Any write failures will be ignored and there could be data loss.</li> + <li><strong>At-least-once</strong>(default) - All events are persisted in Pravega. Duplicate events are possible, due to retries or in case of failure and subsequent recovery.</li> + <li><strong>Exactly-once</strong> - All events are persisted in Pravega using a transactional approach integrated with the Flink checkpointing feature.</li> +</ol> + +<p>Internally, it will initiate several Pravega <code>EventStreamWriter</code> or <code>TransactionalEventStreamWriter</code> (depends on the writer mode) instances to write data to the stream. It provides a builder-style API to construct.</p> + +<p>A basic usage looks like this:</p> + +<div class="highlight"><pre><code class="language-java"><span class="n">StreamExecutionEnvironment</span> <span class="n">env</span> <span class="o">=</span> <span class="n">StreamExecutionEnvironment</span><span class="o">.</span><span class="na">getExecutionEnvironment</span><span class="o">();</span> + +<span class="c1">// Define the Pravega configuration</span> +<span class="n">PravegaConfig</span> <span class="n">config</span> <span class="o">=</span> <span class="n">PravegaConfig</span><span class="o">.</span><span class="na">fromParams</span><span class="o">(</span><span class="n">params</span><span class="o">);</span> + +<span class="c1">// Define the event serializer</span> +<span class="n">SerializationSchema</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">serializer</span> <span class="o">=</span> <span class="o">...</span> + +<span class="c1">// Define the event router for selecting the Routing Key</span> +<span class="n">PravegaEventRouter</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">router</span> <span class="o">=</span> <span class="o">...</span> + +<span class="c1">// Define the sink function</span> +<span class="n">FlinkPravegaWriter</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">pravegaSink</span> <span class="o">=</span> <span class="n">FlinkPravegaWriter</span><span class="o">.&lt;</span><span class="n">MyClass</span& [...] + <span class="o">.</span><span class="na">forStream</span><span class="o">(...)</span> + <span class="o">.</span><span class="na">withPravegaConfig</span><span class="o">(</span><span class="n">config</span><span class="o">)</span> + <span class="o">.</span><span class="na">withSerializationSchema</span><span class="o">(</span><span class="n">serializer</span><span class="o">)</span> + <span class="o">.</span><span class="na">withEventRouter</span><span class="o">(</span><span class="n">router</span><span class="o">)</span> + <span class="o">.</span><span class="na">withWriterMode</span><span class="o">(</span><span class="n">EXACTLY_ONCE</span><span class="o">)</span> + <span class="o">.</span><span class="na">build</span><span class="o">();</span> + +<span class="n">DataStream</span><span class="o">&lt;</span><span class="n">MyClass</span><span class="o">&gt;</span> <span class="n">stream</span> <span class="o">=</span> <span class="o">...</span> +<span class="n">stream</span><span class="o">.</span><span class="na">addSink</span><span class="o">(</span><span class="n">pravegaSink</span><span class="o">)</span> + <span class="o">.</span><span class="na">setParallelism</span><span class="o">(</span><span class="mi">4</span><span class="o">)</span> + <span class="o">.</span><span class="na">uid</span><span class="o">(</span><span class="s">&quot;pravega-sink&quot;</span><span class="o">);</span></code></pre></div> + +<p>You can see some more examples <a href="https://github.com/pravega/pravega-samples">here</a>.</p> + +<h1 id="internals-of-reader-and-writer">Internals of reader and writer</h1> + +<h2 id="checkpoint-integration">Checkpoint integration</h2> + +<p>Flink has periodic checkpoints based on the Chandy-Lamport algorithm to make state in Flink fault-tolerant. By allowing state and the corresponding stream positions to be recovered, the application is given the same semantics as a failure-free execution.</p> + +<p>Pravega also has its own Checkpoint concept which is to create a consistent “point in time” persistence of the state of each Reader in the Reader Group, by using a specialized Event (<em>Checkpoint Event</em>) to signal each Reader to preserve its state. Once a Checkpoint has been completed, the application can use the Checkpoint to reset all the Readers in the Reader Group to the known consistent state represented by the Checkpoint.</p> + +<p>This means that our end-to-end recovery story is not like other messaging systems such as Kafka, which uses a more coupled method and persists its offset in the Flink task state and lets Flink do the coordination. Flink delegates the Pravega source recovery completely to the Pravega server and uses only a lightweight hook to connect. We collaborated with the Flink community and added a new interface <code>ExternallyInducedSource</code> (<a href="https://issue [...] + +<p>The checkpoint mechanism works as a two-step process:</p> + +<ul> + <li> + <p>The <a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html">master hook</a> handler from the JobManager initiates the <a href="https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/runtime/checkpoint/MasterTriggerRestoreHook.html#triggerCheckpoint-long-long-java.util.concurrent.Executor-"><code>triggerCheckpoint</code&g [...] + </li> + <li> + <p>A <code>Checkpoint</code> event will be sent by Pravega as part of the data stream flow and, upon receiving the event, the <code>FlinkPravegaReader</code> will initiate a <a href="https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/checkpoint/ExternallyInducedSource.java#L73"><code>triggerCheckpoint</code></a> request to effectively let Flink continue and compl [...] + </li> +</ul> + +<h2 id="end-to-end-exactly-once-semantics">End-to-end exactly-once semantics</h2> + +<p>In the early years of big data processing, results from real-time stream processing were always considered inaccurate/approximate/speculative. However, this correctness is extremely important for some use cases and in some industries such as finance.</p> + +<p>This constraint stems mainly from two issues:</p> + +<ul> + <li>unordered data source in event time</li> + <li>end-to-end exactly-once semantics guarantee</li> +</ul> + +<p>During recent years of development, watermarking has been introduced as a tradeoff between correctness and latency, which is now considered a good solution for unordered data sources in event time.</p> + +<p>The guarantee of end-to-end exactly-once semantics is more tricky. When we say “exactly-once semantics”, what we mean is that each incoming event affects the final results exactly once. Even in the event of a machine or software failure, there is no duplicate data and no data that goes unprocessed. This is quite difficult because of the demands of message acknowledgment and recovery during such fast processing and is also why some early distributed streaming engines like Storm(w [...] + +<p>Flink is one of the first streaming systems that was able to provide exactly-once semantics due to its delicate <a href="https://www.ververica.com/blog/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink">checkpoint mechanism</a>. But to make it work end-to-end, the final stage needs to apply the semantic to external message system sinks that support commits and rollbacks.</p> + +<p>To work around this problem, Pravega introduced <a href="https://cncf.pravega.io/docs/latest/transactions/">transactional writes</a>. A Pravega transaction allows an application to prepare a set of events that can be written “all at once” to a Stream. This allows an application to “commit” a bunch of events atomically. When writes are idempotent, it is possible to implement end-to-end exactly-once pipelines together with Flink.</p> + +<p>To build such an end-to-end solution requires coordination between Flink and the Pravega sink, which is still challenging. A common approach for coordinating commits and rollbacks in a distributed system is the two-phase commit protocol. We used this protocol and, together with the Flink community, implemented the sink function in a two-phase commit way coordinated with Flink checkpoints.</p> + +<p>The Flink community then extracted the common logic from the two-phase commit protocol and provided a general interface <code>TwoPhaseCommitSinkFunction</code> (<a href="https://issues.apache.org/jira/browse/FLINK-7210">FLINK-7210</a>) to make it possible to build end-to-end exactly-once applications with other message systems that have transaction support. This includes Apache Kafka versions 0.11 and above. There is an official Flink <a href [...] + +<h1 id="summary">Summary</h1> +<p>The Pravega Flink connector enables Pravega to connect to Flink and allows Pravega to act as a key data store in a streaming pipeline. Both projects share a common design philosophy and can integrate well with each other. Pravega has its own concept of checkpointing and has implemented transactional writes to support end-to-end exactly-once guarantees.</p> + +<h1 id="future-plans">Future plans</h1> + +<p><code>FlinkPravegaInputFormat</code> and <code>FlinkPravegaOutputFormat</code> are now provided to support batch reads and writes in Flink, but these are under the legacy DataSet API. Since Flink is now making efforts to unify batch and streaming, it is improving its APIs and providing new interfaces for the <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface">source</a> and <a href=&quo [...] + +<p>We will also put more effort into SQL / Table API support in order to provide a better user experience since it is simpler to understand and even more powerful to use in some cases.</p> + +<p><strong>Note:</strong> the original blog post can be found <a href="https://cncf.pravega.io/blog/2021/11/01/pravega-flink-connector-101/">here</a>.</p> +</description> +<pubDate>Thu, 20 Jan 2022 01:00:00 +0100</pubDate> +<link>https://flink.apache.org/2022/01/20/pravega-connector-101.html</link> +<guid isPermaLink="true">/2022/01/20/pravega-connector-101.html</guid> +</item> + +<item> <title>Apache Flink 1.14.3 Release Announcement</title> <description><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.14 series. The first bugfix release was 1.14.2, being an emergency release due to an Apache Log4j Zero Day (CVE-2021-44228). Flink 1.14.1 was abandoned. @@ -20026,102 +20283,5 @@ for a full reference of Flink’s metrics system.</p> <guid isPermaLink="true">/news/2019/02/25/monitoring-best-practices.html</guid> </item> -<item> -<title>Apache Flink 1.6.4 Released</title> -<description><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.6 series.</p> - -<p>This release includes more than 25 fixes and minor improvements for Flink 1.6.3. The list below includes a detailed list of all fixes.</p> - -<p>We highly recommend all users to upgrade to Flink 1.6.4.</p> - -<p>Updated Maven dependencies:</p> - -<div class="highlight"><pre><code class="language-xml"><span class="nt">&lt;dependency&gt;</span> - <span class="nt">&lt;groupId&gt;</span>org.apache.flink<span class="nt">&lt;/groupId&gt;</span> - <span class="nt">&lt;artifactId&gt;</span>flink-java<span class="nt">&lt;/artifactId&gt;</span> - <span class="nt">&lt;version&gt;</span>1.6.4<span class="nt">&lt;/version&gt;</span> -<span class="nt">&lt;/dependency&gt;</span> -<span class="nt">&lt;dependency&gt;</span> - <span class="nt">&lt;groupId&gt;</span>org.apache.flink<span class="nt">&lt;/groupId&gt;</span> - <span class="nt">&lt;artifactId&gt;</span>flink-streaming-java_2.11<span class="nt">&lt;/artifactId&gt;</span> - <span class="nt">&lt;version&gt;</span>1.6.4<span class="nt">&lt;/version&gt;</span> -<span class="nt">&lt;/dependency&gt;</span> -<span class="nt">&lt;dependency&gt;</span> - <span class="nt">&lt;groupId&gt;</span>org.apache.flink<span class="nt">&lt;/groupId&gt;</span> - <span class="nt">&lt;artifactId&gt;</span>flink-clients_2.11<span class="nt">&lt;/artifactId&gt;</span> - <span class="nt">&lt;version&gt;</span>1.6.4<span class="nt">&lt;/version&gt;</span> -<span class="nt">&lt;/dependency&gt;</span></code></pre></div> - -<p>You can find the binaries on the updated <a href="/downloads.html">Downloads page</a>.</p> - -<p>List of resolved issues:</p> - -<h2> Bug -</h2> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-10721">FLINK-10721</a>] - Kafka discovery-loop exceptions may be swallowed -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-10761">FLINK-10761</a>] - MetricGroup#getAllVariables can deadlock -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-10774">FLINK-10774</a>] - connection leak when partition discovery is disabled and open throws exception -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-10848">FLINK-10848</a>] - Flink&#39;s Yarn ResourceManager can allocate too many excess containers -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11022">FLINK-11022</a>] - Update LICENSE and NOTICE files for older releases -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11071">FLINK-11071</a>] - Dynamic proxy classes cannot be resolved when deserializing job graph -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11084">FLINK-11084</a>] - Incorrect ouput after two consecutive split and select -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11119">FLINK-11119</a>] - Incorrect Scala example for Table Function -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11134">FLINK-11134</a>] - Invalid REST API request should not log the full exception in Flink logs -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11151">FLINK-11151</a>] - FileUploadHandler stops working if the upload directory is removed -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11173">FLINK-11173</a>] - Proctime attribute validation throws an incorrect exception message -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11224">FLINK-11224</a>] - Log is missing in scala-shell -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11232">FLINK-11232</a>] - Empty Start Time of sub-task on web dashboard -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11234">FLINK-11234</a>] - ExternalTableCatalogBuilder unable to build a batch-only table -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11235">FLINK-11235</a>] - Elasticsearch connector leaks threads if no connection could be established -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11251">FLINK-11251</a>] - Incompatible metric name on prometheus reporter -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11389">FLINK-11389</a>] - Incorrectly use job information when call getSerializedTaskInformation in class TaskDeploymentDescriptor -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11584">FLINK-11584</a>] - ConfigDocsCompletenessITCase fails DescriptionBuilder#linebreak() is used -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11585">FLINK-11585</a>] - Prefix matching in ConfigDocsGenerator can result in wrong assignments -</li> -</ul> - -<h2> Improvement -</h2> -<ul> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-10910">FLINK-10910</a>] - Harden Kubernetes e2e test -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11079">FLINK-11079</a>] - Skip deployment for flnk-storm-examples -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11207">FLINK-11207</a>] - Update Apache commons-compress from 1.4.1 to 1.18 -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11262">FLINK-11262</a>] - Bump jython-standalone to 2.7.1 -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11289">FLINK-11289</a>] - Rework example module structure to account for licensing -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11304">FLINK-11304</a>] - Typo in time attributes doc -</li> -<li>[<a href="https://issues.apache.org/jira/browse/FLINK-11469">FLINK-11469</a>] - fix Tuning Checkpoints and Large State doc -</li> -</ul> -</description> -<pubDate>Mon, 25 Feb 2019 01:00:00 +0100</pubDate> -<link>https://flink.apache.org/news/2019/02/25/release-1.6.4.html</link> -<guid isPermaLink="true">/news/2019/02/25/release-1.6.4.html</guid> -</item> - </channel> </rss> diff --git a/content/blog/index.html b/content/blog/index.html index 92d6f2d..9c1226f 100644 --- a/content/blog/index.html +++ b/content/blog/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></h2> + + <p>20 Jan 2022 + Yumin Zhou (Brian) (<a href="https://twitter.com/crazy__zhou">@crazy__zhou</a>)</p> + + <p>A brief introduction to the Pravega Flink Connector</p> + + <p><a href="/2022/01/20/pravega-connector-101.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></h2> <p>17 Jan 2022 @@ -321,19 +334,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2021/10/26/sort-shuffle-part1.html">Sort-Based Blocking Shuffle Implementation in Flink - Part One</a></h2> - - <p>26 Oct 2021 - Yingjie Cao (Kevin) & Daisy Tsang </p> - - <p>Flink has implemented the sort-based blocking shuffle (FLIP-148) for batch data processing. In this blog post, we will take a close look at the design & implementation details and see what we can gain from it.</p> - - <p><a href="/2021/10/26/sort-shuffle-part1.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -366,6 +366,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html index 880bc61..4c25b09 100644 --- a/content/blog/page10/index.html +++ b/content/blog/page10/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A Practical Guide to Broadcast State in Apache Flink</a></h2> + + <p>26 Jun 2019 + Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p> + + <p>Apache Flink has multiple types of operator state, one of which is called Broadcast State. In this post, we explain what Broadcast State is, and show an example of how it can be applied to an application that evaluates dynamic patterns on an event stream.</p> + + <p><a href="/2019/06/26/broadcast-state.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/2019/06/05/flink-network-stack.html">A Deep-Dive into Flink's Network Stack</a></h2> <p>05 Jun 2019 @@ -325,21 +338,6 @@ for more details.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2> - - <p>25 Feb 2019 - </p> - - <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.6 series.</p> - -</p> - - <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -372,6 +370,16 @@ for more details.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html index 05703d1..50cf1a2 100644 --- a/content/blog/page11/index.html +++ b/content/blog/page11/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2> + + <p>25 Feb 2019 + </p> + + <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.6 series.</p> + +</p> + + <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2019/02/15/release-1.7.2.html">Apache Flink 1.7.2 Released</a></h2> <p>15 Feb 2019 @@ -335,21 +350,6 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa <hr> - <article> - <h2 class="blog-title"><a href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2> - - <p>20 Sep 2018 - </p> - - <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.5 series.</p> - -</p> - - <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -382,6 +382,16 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html index 7ce300f..1049ee7 100644 --- a/content/blog/page12/index.html +++ b/content/blog/page12/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2> + + <p>20 Sep 2018 + </p> + + <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.5 series.</p> + +</p> + + <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2> <p>21 Aug 2018 @@ -333,19 +348,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State in Apache Flink: An Intro to Incremental Checkpointing</a></h2> - - <p>30 Jan 2018 - Stefan Ricther (<a href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) & Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p> - - <p>Flink 1.3.0 introduced incremental checkpointing, making it possible for applications with large state to generate checkpoints more efficiently.</p> - - <p><a href="/features/2018/01/30/incremental-checkpointing.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -378,6 +380,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html index 547f587..a6c6c11 100644 --- a/content/blog/page13/index.html +++ b/content/blog/page13/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State in Apache Flink: An Intro to Incremental Checkpointing</a></h2> + + <p>30 Jan 2018 + Stefan Ricther (<a href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) & Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p> + + <p>Flink 1.3.0 introduced incremental checkpointing, making it possible for applications with large state to generate checkpoints more efficiently.</p> + + <p><a href="/features/2018/01/30/incremental-checkpointing.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2017/12/21/2017-year-in-review.html">Apache Flink in 2017: Year in Review</a></h2> <p>21 Dec 2017 @@ -336,20 +349,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community <hr> - <article> - <h2 class="blog-title"><a href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic Tables</a></h2> - - <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang - </p> - - <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs for stream and batch processing, meaning that a query produces the same result when being evaluated on streaming or static data.</p> -<p>In this blog post we discuss the future of these APIs and introduce the concept of Dynamic Tables. Dynamic tables will significantly expand the scope of the Table API and SQL on streams and enable many more advanced use cases. We discuss how streams and dynamic tables relate to each other and explain the semantics of continuously evaluating queries on dynamic tables.</p></p> - - <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -382,6 +381,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page14/index.html b/content/blog/page14/index.html index 7cfd096..4d8bac0 100644 --- a/content/blog/page14/index.html +++ b/content/blog/page14/index.html @@ -201,6 +201,20 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic Tables</a></h2> + + <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang + </p> + + <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs for stream and batch processing, meaning that a query produces the same result when being evaluated on streaming or static data.</p> +<p>In this blog post we discuss the future of these APIs and introduce the concept of Dynamic Tables. Dynamic tables will significantly expand the scope of the Table API and SQL on streams and enable many more advanced use cases. We discuss how streams and dynamic tables relate to each other and explain the semantics of continuously evaluating queries on dynamic tables.</p></p> + + <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2017/03/29/table-sql-api-update.html">From Streams to Tables and Back Again: An Update on Flink's Table & SQL API</a></h2> <p>29 Mar 2017 by Timo Walther (<a href="https://twitter.com/">@twalthr</a>) @@ -329,21 +343,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink 1.1.0</a></h2> - - <p>08 Aug 2016 - </p> - - <p><div class="alert alert-success"><strong>Important</strong>: The Maven artifacts published with version 1.1.0 on Maven central have a Hadoop dependency issue. It is highly recommended to use <strong>1.1.1</strong> or <strong>1.1.1-hadoop1</strong> as the Flink version.</div> - -</p> - - <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -376,6 +375,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page15/index.html b/content/blog/page15/index.html index 7b1d8dc..58a760a 100644 --- a/content/blog/page15/index.html +++ b/content/blog/page15/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink 1.1.0</a></h2> + + <p>08 Aug 2016 + </p> + + <p><div class="alert alert-success"><strong>Important</strong>: The Maven artifacts published with version 1.1.0 on Maven central have a Hadoop dependency issue. It is highly recommended to use <strong>1.1.1</strong> or <strong>1.1.1-hadoop1</strong> as the Flink version.</div> + +</p> + + <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2016/05/24/stream-sql.html">Stream Processing for Everyone with SQL and Apache Flink</a></h2> <p>24 May 2016 by Fabian Hueske (<a href="https://twitter.com/">@fhueske</a>) @@ -330,19 +345,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache Flink: How to run existing Storm topologies on Flink</a></h2> - - <p>11 Dec 2015 by Matthias J. Sax (<a href="https://twitter.com/">@MatthiasJSax</a>) - </p> - - <p>In this blog post, we describe Flink's compatibility package for <a href="https://storm.apache.org">Apache Storm</a> that allows to embed Spouts (sources) and Bolts (operators) in a regular Flink streaming job. Furthermore, the compatibility package provides a Storm compatible API in order to execute whole Storm topologies with (almost) no code adaption.</p> - - <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -375,6 +377,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page16/index.html b/content/blog/page16/index.html index ea1de5d..bdec7b2 100644 --- a/content/blog/page16/index.html +++ b/content/blog/page16/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache Flink: How to run existing Storm topologies on Flink</a></h2> + + <p>11 Dec 2015 by Matthias J. Sax (<a href="https://twitter.com/">@MatthiasJSax</a>) + </p> + + <p>In this blog post, we describe Flink's compatibility package for <a href="https://storm.apache.org">Apache Storm</a> that allows to embed Spouts (sources) and Bolts (operators) in a regular Flink streaming job. Furthermore, the compatibility package provides a Storm compatible API in order to execute whole Storm topologies with (almost) no code adaption.</p> + + <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2015/12/04/Introducing-windows.html">Introducing Stream Windows in Apache Flink</a></h2> <p>04 Dec 2015 by Fabian Hueske (<a href="https://twitter.com/">@fhueske</a>) @@ -338,20 +351,6 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits and Bytes</a></h2> - - <p>11 May 2015 by Fabian Hüske (<a href="https://twitter.com/">@fhueske</a>) - </p> - - <p><p>Nowadays, a lot of open-source systems for analyzing large data sets are implemented in Java or other JVM-based programming languages. The most well-known example is Apache Hadoop, but also newer frameworks such as Apache Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that JVM-based data analysis engines face is to store large amounts of data in memory - both for caching and for efficient processing such as sorting and joining of data. Managing the [...] -<p>In this blog post we discuss how Apache Flink manages memory, talk about its custom data de/serialization stack, and show how it operates on binary data.</p></p> - - <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -384,6 +383,16 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page17/index.html b/content/blog/page17/index.html index e8411dc..aaf0bac 100644 --- a/content/blog/page17/index.html +++ b/content/blog/page17/index.html @@ -201,6 +201,20 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits and Bytes</a></h2> + + <p>11 May 2015 by Fabian Hüske (<a href="https://twitter.com/">@fhueske</a>) + </p> + + <p><p>Nowadays, a lot of open-source systems for analyzing large data sets are implemented in Java or other JVM-based programming languages. The most well-known example is Apache Hadoop, but also newer frameworks such as Apache Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that JVM-based data analysis engines face is to store large amounts of data in memory - both for caching and for efficient processing such as sorting and joining of data. Managing the [...] +<p>In this blog post we discuss how Apache Flink manages memory, talk about its custom data de/serialization stack, and show how it operates on binary data.</p></p> + + <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2015/04/13/release-0.9.0-milestone1.html">Announcing Flink 0.9.0-milestone1 preview release</a></h2> <p>13 Apr 2015 @@ -345,21 +359,6 @@ and offers a new API including definition of flexible windows.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2> - - <p>04 Nov 2014 - </p> - - <p><p>We are pleased to announce the availability of Flink 0.7.0. This release includes new user-facing features as well as performance and bug fixes, brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total of 34 people have contributed to this release, a big thanks to all of them!</p> - -</p> - - <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -392,6 +391,16 @@ and offers a new API including definition of flexible windows.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page18/index.html b/content/blog/page18/index.html index 870d5b4..427ce18 100644 --- a/content/blog/page18/index.html +++ b/content/blog/page18/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2> + + <p>04 Nov 2014 + </p> + + <p><p>We are pleased to announce the availability of Flink 0.7.0. This release includes new user-facing features as well as performance and bug fixes, brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total of 34 people have contributed to this release, a big thanks to all of them!</p> + +</p> + + <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2014/10/03/upcoming_events.html">Upcoming Events</a></h2> <p>03 Oct 2014 @@ -280,6 +295,16 @@ academic and open source project that Flink originates from.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html index 407187c..cf244b9 100644 --- a/content/blog/page2/index.html +++ b/content/blog/page2/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2021/10/26/sort-shuffle-part1.html">Sort-Based Blocking Shuffle Implementation in Flink - Part One</a></h2> + + <p>26 Oct 2021 + Yingjie Cao (Kevin) & Daisy Tsang </p> + + <p>Flink has implemented the sort-based blocking shuffle (FLIP-148) for batch data processing. In this blog post, we will take a close look at the design & implementation details and see what we can gain from it.</p> + + <p><a href="/2021/10/26/sort-shuffle-part1.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2021/10/19/release-1.13.3.html">Apache Flink 1.13.3 Released</a></h2> <p>19 Oct 2021 @@ -341,19 +354,6 @@ This new release brings various improvements to the StateFun runtime, a leaner w <hr> - <article> - <h2 class="blog-title"><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></h2> - - <p>07 Jul 2021 - Piotr Nowojski (<a href="https://twitter.com/PiotrNowojski">@PiotrNowojski</a>)</p> - - <p>Apache Flink 1.13 introduced a couple of important changes in the area of backpressure monitoring and performance analysis of Flink Jobs. This blog post aims to introduce those changes and explain how to use them.</p> - - <p><a href="/2021/07/07/backpressure.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -386,6 +386,16 @@ This new release brings various improvements to the StateFun runtime, a leaner w <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html index 59dc953..fd2b672 100644 --- a/content/blog/page3/index.html +++ b/content/blog/page3/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2021/07/07/backpressure.html">How to identify the source of backpressure?</a></h2> + + <p>07 Jul 2021 + Piotr Nowojski (<a href="https://twitter.com/PiotrNowojski">@PiotrNowojski</a>)</p> + + <p>Apache Flink 1.13 introduced a couple of important changes in the area of backpressure monitoring and performance analysis of Flink Jobs. This blog post aims to introduce those changes and explain how to use them.</p> + + <p><a href="/2021/07/07/backpressure.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2021/05/28/release-1.13.1.html">Apache Flink 1.13.1 Released</a></h2> <p>28 May 2021 @@ -329,21 +342,6 @@ to develop scalable, consistent, and elastic distributed applications.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2021/01/29/release-1.10.3.html">Apache Flink 1.10.3 Released</a></h2> - - <p>29 Jan 2021 - Xintong Song </p> - - <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.10 series.</p> - -</p> - - <p><a href="/news/2021/01/29/release-1.10.3.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -376,6 +374,16 @@ to develop scalable, consistent, and elastic distributed applications.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html index 40b62c6..0d55ab1 100644 --- a/content/blog/page4/index.html +++ b/content/blog/page4/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2021/01/29/release-1.10.3.html">Apache Flink 1.10.3 Released</a></h2> + + <p>29 Jan 2021 + Xintong Song </p> + + <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.10 series.</p> + +</p> + + <p><a href="/news/2021/01/29/release-1.10.3.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2021/01/19/release-1.12.1.html">Apache Flink 1.12.1 Released</a></h2> <p>19 Jan 2021 @@ -325,19 +340,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure</a></h2> - - <p>15 Oct 2020 - Arvid Heise & Stephan Ewen </p> - - <p>Apache Flink’s checkpoint-based fault tolerance mechanism is one of its defining features. Because of that design, Flink unifies batch and stream processing, can easily scale to both very small and extremely large scenarios and provides support for many operational features. In this post we recap the original checkpointing process in Flink, its core properties and issues under backpressure.</p> - - <p><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -370,6 +372,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html index 626159e..d6e2af0 100644 --- a/content/blog/page5/index.html +++ b/content/blog/page5/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure</a></h2> + + <p>15 Oct 2020 + Arvid Heise & Stephan Ewen </p> + + <p>Apache Flink’s checkpoint-based fault tolerance mechanism is one of its defining features. Because of that design, Flink unifies batch and stream processing, can easily scale to both very small and extremely large scenarios and provides support for many operational features. In this post we recap the original checkpointing process in Flink, its core properties and issues under backpressure.</p> + + <p><a href="/2020/10/15/from-aligned-to-unaligned-checkpoints-part-1.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/10/13/stateful-serverless-internals.html">Stateful Functions Internals: Behind the scenes of Stateful Serverless</a></h2> <p>13 Oct 2020 @@ -329,19 +342,6 @@ as well as increased observability for operational purposes.</p> <hr> - <article> - <h2 class="blog-title"><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">PyFlink: The integration of Pandas into PyFlink</a></h2> - - <p>04 Aug 2020 - Jincheng Sun (<a href="https://twitter.com/sunjincheng121">@sunjincheng121</a>) & Markos Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p> - - <p>The Apache Flink community put some great effort into integrating Pandas with PyFlink in the latest Flink version 1.11. Some of the added features include support for Pandas UDF and the conversion between Pandas DataFrame and Table. In this article, we will introduce how these functionalities work and how to use them with a step-by-step example.</p> - - <p><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -374,6 +374,16 @@ as well as increased observability for operational purposes.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html index 3098079..af18872 100644 --- a/content/blog/page6/index.html +++ b/content/blog/page6/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">PyFlink: The integration of Pandas into PyFlink</a></h2> + + <p>04 Aug 2020 + Jincheng Sun (<a href="https://twitter.com/sunjincheng121">@sunjincheng121</a>) & Markos Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p> + + <p>The Apache Flink community put some great effort into integrating Pandas with PyFlink in the latest Flink version 1.11. Some of the added features include support for Pandas UDF and the conversion between Pandas DataFrame and Table. In this article, we will introduce how these functionalities work and how to use them with a step-by-step example.</p> + + <p><a href="/2020/08/04/pyflink-pandas-udf-support-flink.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced Flink Application Patterns Vol.3: Custom Window Processing</a></h2> <p>30 Jul 2020 @@ -335,19 +348,6 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p> <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/06/11/community-update.html">Flink Community Update - June'20</a></h2> - - <p>11 Jun 2020 - Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> - - <p>And suddenly it’s June. The previous month has been calm on the surface, but quite hectic underneath — the final testing phase for Flink 1.11 is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink has made it into Google Season of Docs 2020.</p> - - <p><a href="/news/2020/06/11/community-update.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -380,6 +380,16 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p> <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html index 24a11ad..96d9aa4 100644 --- a/content/blog/page7/index.html +++ b/content/blog/page7/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/06/11/community-update.html">Flink Community Update - June'20</a></h2> + + <p>11 Jun 2020 + Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> + + <p>And suddenly it’s June. The previous month has been calm on the surface, but quite hectic underneath — the final testing phase for Flink 1.11 is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink has made it into Google Season of Docs 2020.</p> + + <p><a href="/news/2020/06/11/community-update.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2020/06/09/release-statefun-2.1.0.html">Stateful Functions 2.1.0 Release Announcement</a></h2> <p>09 Jun 2020 @@ -326,19 +339,6 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up <hr> - <article> - <h2 class="blog-title"><a href="/news/2020/04/01/community-update.html">Flink Community Update - April'20</a></h2> - - <p>01 Apr 2020 - Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> - - <p>While things slow down around us, the Apache Flink community is privileged to remain as active as ever. This blogpost combs through the past few months to give you an update on the state of things in Flink — from core releases to Stateful Functions; from some good old community stats to a new development blog.</p> - - <p><a href="/news/2020/04/01/community-update.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -371,6 +371,16 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html index 39bfecd..2229ac9 100644 --- a/content/blog/page8/index.html +++ b/content/blog/page8/index.html @@ -201,6 +201,19 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2020/04/01/community-update.html">Flink Community Update - April'20</a></h2> + + <p>01 Apr 2020 + Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p> + + <p>While things slow down around us, the Apache Flink community is privileged to remain as active as ever. This blogpost combs through the past few months to give you an update on the state of things in Flink — from core releases to Stateful Functions; from some good old community stats to a new development blog.</p> + + <p><a href="/news/2020/04/01/community-update.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/features/2020/03/27/flink-for-data-warehouse.html">Flink as Unified Engine for Modern Data Warehousing: Production-Ready Hive Integration</a></h2> <p>27 Mar 2020 @@ -323,21 +336,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2> - - <p>11 Dec 2019 - Hequn Cheng </p> - - <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.8 series.</p> - -</p> - - <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -370,6 +368,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html index fbf0c30..869cf9b 100644 --- a/content/blog/page9/index.html +++ b/content/blog/page9/index.html @@ -201,6 +201,21 @@ <!-- Blog posts --> <article> + <h2 class="blog-title"><a href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2> + + <p>11 Dec 2019 + Hequn Cheng </p> + + <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.8 series.</p> + +</p> + + <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading »</a></p> + </article> + + <hr> + + <article> <h2 class="blog-title"><a href="/news/2019/12/09/flink-kubernetes-kudo.html">Running Apache Flink on Kubernetes with KUDO</a></h2> <p>09 Dec 2019 @@ -326,19 +341,6 @@ <hr> - <article> - <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A Practical Guide to Broadcast State in Apache Flink</a></h2> - - <p>26 Jun 2019 - Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p> - - <p>Apache Flink has multiple types of operator state, one of which is called Broadcast State. In this post, we explain what Broadcast State is, and show an example of how it can be applied to an application that evaluates dynamic patterns on an event stream.</p> - - <p><a href="/2019/06/26/broadcast-state.html">Continue reading »</a></p> - </article> - - <hr> - <!-- Pagination links --> @@ -371,6 +373,16 @@ <ul id="markdown-toc"> + <li><a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></li> + + + + + + + + + <li><a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></li> diff --git a/content/index.html b/content/index.html index 20e1220..61e6e81 100644 --- a/content/index.html +++ b/content/index.html @@ -365,6 +365,9 @@ <dl> + <dt> <a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></dt> + <dd>A brief introduction to the Pravega Flink Connector</dd> + <dt> <a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></dt> <dd>The Apache Flink community released the second bugfix version of the Apache Flink 1.14 series.</dd> @@ -376,11 +379,6 @@ <dt> <a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></dt> <dd>To improve the performance of the scheduler for large-scale jobs, several optimizations were introduced in Flink 1.13 and 1.14. In this blog post we'll take a look at them.</dd> - - <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></dt> - <dd><p>The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Function 3.1.1.</p> - -</dd> </dl> diff --git a/content/zh/index.html b/content/zh/index.html index ad65b66..4e8ecc1 100644 --- a/content/zh/index.html +++ b/content/zh/index.html @@ -362,6 +362,9 @@ <dl> + <dt> <a href="/2022/01/20/pravega-connector-101.html">Pravega Flink Connector 101</a></dt> + <dd>A brief introduction to the Pravega Flink Connector</dd> + <dt> <a href="/news/2022/01/17/release-1.14.3.html">Apache Flink 1.14.3 Release Announcement</a></dt> <dd>The Apache Flink community released the second bugfix version of the Apache Flink 1.14 series.</dd> @@ -373,11 +376,6 @@ <dt> <a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></dt> <dd>To improve the performance of the scheduler for large-scale jobs, several optimizations were introduced in Flink 1.13 and 1.14. In this blog post we'll take a look at them.</dd> - - <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></dt> - <dd><p>The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Function 3.1.1.</p> - -</dd> </dl>