This is an automated email from the ASF dual-hosted git repository.

dianfu pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit 1e216e8661e121dfd4f57c19c66242699d4736ec
Author: Dian Fu <dia...@apache.org>
AuthorDate: Tue Aug 4 14:22:33 2020 +0800

    Rebuild website
---
 .../07/28/pyflink-pandas-udf-support-flink.html    | 463 +++++++++++++++++++++
 content/blog/feed.xml                              | 356 +++++++++-------
 content/blog/index.html                            |  36 +-
 content/blog/page10/index.html                     |  38 +-
 content/blog/page11/index.html                     |  37 +-
 content/blog/page12/index.html                     |  39 +-
 content/blog/page13/index.html                     |  25 ++
 content/blog/page2/index.html                      |  36 +-
 content/blog/page3/index.html                      |  38 +-
 content/blog/page4/index.html                      |  38 +-
 content/blog/page5/index.html                      |  38 +-
 content/blog/page6/index.html                      |  40 +-
 content/blog/page7/index.html                      |  38 +-
 content/blog/page8/index.html                      |  37 +-
 content/blog/page9/index.html                      |  39 +-
 .../mission-of-pyFlink.gif                         | Bin 0 -> 656600 bytes
 .../python-scientific-stack.png                    | Bin 0 -> 535909 bytes
 .../2020-07-28-pyflink-pandas/vm-communication.png | Bin 0 -> 51408 bytes
 content/index.html                                 |   8 +-
 content/zh/index.html                              |   8 +-
 20 files changed, 989 insertions(+), 325 deletions(-)

diff --git a/content/2020/07/28/pyflink-pandas-udf-support-flink.html 
b/content/2020/07/28/pyflink-pandas-udf-support-flink.html
new file mode 100644
index 0000000..05b2e6e
--- /dev/null
+++ b/content/2020/07/28/pyflink-pandas-udf-support-flink.html
@@ -0,0 +1,463 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <title>Apache Flink: PyFlink: The integration of Pandas into 
PyFlink</title>
+    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
+    <link rel="icon" href="/favicon.ico" type="image/x-icon">
+
+    <!-- Bootstrap -->
+    <link rel="stylesheet" 
href="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/css/bootstrap.min.css";>
+    <link rel="stylesheet" href="/css/flink.css">
+    <link rel="stylesheet" href="/css/syntax.css">
+
+    <!-- Blog RSS feed -->
+    <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" 
title="Apache Flink Blog: RSS feed" />
+
+    <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+    <!-- We need to load Jquery in the header for custom google analytics 
event tracking-->
+    <script src="/js/jquery.min.js"></script>
+
+    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media 
queries -->
+    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+    <!--[if lt IE 9]>
+      <script 
src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js";></script>
+      <script 
src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js";></script>
+    <![endif]-->
+  </head>
+  <body>  
+    
+
+    <!-- Main content. -->
+    <div class="container">
+    <div class="row">
+
+      
+     <div id="sidebar" class="col-sm-3">
+        
+
+<!-- Top navbar. -->
+    <nav class="navbar navbar-default">
+        <!-- The logo. -->
+        <div class="navbar-header">
+          <button type="button" class="navbar-toggle collapsed" 
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </button>
+          <div class="navbar-logo">
+            <a href="/">
+              <img alt="Apache Flink" src="/img/flink-header-logo.svg" 
width="147px" height="73px">
+            </a>
+          </div>
+        </div><!-- /.navbar-header -->
+
+        <!-- The navigation links. -->
+        <div class="collapse navbar-collapse" 
id="bs-example-navbar-collapse-1">
+          <ul class="nav navbar-nav navbar-main">
+
+            <!-- First menu section explains visitors what Flink is -->
+
+            <!-- What is Stream Processing? -->
+            <!--
+            <li><a href="/streamprocessing1.html">What is Stream 
Processing?</a></li>
+            -->
+
+            <!-- What is Flink? -->
+            <li><a href="/flink-architecture.html">What is Apache 
Flink?</a></li>
+
+            
+            <ul class="nav navbar-nav navbar-subnav">
+              <li >
+                  <a href="/flink-architecture.html">Architecture</a>
+              </li>
+              <li >
+                  <a href="/flink-applications.html">Applications</a>
+              </li>
+              <li >
+                  <a href="/flink-operations.html">Operations</a>
+              </li>
+            </ul>
+            
+
+            <!-- What is Stateful Functions? -->
+
+            <li><a href="/stateful-functions.html">What is Stateful 
Functions?</a></li>
+
+            <!-- Use cases -->
+            <li><a href="/usecases.html">Use Cases</a></li>
+
+            <!-- Powered by -->
+            <li><a href="/poweredby.html">Powered By</a></li>
+
+
+            &nbsp;
+            <!-- Second menu section aims to support Flink users -->
+
+            <!-- Downloads -->
+            <li><a href="/downloads.html">Downloads</a></li>
+
+            <!-- Getting Started -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" 
href="#">Getting Started<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.11/getting-started/index.html";
 target="_blank">With Flink <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.1/getting-started/project-setup.html";
 target="_blank">With Flink Stateful Functions <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a href="/training.html">Training Course</a></li>
+              </ul>
+            </li>
+
+            <!-- Documentation -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" 
href="#">Documentation<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-release-1.11"; 
target="_blank">Flink 1.11 (Latest stable release) <small><span 
class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-docs-master"; 
target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.1"; 
target="_blank">Flink Stateful Functions 2.1 (Latest stable release) 
<small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a 
href="https://ci.apache.org/projects/flink/flink-statefun-docs-master"; 
target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span 
class="glyphicon glyphicon-new-window"></span></small></a></li>
+              </ul>
+            </li>
+
+            <!-- getting help -->
+            <li><a href="/gettinghelp.html">Getting Help</a></li>
+
+            <!-- Blog -->
+            <li><a href="/blog/"><b>Flink Blog</b></a></li>
+
+
+            <!-- Flink-packages -->
+            <li>
+              <a href="https://flink-packages.org"; 
target="_blank">flink-packages.org <small><span class="glyphicon 
glyphicon-new-window"></span></small></a>
+            </li>
+            &nbsp;
+
+            <!-- Third menu section aim to support community and contributors 
-->
+
+            <!-- Community -->
+            <li><a href="/community.html">Community &amp; Project Info</a></li>
+
+            <!-- Roadmap -->
+            <li><a href="/roadmap.html">Roadmap</a></li>
+
+            <!-- Contribute -->
+            <li><a href="/contributing/how-to-contribute.html">How to 
Contribute</a></li>
+            
+
+            <!-- GitHub -->
+            <li>
+              <a href="https://github.com/apache/flink"; target="_blank">Flink 
on GitHub <small><span class="glyphicon 
glyphicon-new-window"></span></small></a>
+            </li>
+
+            &nbsp;
+
+            <!-- Language Switcher -->
+            <li>
+              
+                
+                  <a 
href="/zh/2020/07/28/pyflink-pandas-udf-support-flink.html">中文版</a>
+                
+              
+            </li>
+
+          </ul>
+
+          <ul class="nav navbar-nav navbar-bottom">
+          <hr />
+
+            <!-- Twitter -->
+            <li><a href="https://twitter.com/apacheflink"; 
target="_blank">@ApacheFlink <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <!-- Visualizer -->
+            <li class=" hidden-md hidden-sm"><a href="/visualizer/" 
target="_blank">Plan Visualizer <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+          <hr />
+
+            <li><a href="https://apache.org"; target="_blank">Apache Software 
Foundation <small><span class="glyphicon 
glyphicon-new-window"></span></small></a></li>
+
+            <li>
+              <style>
+                .smalllinks:link {
+                  display: inline-block !important; background: none; 
padding-top: 0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
+                }
+              </style>
+
+              <a class="smalllinks" href="https://www.apache.org/licenses/"; 
target="_blank">License</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" href="https://www.apache.org/security/"; 
target="_blank">Security</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" 
href="https://www.apache.org/foundation/sponsorship.html"; 
target="_blank">Donate</a> <small><span class="glyphicon 
glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" 
href="https://www.apache.org/foundation/thanks.html"; target="_blank">Thanks</a> 
<small><span class="glyphicon glyphicon-new-window"></span></small>
+            </li>
+
+          </ul>
+        </div><!-- /.navbar-collapse -->
+    </nav>
+
+      </div>
+      <div class="col-sm-9">
+      <div class="row-fluid">
+  <div class="col-sm-12">
+    <div class="row">
+      <h1>PyFlink: The integration of Pandas into PyFlink</h1>
+      <p><i></i></p>
+
+      <article>
+        <p>28 Jul 2020 Jincheng Sun (<a 
href="https://twitter.com/sunjincheng121";>@sunjincheng121</a>) &amp; Markos 
Sfikas (<a href="https://twitter.com/MarkSfik";>@MarkSfik</a>)</p>
+
+<p>Python has evolved into one of the most important programming languages for many fields of data processing. Its popularity has grown so much that it has effectively become the default data processing language for data scientists. On top of that, a plethora of Python-based data processing tools such as NumPy, Pandas, and Scikit-learn have gained additional popularity thanks to their flexibility and powerful functionality.</p>
+
+<center>
+<img src="/img/blog/2020-07-28-pyflink-pandas/python-scientific-stack.png" 
width="450px" alt="Python Scientific Stack" />
+</center>
+<center>
+  <a 
href="https://speakerdeck.com/jakevdp/the-unexpected-effectiveness-of-python-in-science?slide=52";>Pic
 source: VanderPlas 2017, slide 52.</a>
+</center>
+<p><br /></p>
+
+<p>To meet these user needs and demands, the Flink community hopes to leverage and make better use of these tools. To this end, the community has put great effort into integrating Pandas into PyFlink with the latest Flink version, 1.11. The added features include <strong>support for Pandas UDFs</strong> and <strong>conversion between Pandas DataFrame and Table</strong>. Pandas UDFs not only greatly improve the execution performance of Python UDFs, but al [...]
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+Currently, only Scalar Pandas UDFs are supported in PyFlink.</p>
+</div>
+
+<h1 id="pandas-udf-in-flink-111">Pandas UDF in Flink 1.11</h1>
+
+<p>Using scalar Python UDFs was already possible in Flink 1.10, as described in a <a href="https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html">previous article on the Flink blog</a>. Scalar Python UDFs work in three primary steps:</p>
+
+<ul>
+  <li>
+    <p>the Java operator serializes one input row to bytes and sends them to 
the Python worker;</p>
+  </li>
+  <li>
+    <p>the Python worker deserializes the input row and evaluates the Python 
UDF with it;</p>
+  </li>
+  <li>
+    <p>the resulting row is serialized and sent back to the Java operator.</p>
+  </li>
+</ul>
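<p>The three steps above can be sketched in plain Python. This is only an illustrative sketch of the per-row round trip, not Flink's actual wire protocol, and <code>pickle</code> stands in for Flink's serializers:</p>

```python
import pickle

# Hypothetical scalar UDF: evaluated one value at a time.
def to_fahrenheit(temperature):
    return temperature * 9 / 5 + 32

rows = [{"id": 1, "temperature": 100.0}, {"id": 2, "temperature": 0.0}]

results = []
for row in rows:
    # Step 1: the "Java operator" serializes one input row to bytes.
    payload = pickle.dumps(row)
    # Step 2: the "Python worker" deserializes the row and evaluates the UDF.
    decoded = pickle.loads(payload)
    result = to_fahrenheit(decoded["temperature"])
    # Step 3: the result is serialized and sent back to the Java operator.
    results.append(pickle.loads(pickle.dumps(result)))

print(results)  # [212.0, 32.0]
```

<p>Note that the serialization round trip happens for every single row, which is the overhead discussed next.</p>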
+
+<p>While support for Python UDFs in PyFlink greatly improved the user experience, it had some drawbacks, namely:</p>
+
+<ul>
+  <li>
+    <p>High serialization/deserialization overhead</p>
+  </li>
+  <li>
+    <p>Difficulty leveraging popular Python libraries used by data scientists, such as Pandas or NumPy, that provide high-performance data structures and functions.</p>
+  </li>
+</ul>
+
+<p>Pandas UDFs were introduced to address these drawbacks. With Pandas UDFs, a batch of rows is transferred between the JVM and the Python VM in a columnar format (the <a href="https://arrow.apache.org/docs/format/Columnar.html">Arrow memory format</a>). The batch of rows is converted into a collection of Pandas Series and handed to the Pandas UDF, which can then leverage popular Python libraries (such as Pandas and NumPy) for its implementation.</p>
+
+<center>
+<img src="/img/blog/2020-07-28-pyflink-pandas/vm-communication.png" 
width="550px" alt="VM Communication" />
+</center>
+
+<p>The performance of vectorized UDFs is usually much higher than that of normal Python UDFs, as the serialization/deserialization overhead is minimized by relying on <a href="https://arrow.apache.org/">Apache Arrow</a>, while handling <code>pandas.Series</code> as input/output allows us to take full advantage of the Pandas and NumPy libraries. This makes them a popular solution for parallelizing Machine Learning and other large-scale, distributed data science workloads (e.g. feature  [...]
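<p>The difference between row-at-a-time and vectorized evaluation can be seen with plain Pandas. This is an illustrative comparison, independent of Flink:</p>

```python
import pandas as pd

temps = pd.Series([98.0, None, 100.0])

# Row-at-a-time: a Python-level loop, one value per iteration,
# roughly how a normal Python UDF processes its input.
per_row = [t * 9 / 5 + 32 for t in temps]

# Vectorized: a single operation over the whole Series, executed in
# optimized NumPy code, which is what a Pandas UDF enables.
vectorized = temps * 9 / 5 + 32
```

<p>Both compute the same values (missing entries propagate as NaN in either case), but the vectorized form avoids per-element Python interpreter overhead.</p>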
+
+<h1 id="conversion-between-pyflink-table-and-pandas-dataframe">Conversion 
between PyFlink Table and Pandas DataFrame</h1>
+
+<p>Pandas DataFrame is the de-facto standard for working with tabular data in the Python community, while PyFlink Table is Flink’s representation of tabular data in Python. Enabling conversion between PyFlink Table and Pandas DataFrame allows switching between PyFlink and Pandas seamlessly when processing data in Python. Users can process data with one execution engine and switch to a different one effortlessly. For example, in case users already have a Pandas DataFrame at [...]
+
+<h1 id="examples">Examples</h1>
+
+<p>Using Python in Apache Flink requires installing PyFlink, which is available on <a href="https://pypi.org/project/apache-flink/">PyPI</a> and can be easily installed using <code>pip</code>. Before installing PyFlink, check which version of Python is running on your system:</p>
+
+<div class="highlight"><pre><code class="language-bash"><span class="nv">$ 
</span>python --version
+Python 3.7.6</code></pre></div>
+
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span 
class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+Please note that Python 3.5 or higher is required to install and run PyFlink.</p>
+</div>
+
+<div class="highlight"><pre><code class="language-bash"><span class="nv">$ 
</span>python -m pip install apache-flink</code></pre></div>
+
+<h2 id="using-pandas-udf">Using Pandas UDF</h2>
+
+<p>Pandas UDFs take <code>pandas.Series</code> as input and return a <code>pandas.Series</code> of the same length as output. Pandas UDFs can be used in exactly the same places where non-Pandas functions are currently used. To mark a UDF as a Pandas UDF, you only need to add an extra parameter <code>udf_type='pandas'</code> in the <code>udf</code> decorator:</p>
+
+<div class="highlight"><pre><code class="language-python"><span 
class="nd">@udf</span><span class="p">(</span><span 
class="n">input_types</span><span class="o">=</span><span 
class="p">[</span><span class="n">DataTypes</span><span class="o">.</span><span 
class="n">STRING</span><span class="p">(),</span> <span 
class="n">DataTypes</span><span class="o">.</span><span 
class="n">FLOAT</span><span class="p">()],</span>
+     <span class="n">result_type</span><span class="o">=</span><span 
class="n">DataTypes</span><span class="o">.</span><span 
class="n">FLOAT</span><span class="p">(),</span> <span 
class="n">udf_type</span><span class="o">=</span><span 
class="s">&#39;pandas&#39;</span><span class="p">)</span>
+<span class="k">def</span> <span class="nf">interpolate</span><span 
class="p">(</span><span class="nb">id</span><span class="p">,</span> <span 
class="n">temperature</span><span class="p">):</span>
+    <span class="c"># takes id: pandas.Series and temperature: pandas.Series 
as input</span>
+    <span class="n">df</span> <span class="o">=</span> <span 
class="n">pd</span><span class="o">.</span><span 
class="n">DataFrame</span><span class="p">({</span><span 
class="s">&#39;id&#39;</span><span class="p">:</span> <span 
class="nb">id</span><span class="p">,</span> <span 
class="s">&#39;temperature&#39;</span><span class="p">:</span> <span 
class="n">temperature</span><span class="p">})</span>
+
+    <span class="c"># use interpolate() to interpolate the missing 
temperature</span>
+    <span class="n">interpolated_df</span> <span class="o">=</span> <span 
class="n">df</span><span class="o">.</span><span class="n">groupby</span><span 
class="p">(</span><span class="s">&#39;id&#39;</span><span 
class="p">)</span><span class="o">.</span><span class="n">apply</span><span 
class="p">(</span>
+        <span class="k">lambda</span> <span class="n">group</span><span 
class="p">:</span> <span class="n">group</span><span class="o">.</span><span 
class="n">interpolate</span><span class="p">(</span><span 
class="n">limit_direction</span><span class="o">=</span><span 
class="s">&#39;both&#39;</span><span class="p">))</span>
+
+    <span class="c"># output temperature: pandas.Series</span>
+    <span class="k">return</span> <span class="n">interpolated_df</span><span 
class="p">[</span><span class="s">&#39;temperature&#39;</span><span 
class="p">]</span></code></pre></div>
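<p>The body of this UDF can be tried outside Flink with plain Pandas. A minimal sketch, using sample data with one missing temperature reading:</p>

```python
import pandas as pd

# Columns as the Pandas UDF would receive them: whole pandas.Series,
# not individual rows (illustrative sample data).
id_col = pd.Series([1, 1, 1, 2])
temperature = pd.Series([98.0, None, 100.0, 99.0])

# Same logic as the UDF body: interpolate missing values per equipment id.
df = pd.DataFrame({"id": id_col, "temperature": temperature})
interpolated_df = df.groupby("id").apply(
    lambda group: group.interpolate(limit_direction="both"))

print(interpolated_df["temperature"].tolist())  # [98.0, 99.0, 100.0, 99.0]
```

<p>The missing reading for equipment 1 is filled in linearly between its neighbors.</p>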
+
+<p>The Pandas UDF above uses the Pandas <code>dataframe.interpolate()</code> function to interpolate the missing temperature data for each equipment id. This is a common IoT scenario whereby each equipment/device reports its id and temperature for analysis, but the temperature field may be null for various reasons.
+You can register and use the function in the same way as a <a href="https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html">normal Python UDF</a>. Below is a complete example of how to use the Pandas UDF in PyFlink.</p>
+
+<div class="highlight"><pre><code class="language-python"><span 
class="kn">from</span> <span class="nn">pyflink.datastream</span> <span 
class="kn">import</span> <span class="n">StreamExecutionEnvironment</span>
+<span class="kn">from</span> <span class="nn">pyflink.table</span> <span 
class="kn">import</span> <span class="n">StreamTableEnvironment</span><span 
class="p">,</span> <span class="n">DataTypes</span>
+<span class="kn">from</span> <span class="nn">pyflink.table.udf</span> <span 
class="kn">import</span> <span class="n">udf</span>
+<span class="kn">import</span> <span class="nn">pandas</span> <span 
class="kn">as</span> <span class="nn">pd</span>
+
+<span class="n">env</span> <span class="o">=</span> <span 
class="n">StreamExecutionEnvironment</span><span class="o">.</span><span 
class="n">get_execution_environment</span><span class="p">()</span>
+<span class="n">env</span><span class="o">.</span><span 
class="n">set_parallelism</span><span class="p">(</span><span 
class="mi">1</span><span class="p">)</span>
+<span class="n">t_env</span> <span class="o">=</span> <span 
class="n">StreamTableEnvironment</span><span class="o">.</span><span 
class="n">create</span><span class="p">(</span><span class="n">env</span><span 
class="p">)</span>
+<span class="n">t_env</span><span class="o">.</span><span 
class="n">get_config</span><span class="p">()</span><span 
class="o">.</span><span class="n">get_configuration</span><span 
class="p">()</span><span class="o">.</span><span 
class="n">set_boolean</span><span class="p">(</span><span 
class="s">&quot;python.fn-execution.memory.managed&quot;</span><span 
class="p">,</span> <span class="bp">True</span><span class="p">)</span>
+
+<span class="nd">@udf</span><span class="p">(</span><span 
class="n">input_types</span><span class="o">=</span><span 
class="p">[</span><span class="n">DataTypes</span><span class="o">.</span><span 
class="n">STRING</span><span class="p">(),</span> <span 
class="n">DataTypes</span><span class="o">.</span><span 
class="n">FLOAT</span><span class="p">()],</span>
+     <span class="n">result_type</span><span class="o">=</span><span 
class="n">DataTypes</span><span class="o">.</span><span 
class="n">FLOAT</span><span class="p">(),</span> <span 
class="n">udf_type</span><span class="o">=</span><span 
class="s">&#39;pandas&#39;</span><span class="p">)</span>
+<span class="k">def</span> <span class="nf">interpolate</span><span 
class="p">(</span><span class="nb">id</span><span class="p">,</span> <span 
class="n">temperature</span><span class="p">):</span>
+    <span class="c"># takes id: pandas.Series and temperature: pandas.Series 
as input</span>
+    <span class="n">df</span> <span class="o">=</span> <span 
class="n">pd</span><span class="o">.</span><span 
class="n">DataFrame</span><span class="p">({</span><span 
class="s">&#39;id&#39;</span><span class="p">:</span> <span 
class="nb">id</span><span class="p">,</span> <span 
class="s">&#39;temperature&#39;</span><span class="p">:</span> <span 
class="n">temperature</span><span class="p">})</span>
+
+    <span class="c"># use interpolate() to interpolate the missing 
temperature</span>
+    <span class="n">interpolated_df</span> <span class="o">=</span> <span 
class="n">df</span><span class="o">.</span><span class="n">groupby</span><span 
class="p">(</span><span class="s">&#39;id&#39;</span><span 
class="p">)</span><span class="o">.</span><span class="n">apply</span><span 
class="p">(</span>
+        <span class="k">lambda</span> <span class="n">group</span><span 
class="p">:</span> <span class="n">group</span><span class="o">.</span><span 
class="n">interpolate</span><span class="p">(</span><span 
class="n">limit_direction</span><span class="o">=</span><span 
class="s">&#39;both&#39;</span><span class="p">))</span>
+
+    <span class="c"># output temperature: pandas.Series</span>
+    <span class="k">return</span> <span class="n">interpolated_df</span><span 
class="p">[</span><span class="s">&#39;temperature&#39;</span><span 
class="p">]</span>
+
+<span class="n">t_env</span><span class="o">.</span><span 
class="n">register_function</span><span class="p">(</span><span 
class="s">&quot;interpolate&quot;</span><span class="p">,</span> <span 
class="n">interpolate</span><span class="p">)</span>
+
+<span class="n">my_source_ddl</span> <span class="o">=</span> <span 
class="s">&quot;&quot;&quot;</span>
+<span class="s">    create table mySource (</span>
+<span class="s">        id INT,</span>
+<span class="s">        temperature FLOAT </span>
+<span class="s">    ) with (</span>
+<span class="s">        &#39;connector.type&#39; = &#39;filesystem&#39;,</span>
+<span class="s">        &#39;format.type&#39; = &#39;csv&#39;,</span>
+<span class="s">        &#39;connector.path&#39; = &#39;/tmp/input&#39;</span>
+<span class="s">    )</span>
+<span class="s">&quot;&quot;&quot;</span>
+
+<span class="n">my_sink_ddl</span> <span class="o">=</span> <span 
class="s">&quot;&quot;&quot;</span>
+<span class="s">    create table mySink (</span>
+<span class="s">        id INT,</span>
+<span class="s">        temperature FLOAT </span>
+<span class="s">    ) with (</span>
+<span class="s">        &#39;connector.type&#39; = &#39;filesystem&#39;,</span>
+<span class="s">        &#39;format.type&#39; = &#39;csv&#39;,</span>
+<span class="s">        &#39;connector.path&#39; = &#39;/tmp/output&#39;</span>
+<span class="s">    )</span>
+<span class="s">&quot;&quot;&quot;</span>
+
+<span class="n">t_env</span><span class="o">.</span><span 
class="n">execute_sql</span><span class="p">(</span><span 
class="n">my_source_ddl</span><span class="p">)</span>
+<span class="n">t_env</span><span class="o">.</span><span 
class="n">execute_sql</span><span class="p">(</span><span 
class="n">my_sink_ddl</span><span class="p">)</span>
+
+<span class="n">t_env</span><span class="o">.</span><span 
class="n">from_path</span><span class="p">(</span><span 
class="s">&#39;mySource&#39;</span><span class="p">)</span>\
+    <span class="o">.</span><span class="n">select</span><span 
class="p">(</span><span class="s">&quot;id, interpolate(id, temperature) as 
temperature&quot;</span><span class="p">)</span> \
+    <span class="o">.</span><span class="n">insert_into</span><span 
class="p">(</span><span class="s">&#39;mySink&#39;</span><span 
class="p">)</span>
+
+<span class="n">t_env</span><span class="o">.</span><span 
class="n">execute</span><span class="p">(</span><span 
class="s">&quot;pandas_udf_demo&quot;</span><span 
class="p">)</span></code></pre></div>
+
+<p>To submit the job:</p>
+
+<ul>
+  <li>First, prepare the input data in the <code>/tmp/input</code> file. For example:</li>
+</ul>
+
+<div class="highlight"><pre><code class="language-bash"><span class="nv">$ 
</span><span class="nb">echo</span> -e  <span 
class="s2">&quot;1,98.0\n1,\n1,100.0\n2,99.0&quot;</span> &gt; 
/tmp/input</code></pre></div>
+
+<ul>
+  <li>Next, run the example from the command line:</li>
+</ul>
+
+<div class="highlight"><pre><code class="language-bash"><span class="nv">$ 
</span>python pandas_udf_demo.py</code></pre></div>
+
+<p>The command builds and runs the Python Table API program in a local mini-cluster. You can also submit the Python Table API program to a remote cluster; see the <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/cli.html#job-submission-examples">job submission examples</a> for more details.</p>
+
+<ul>
+  <li>Finally, check the execution result on the command line. As shown below, all missing temperature values have been interpolated:</li>
+</ul>
+
+<div class="highlight"><pre><code class="language-bash"><span class="nv">$ 
</span> cat /tmp/output
+1,98.0
+1,99.0
+1,100.0
+2,99.0</code></pre></div>
+
+<h2 id="conversion-between-pyflink-table-and-pandas-dataframe-1">Conversion 
between PyFlink Table and Pandas DataFrame</h2>
+
+<p>You can use the <code>from_pandas()</code> method to create a PyFlink Table 
from a Pandas DataFrame or use the <code>to_pandas()</code> method to convert a 
PyFlink Table to a Pandas DataFrame.</p>
+
+<div class="highlight"><pre><code class="language-python"><span 
class="kn">from</span> <span class="nn">pyflink.datastream</span> <span 
class="kn">import</span> <span class="n">StreamExecutionEnvironment</span>
+<span class="kn">from</span> <span class="nn">pyflink.table</span> <span 
class="kn">import</span> <span class="n">StreamTableEnvironment</span>
+<span class="kn">import</span> <span class="nn">pandas</span> <span 
class="kn">as</span> <span class="nn">pd</span>
+<span class="kn">import</span> <span class="nn">numpy</span> <span 
class="kn">as</span> <span class="nn">np</span>
+
+<span class="n">env</span> <span class="o">=</span> <span 
class="n">StreamExecutionEnvironment</span><span class="o">.</span><span 
class="n">get_execution_environment</span><span class="p">()</span>
+<span class="n">t_env</span> <span class="o">=</span> <span 
class="n">StreamTableEnvironment</span><span class="o">.</span><span 
class="n">create</span><span class="p">(</span><span class="n">env</span><span 
class="p">)</span>
+
+<span class="c"># Create a PyFlink Table</span>
+<span class="n">pdf</span> <span class="o">=</span> <span 
class="n">pd</span><span class="o">.</span><span 
class="n">DataFrame</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1000</span><span class="p">,</span> <span class="mi">2</span><span 
class="p">))</span>
+<span class="n">table</span> <span class="o">=</span> <span 
class="n">t_env</span><span class="o">.</span><span 
class="n">from_pandas</span><span class="p">(</span><span 
class="n">pdf</span><span class="p">,</span> <span class="p">[</span><span 
class="s">&quot;a&quot;</span><span class="p">,</span> <span 
class="s">&quot;b&quot;</span><span class="p">])</span><span 
class="o">.</span><span class="n">filter</span><span class="p">(</span><span 
class="s">&quot;a &gt; 0.5&quot;</span><span cla [...]
+
+<span class="c"># Convert the PyFlink Table to a Pandas DataFrame</span>
+<span class="n">pdf</span> <span class="o">=</span> <span 
class="n">table</span><span class="o">.</span><span 
class="n">to_pandas</span><span class="p">()</span>
+<span class="k">print</span><span class="p">(</span><span 
class="n">pdf</span><span class="p">)</span></code></pre></div>
+
+<h1 id="conclusion--upcoming-work">Conclusion &amp; Upcoming work</h1>
+
+<p>In this article, we introduced the integration of Pandas into Flink 1.11, including Pandas UDFs and the conversion between Table and Pandas DataFrame. In fact, the latest Apache Flink release adds many excellent features to PyFlink, such as support for user-defined Table functions and user-defined metrics for Python UDFs. What’s more, from Flink 1.11 you can build PyFlink with Cython support and “Cythonize” your Python UDFs to substantially improve code execution speed (up to 30x fas [...]
+
+<p>Future work by the community will focus on adding more features and bringing additional optimizations in follow-up releases. Such optimizations and additions include a Python DataStream API and more integration with the Python ecosystem, such as support for distributed Pandas in Flink. Stay tuned for more information and updates in the upcoming releases!</p>
+
+<center>
+<img src="/img/blog/2020-07-28-pyflink-pandas/mission-of-pyFlink.gif" 
width="600px" alt="Mission of PyFlink" />
+</center>
+
+      </article>
+    </div>
+
+    <div class="row">
+      <div id="disqus_thread"></div>
+      <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE 
* * */
+        var disqus_shortname = 'stratosphere-eu'; // required: replace example 
with your forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 
'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+             (document.getElementsByTagName('head')[0] || 
document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+      </script>
+    </div>
+  </div>
+</div>
+      </div>
+    </div>
+
+    <hr />
+
+    <div class="row">
+      <div class="footer text-center col-sm-12">
+        <p>Copyright © 2014-2019 <a href="http://apache.org";>The Apache 
Software Foundation</a>. All Rights Reserved.</p>
+        <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache 
feather logo are either registered trademarks or trademarks of The Apache 
Software Foundation.</p>
+        <p><a href="/privacy-policy.html">Privacy Policy</a> &middot; <a 
href="/blog/feed.xml">RSS feed</a></p>
+      </div>
+    </div>
+    </div><!-- /.container -->
+
+    <!-- Include all compiled plugins (below), or include individual files as 
needed -->
+    <script 
src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js";></script>
+    <script 
src="https://cdnjs.cloudflare.com/ajax/libs/jquery.matchHeight/0.7.0/jquery.matchHeight-min.js";></script>
+    <script src="/js/codetabs.js"></script>
+    <script src="/js/stickysidebar.js"></script>
+
+    <!-- Google Analytics -->
+    <script>
+      
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+      
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+      
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+      ga('create', 'UA-52545728-1', 'auto');
+      ga('send', 'pageview');
+    </script>
+  </body>
+</html>
diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 3bf09d8..c794bc5 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -672,6 +672,215 @@ face of ever-increasing data volumes.&lt;/p&gt;
 </item>
 
 <item>
+<title>PyFlink: The integration of Pandas into PyFlink</title>
+<description>&lt;p&gt;Python has evolved into one of the most important 
programming languages for many fields of data processing. Python’s popularity 
has grown so much that it has pretty much become the default data processing 
language for data scientists. On top of that, there is a plethora of 
Python-based data processing tools such as NumPy, Pandas, and Scikit-learn 
that have gained additional popularity due to their flexibility and powerful 
functionality.&lt;/p&gt;
+
+&lt;center&gt;
+&lt;img 
src=&quot;/img/blog/2020-07-28-pyflink-pandas/python-scientific-stack.png&quot; 
width=&quot;450px&quot; alt=&quot;Python Scientific Stack&quot; /&gt;
+&lt;/center&gt;
+&lt;center&gt;
+  &lt;a 
href=&quot;https://speakerdeck.com/jakevdp/the-unexpected-effectiveness-of-python-in-science?slide=52&quot;&gt;Pic
 source: VanderPlas 2017, slide 52.&lt;/a&gt;
+&lt;/center&gt;
+&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
+
+&lt;p&gt;In an effort to meet user needs and demands, the Flink community 
hopes to leverage and make better use of these tools. Along this direction, 
the Flink community put great effort into integrating Pandas into PyFlink 
with the latest Flink version 1.11. Some of the added features include 
&lt;strong&gt;support for Pandas UDF&lt;/strong&gt; and the 
&lt;strong&gt;conversion between Pandas DataFrame and Table&lt;/strong&gt;. 
Pandas UDFs not only greatly improve the execution per [...]
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+Currently, only Scalar Pandas UDFs are supported in PyFlink.&lt;/p&gt;
+&lt;/div&gt;
+
+&lt;h1 id=&quot;pandas-udf-in-flink-111&quot;&gt;Pandas UDF in Flink 
1.11&lt;/h1&gt;
+
+&lt;p&gt;Using scalar Python UDF was already possible in Flink 1.10 as 
described in a &lt;a 
href=&quot;https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html&quot;&gt;previous
 article on the Flink blog&lt;/a&gt;. Scalar Python UDFs work based on three 
primary steps:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;
+    &lt;p&gt;the Java operator serializes one input row to bytes and sends 
them to the Python worker;&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;the Python worker deserializes the input row and evaluates the 
Python UDF with it;&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;the resulting row is serialized and sent back to the Java 
operator.&lt;/p&gt;
+  &lt;/li&gt;
+&lt;/ul&gt;
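The three steps above can be sketched in plain Python. This is a simplification for illustration only: `pickle` and the hypothetical `python_worker` helper stand in for Flink's actual serializers and worker protocol.

```python
import pickle

def python_worker(serialized_row, python_udf):
    # The worker side of the protocol sketched above:
    row = pickle.loads(serialized_row)    # deserialize the input row
    result = python_udf(*row)             # evaluate the Python UDF on it
    return pickle.dumps((result,))        # serialize the result row back

# The "Java operator" side, simulated: one full round trip per row.
out = python_worker(pickle.dumps((3, 4)), lambda a, b: a + b)
print(pickle.loads(out))  # (7,)
```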
+
+&lt;p&gt;While support for Python UDFs in PyFlink greatly improved the user 
experience, it had some drawbacks, namely:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;
+    &lt;p&gt;High serialization/deserialization overhead&lt;/p&gt;
+  &lt;/li&gt;
+  &lt;li&gt;
+    &lt;p&gt;Difficulty in leveraging popular Python libraries used by data 
scientists — such as Pandas or NumPy — that provide high-performance data 
structures and functions.&lt;/p&gt;
+  &lt;/li&gt;
+&lt;/ul&gt;
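The first drawback can be made concrete with a rough, stdlib-only measurement. `pickle` is only a stand-in for Flink's serializer here, so the absolute numbers are meaningless; the ratio is what illustrates the per-row overhead.

```python
import pickle
import time

rows = [(i, float(i)) for i in range(100_000)]

# One serialize/deserialize round trip per row, as with scalar Python UDFs.
start = time.perf_counter()
for row in rows:
    pickle.loads(pickle.dumps(row))
per_row = time.perf_counter() - start

# One round trip for the whole batch, as with the columnar transfer
# used by Pandas UDFs.
start = time.perf_counter()
pickle.loads(pickle.dumps(rows))
batched = time.perf_counter() - start

print(f"per-row: {per_row:.4f}s, batched: {batched:.4f}s")
```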
+
+&lt;p&gt;Pandas UDFs were introduced to address these drawbacks. For a 
Pandas UDF, a batch of rows is transferred between the JVM and the PVM in a 
columnar format (the &lt;a 
href=&quot;https://arrow.apache.org/docs/format/Columnar.html&quot;&gt;Arrow 
memory format&lt;/a&gt;). The batch of rows is converted into a collection of 
Pandas Series and handed to the Pandas UDF, which can then leverage popular 
Python libraries (such as Pandas or NumPy) for the Python UDF 
implementatio [...]
+
+&lt;center&gt;
+&lt;img 
src=&quot;/img/blog/2020-07-28-pyflink-pandas/vm-communication.png&quot; 
width=&quot;550px&quot; alt=&quot;VM Communication&quot; /&gt;
+&lt;/center&gt;
+
+&lt;p&gt;The performance of vectorized UDFs is usually much higher than that 
of normal Python UDFs, as the serialization/deserialization overhead is 
minimized by building on &lt;a 
href=&quot;https://arrow.apache.org/&quot;&gt;Apache Arrow&lt;/a&gt;, while 
handling &lt;code&gt;pandas.Series&lt;/code&gt; as input/output allows us to 
take full advantage of the Pandas and NumPy libraries, making it a popular 
solution to parallelize Machine Learning and other large-scale, distribut [...]
+
+&lt;h1 
id=&quot;conversion-between-pyflink-table-and-pandas-dataframe&quot;&gt;Conversion
 between PyFlink Table and Pandas DataFrame&lt;/h1&gt;
+
+&lt;p&gt;Pandas DataFrame is the de-facto standard for working with tabular 
data in the Python community, while PyFlink Table is Flink’s representation 
of tabular data in Python. Enabling the conversion between PyFlink Table and 
Pandas DataFrame allows switching between PyFlink and Pandas seamlessly when 
processing data in Python. Users can process data with one execution 
engine and switch to a different one effortlessly. For example, in case users 
already have a Pandas DataFr [...]
+
+&lt;h1 id=&quot;examples&quot;&gt;Examples&lt;/h1&gt;
+
+&lt;p&gt;Using Python in Apache Flink requires installing PyFlink, which is 
available on &lt;a 
href=&quot;https://pypi.org/project/apache-flink/&quot;&gt;PyPI&lt;/a&gt; and 
can be easily installed using &lt;code&gt;pip&lt;/code&gt;. Before installing 
PyFlink, check the version of Python available on your 
system:&lt;/p&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ 
&lt;/span&gt;python --version
+Python 3.7.6&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: 
inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+Please note that Python 3.5 or higher is required to install and run 
PyFlink.&lt;/p&gt;
+&lt;/div&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ 
&lt;/span&gt;python -m pip install 
apache-flink&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;h2 id=&quot;using-pandas-udf&quot;&gt;Using Pandas UDF&lt;/h2&gt;
+
+&lt;p&gt;Pandas UDFs take &lt;code&gt;pandas.Series&lt;/code&gt; as input 
and return a &lt;code&gt;pandas.Series&lt;/code&gt; of the same length as 
output. Pandas UDFs can be used in exactly the same places where non-Pandas 
functions are currently utilized. To mark a UDF as a Pandas UDF, you only 
need to add an extra parameter 
&lt;code&gt;udf_type=&amp;#39;pandas&amp;#39;&lt;/code&gt; in the 
&lt;code&gt;udf&lt;/code&gt; decorator:&lt;/p&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-python&quot;&gt;&lt;span 
class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;STRING&lt;/span&gt;& [...]
+     &lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;FLOAT&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;udf_type&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;pandas&amp;#39;&lt;/span&gt;&lt;span class=&qu 
[...]
+&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;interpolate&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;temperature&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;):&lt;/span&gt;
+    &lt;span class=&quot;c&quot;&gt;# takes id: pandas.Series and temperature: 
pandas.Series as input&lt;/span&gt;
+    &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span 
class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/s 
[...]
+
+    &lt;span class=&quot;c&quot;&gt;# use interpolate() to interpolate the 
missing temperature&lt;/span&gt;
+    &lt;span class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;groupby&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt; [...]
+        &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;interpolate&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;limit_direction&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt [...]
+
+    &lt;span class=&quot;c&quot;&gt;# output temperature: 
pandas.Series&lt;/span&gt;
+    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;temperature&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;p&gt;The Pandas UDF above uses the Pandas 
&lt;code&gt;dataframe.interpolate()&lt;/code&gt; function to interpolate the 
missing temperature data for each equipment id. This is a common IoT scenario 
whereby each equipment/device reports its id and temperature to be analyzed, 
but the temperature field may be null for various reasons.
+Once defined, the function can be registered and used in the same way as a 
&lt;a 
href=&quot;https://flink.apache.org/2020/04/09/pyflink-udf-support-flink.html&quot;&gt;normal
 Python UDF&lt;/a&gt;. Below is a complete example of how to use the Pandas UDF 
in PyFlink.&lt;/p&gt;
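The core of the UDF is plain Pandas and can be tried standalone: a missing reading between 98.0 and 100.0 is filled linearly between its neighbours.

```python
import pandas as pd

# The same interpolation the UDF performs, on one device's readings.
readings = pd.Series([98.0, None, 100.0])
filled = readings.interpolate(limit_direction='both')
print(filled.tolist())  # [98.0, 99.0, 100.0]
```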
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-python&quot;&gt;&lt;span 
class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pyflink.datastream&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;
+&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pyflink.table&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;
+&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pyflink.table.udf&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;udf&lt;/span&gt;
+&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pandas&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pd&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;get_execution_environment&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;()&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;set_parallelism&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;get_config&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;get_configuration&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;set_boolean&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt; [...]
+
+&lt;span class=&quot;nd&quot;&gt;@udf&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;input_types&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;STRING&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;DataTypes&lt;/sp [...]
+     &lt;span class=&quot;n&quot;&gt;result_type&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataTypes&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;FLOAT&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;udf_type&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;pandas&amp;#39;&lt;/span&gt;&lt;span class=&qu 
[...]
+&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;interpolate&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;temperature&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;):&lt;/span&gt;
+    &lt;span class=&quot;c&quot;&gt;# takes id: pandas.Series and temperature: 
pandas.Series as input&lt;/span&gt;
+    &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span 
class=&quot;nb&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/s 
[...]
+
+    &lt;span class=&quot;c&quot;&gt;# use interpolate() to interpolate the 
missing temperature&lt;/span&gt;
+    &lt;span class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;groupby&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;id&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt; [...]
+        &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;group&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;interpolate&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;limit_direction&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt [...]
+
+    &lt;span class=&quot;c&quot;&gt;# output temperature: 
pandas.Series&lt;/span&gt;
+    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;interpolated_df&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;temperature&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;]&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;register_function&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;interpolate&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;interpolate&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;my_source_ddl&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;    create table mySource (&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        id INT,&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        temperature FLOAT &lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;    ) with (&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        &amp;#39;connector.type&amp;#39; = 
&amp;#39;filesystem&amp;#39;,&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        &amp;#39;format.type&amp;#39; = 
&amp;#39;csv&amp;#39;,&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        &amp;#39;connector.path&amp;#39; = 
&amp;#39;/tmp/input&amp;#39;&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;    )&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;my_sink_ddl&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;    create table mySink (&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        id INT,&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        temperature FLOAT &lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;    ) with (&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        &amp;#39;connector.type&amp;#39; = 
&amp;#39;filesystem&amp;#39;,&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        &amp;#39;format.type&amp;#39; = 
&amp;#39;csv&amp;#39;,&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;        &amp;#39;connector.path&amp;#39; = 
&amp;#39;/tmp/output&amp;#39;&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;    )&lt;/span&gt;
+&lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;execute_sql&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;my_source_ddl&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;execute_sql&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;my_sink_ddl&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;from_path&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;mySource&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;\
+    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;id, interpolate(id, temperature) as 
temperature&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt; \
+    &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;insert_into&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;#39;mySink&amp;#39;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;pandas_udf_demo&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;p&gt;To submit the job:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;First, you need to prepare the input data in the 
&lt;code&gt;/tmp/input&lt;/code&gt; file. For example:&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ 
&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; -e  &lt;span 
class=&quot;s2&quot;&gt;&amp;quot;1,98.0\n1,\n1,100.0\n2,99.0&amp;quot;&lt;/span&gt;
 &amp;gt; /tmp/input&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;Next, you can run this example on the command line:&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ 
&lt;/span&gt;python pandas_udf_demo.py&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;p&gt;The command builds and runs the Python Table API program in a local 
mini-cluster. You can also submit the Python Table API program to a remote 
cluster using different command-line options; see more details &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/cli.html#job-submission-examples&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;Finally, you can see the execution result on the command line. As 
you can see, all the missing temperature values have been 
interpolated:&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ 
&lt;/span&gt;cat /tmp/output
+1,98.0
+1,99.0
+1,100.0
+2,99.0&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;h2 
id=&quot;conversion-between-pyflink-table-and-pandas-dataframe-1&quot;&gt;Conversion
 between PyFlink Table and Pandas DataFrame&lt;/h2&gt;
+
+&lt;p&gt;You can use the &lt;code&gt;from_pandas()&lt;/code&gt; method to 
create a PyFlink Table from a Pandas DataFrame or use the 
&lt;code&gt;to_pandas()&lt;/code&gt; method to convert a PyFlink Table to a 
Pandas DataFrame.&lt;/p&gt;
+
+&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-python&quot;&gt;&lt;span 
class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pyflink.datastream&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;
+&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pyflink.table&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;
+&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pandas&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;pd&lt;/span&gt;
+&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;numpy&lt;/span&gt; &lt;span 
class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span 
class=&quot;nn&quot;&gt;np&lt;/span&gt;
+
+&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;get_execution_environment&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;()&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;t_env&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamTableEnvironment&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;create&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;
+
+&lt;span class=&quot;c&quot;&gt;# Create a PyFlink Table&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;pdf&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;pd&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;random&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span clas [...]
+&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;t_env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;from_pandas&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;pdf&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;a&amp;quot;&l [...]
+
+&lt;span class=&quot;c&quot;&gt;# Convert the PyFlink Table to a Pandas 
DataFrame&lt;/span&gt;
+&lt;span class=&quot;n&quot;&gt;pdf&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;to_pandas&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;()&lt;/span&gt;
+&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;pdf&lt;/span&gt;&lt;span 
class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
+
+&lt;h1 id=&quot;conclusion--upcoming-work&quot;&gt;Conclusion &amp;amp; 
Upcoming work&lt;/h1&gt;
+
+&lt;p&gt;In this article, we introduced the integration of Pandas in Flink 
1.11, including Pandas UDFs and the conversion between Table and Pandas. In 
fact, the latest Apache Flink release adds many excellent features to 
PyFlink, such as support for user-defined table functions and user-defined 
metrics for Python UDFs. What’s more, from Flink 1.11, you can build PyFlink 
with Cython support and “Cythonize” your Python UDFs to 
substantially improve code execution speed (up to 3 [...]
+
+&lt;p&gt;Future work by the community will focus on adding more features and 
bringing additional optimizations in follow-up releases. Such optimizations 
and additions include a Python DataStream API and deeper integration with the 
Python ecosystem, such as support for distributed Pandas in Flink. Stay tuned 
for more information and updates in the upcoming releases!&lt;/p&gt;
+
+&lt;center&gt;
+&lt;img 
src=&quot;/img/blog/2020-07-28-pyflink-pandas/mission-of-pyFlink.gif&quot; 
width=&quot;600px&quot; alt=&quot;Mission of PyFlink&quot; /&gt;
+&lt;/center&gt;
+</description>
+<pubDate>Tue, 28 Jul 2020 14:00:00 +0200</pubDate>
+<link>https://flink.apache.org/2020/07/28/pyflink-pandas-udf-support-flink.html</link>
+<guid 
isPermaLink="true">/2020/07/28/pyflink-pandas-udf-support-flink.html</guid>
+</item>
+
+<item>
 <title>Flink SQL Demo: Building an End-to-End Streaming Application</title>
 <description>&lt;p&gt;Apache Flink 1.11 has released many exciting new 
features, including many developments in Flink SQL which is evolving at a fast 
pace. This article takes a closer look at how to quickly build streaming 
applications with Flink SQL from a practical point of view.&lt;/p&gt;
 
@@ -17072,152 +17281,5 @@ on the Flink mailing lists.&lt;/p&gt;
 <guid isPermaLink="true">/news/2015/12/18/a-year-in-review.html</guid>
 </item>
 
-<item>
-<title>Storm Compatibility in Apache Flink: How to run existing Storm 
topologies on Flink</title>
-<description>&lt;p&gt;&lt;a 
href=&quot;https://storm.apache.org&quot;&gt;Apache Storm&lt;/a&gt; was one of 
the first distributed and scalable stream processing systems available in the 
open source space offering (near) real-time tuple-by-tuple processing semantics.
-Initially released by the developers at Backtype in 2011 under the Eclipse 
open-source license, it became popular very quickly.
-Only shortly afterwards, Twitter acquired Backtype.
-Since then, Storm has been growing in popularity, is used in production at 
many big companies, and is the de-facto industry standard for big data stream 
processing.
-In 2013, Storm entered the Apache incubator program, followed by its 
graduation to top-level in 2014.&lt;/p&gt;
-
-&lt;p&gt;Apache Flink is a stream processing engine that improves upon older 
technologies like Storm in several dimensions,
-including &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html&quot;&gt;strong
 consistency guarantees&lt;/a&gt; (“exactly once”),
-a higher level &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html&quot;&gt;DataStream
 API&lt;/a&gt;,
-support for &lt;a 
href=&quot;http://flink.apache.org/news/2015/12/04/Introducing-windows.html&quot;&gt;event
 time and a rich windowing system&lt;/a&gt;,
-as well as &lt;a 
href=&quot;https://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/&quot;&gt;superior
 throughput with competitive low latency&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;While Flink offers several technical benefits over Storm, an existing 
investment in a codebase of applications developed for Storm often makes it 
difficult to switch engines.
-For these reasons, as part of the Flink 0.10 release, Flink ships with a Storm 
compatibility package that allows users to:&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;Run &lt;strong&gt;unmodified&lt;/strong&gt; Storm topologies using 
Apache Flink benefiting from superior performance.&lt;/li&gt;
-  &lt;li&gt;&lt;strong&gt;Embed&lt;/strong&gt; Storm code (spouts and bolts) 
as operators inside Flink DataStream programs.&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;p&gt;Only minor code changes are required in order to submit the program 
to Flink instead of Storm.
-This minimizes the work for developers to run existing Storm topologies while 
leveraging Apache Flink’s fast and robust execution engine.&lt;/p&gt;
-
-&lt;p&gt;We note that the Storm compatibility package is continuously 
improving and does not cover the full spectrum of Storm’s API.
-However, it is powerful enough to cover many use cases.&lt;/p&gt;
-
-&lt;h2 id=&quot;executing-storm-topologies-with-flink&quot;&gt;Executing Storm 
topologies with Flink&lt;/h2&gt;
-
-&lt;center&gt;
-&lt;img src=&quot;/img/blog/flink-storm.png&quot; 
style=&quot;height:200px;margin:15px&quot; /&gt;
-&lt;/center&gt;
-
-&lt;p&gt;The easiest way to use the Storm compatibility package is by 
executing a whole Storm topology in Flink.
-For this, you only need to replace the dependency 
&lt;code&gt;storm-core&lt;/code&gt; by &lt;code&gt;flink-storm&lt;/code&gt; in 
your Storm project and &lt;strong&gt;change two lines of code&lt;/strong&gt; in 
your original Storm program.&lt;/p&gt;
-
-&lt;p&gt;The following example shows a simple Storm word count program that 
can be executed in Flink.
-First, the program is assembled the Storm way, without any code changes to 
Spouts, Bolts, or the topology itself.&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// assemble 
topology, the Storm way&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;TopologyBuilder&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;builder&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;TopologyBuilder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;setSpout&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;source&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;StormFileSpout&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot; [...]
-&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;setBolt&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;tokenizer&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;StormBoltTokenizer&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;())&lt;/span&gt;
-       &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;shuffleGrouping&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;source&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;setBolt&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;counter&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;StormBoltCounter&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;())&lt;/span&gt;
-       &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;fieldsGrouping&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;tokenizer&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;Fields&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;word&amp;quot;&lt;/span [...]
-&lt;span class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;setBolt&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;sink&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;StormBoltFileSink&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot; [...]
-       &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;shuffleGrouping&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;counter&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;In order to execute the topology, we need to translate it to a 
&lt;code&gt;FlinkTopology&lt;/code&gt; and submit it to a local or remote Flink 
cluster, very similar to submitting the application to a Storm 
cluster.&lt;sup&gt;&lt;a href=&quot;#fn1&quot; 
id=&quot;ref1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// 
transform Storm topology to Flink program&lt;/span&gt;
-&lt;span class=&quot;c1&quot;&gt;// replaces: StormTopology topology = 
builder.createTopology();&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;FlinkTopology&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;topology&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;FlinkTopology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;createTopology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;
-
-&lt;span class=&quot;n&quot;&gt;Config&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;conf&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;Config&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;
-&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;runLocal&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
-       &lt;span class=&quot;c1&quot;&gt;// use FlinkLocalCluster instead of 
LocalCluster&lt;/span&gt;
-       &lt;span class=&quot;n&quot;&gt;FlinkLocalCluster&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;cluster&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;FlinkLocalCluster&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getLocalCluster&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;
-       &lt;span class=&quot;n&quot;&gt;cluster&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;submitTopology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;WordCount&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;conf&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;topology&lt;/span&gt;&lt;span class=&q [...]
-&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span 
class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;{&lt;/span&gt;
-       &lt;span class=&quot;c1&quot;&gt;// use FlinkSubmitter instead of 
StormSubmitter&lt;/span&gt;
-       &lt;span class=&quot;n&quot;&gt;FlinkSubmitter&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;submitTopology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;WordCount&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;conf&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;topology&lt;/span&gt;&lt;span c [...]
-&lt;span 
class=&quot;o&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;As a shorter Flink-style alternative that replaces the Storm-style 
submission code, you can also use context-based job execution:&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// 
transform Storm topology to Flink program (as above)&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;FlinkTopology&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;topology&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;FlinkTopology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;createTopology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;builder&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;
-
-&lt;span class=&quot;c1&quot;&gt;// executes locally by default or remotely if 
submitted with Flink&amp;#39;s command-line client&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;topology&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;execute&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;After the code is packaged in a jar file (e.g., 
&lt;code&gt;StormWordCount.jar&lt;/code&gt;), it can be easily submitted to 
Flink via&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code&gt;bin/flink run 
StormWordCount.jar
-&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;The Spouts and Bolts, as well as the topology assembly code, are 
not changed at all!
-Only the translation and submission steps have to be changed to their 
Storm-API-compatible Flink counterparts.
-This allows for minimal code changes and easy adaptation to Flink.&lt;/p&gt;
-
-&lt;h3 
id=&quot;embedding-spouts-and-bolts-in-flink-programs&quot;&gt;Embedding Spouts 
and Bolts in Flink programs&lt;/h3&gt;
-
-&lt;p&gt;It is also possible to use Spouts and Bolts within a regular Flink 
DataStream program.
-The compatibility package provides wrapper classes for Spouts and Bolts which 
are implemented as a Flink &lt;code&gt;SourceFunction&lt;/code&gt; and 
&lt;code&gt;StreamOperator&lt;/code&gt; respectively.
-Those wrappers automatically translate incoming Flink POJO and 
&lt;code&gt;TupleXX&lt;/code&gt; records into Storm’s 
&lt;code&gt;Tuple&lt;/code&gt; type and emitted &lt;code&gt;Values&lt;/code&gt; 
back into either POJOs or &lt;code&gt;TupleXX&lt;/code&gt; types for further 
processing by Flink operators.
-As Storm is type-agnostic, the output type of embedded Spouts/Bolts must be 
specified manually to obtain a fully typed Flink streaming 
program.&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-java&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// use 
regular Flink streaming environment&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt; 
&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;StreamExecutionEnvironment&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getExecutionEnvironment&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;();&lt;/span&gt;
-
-&lt;span class=&quot;c1&quot;&gt;// use Spout as source&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Tuple1&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;source&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; 
-  &lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;addSource&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;c1&quot;&gt;// Flink 
provided wrapper including original Spout&lt;/span&gt;
-                &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;SpoutWrapper&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;nf&quot;&gt;FileSpout&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;localFilePath&lt;/span&gt;&lt; [...]
-                &lt;span class=&quot;c1&quot;&gt;// specify output type 
manually&lt;/span&gt;
-                &lt;span 
class=&quot;n&quot;&gt;TypeExtractor&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getForObject&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;Tuple1&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;(&lt;/span&gt;&lt;span c [...]
-&lt;span class=&quot;c1&quot;&gt;// FileSpout cannot be 
parallelized&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Tuple1&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quo [...]
-
-&lt;span class=&quot;c1&quot;&gt;// further processing with Flink&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Tuple2&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;tokens&lt;/span&gt; &lt;span class=&quot;o&qu [...]
-
-&lt;span class=&quot;c1&quot;&gt;// use Bolt for counting&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;DataStream&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Tuple2&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;counts&lt;/span&gt; &lt;span class=&quot;o&qu [...]
-  &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;transform&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;Counter&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt;
-                   &lt;span class=&quot;c1&quot;&gt;// specify output type 
manually&lt;/span&gt;
-                   &lt;span 
class=&quot;n&quot;&gt;TypeExtractor&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;getForObject&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;Tuple2&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span class= [...]
-                   &lt;span class=&quot;c1&quot;&gt;// Flink provided wrapper 
including original Bolt&lt;/span&gt;
-                   &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;BoltWrapper&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;Tuple2&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;String&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;,&lt;/span&gt;&lt;span class=&q [...]
-
-&lt;span class=&quot;c1&quot;&gt;// write result to file via Flink 
sink&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;counts&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;writeAsText&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;n&quot;&gt;outputPath&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;
-
-&lt;span class=&quot;c1&quot;&gt;// start Flink job&lt;/span&gt;
-&lt;span class=&quot;n&quot;&gt;env&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span 
class=&quot;na&quot;&gt;execute&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;s&quot;&gt;&amp;quot;WordCount with Spout source and Bolt 
counter&amp;quot;&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;Although some boilerplate code is needed (we plan to address this 
soon!), the actual embedded Spout and Bolt code can be used unmodified.
-We also note that the resulting program is fully typed, and type errors will 
be found by Flink’s type extractor even if the original Spouts and Bolts are 
not.&lt;/p&gt;
-
-&lt;h2 id=&quot;outlook&quot;&gt;Outlook&lt;/h2&gt;
-
-&lt;p&gt;The Storm compatibility package is currently in beta and undergoes 
continuous development.
-We are currently working on providing consistency guarantees for stateful 
Bolts.
-Furthermore, we want to provide a better API integration for embedded Spouts 
and Bolts by providing a “StormExecutionEnvironment” as a special extension of 
Flink’s &lt;code&gt;StreamExecutionEnvironment&lt;/code&gt;.
-We are also investigating the integration of Storm’s higher-level programming 
API Trident.&lt;/p&gt;
-
-&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;
-
-&lt;p&gt;Flink’s compatibility package for Storm allows using unmodified 
Spouts and Bolts within Flink.
-This even enables you to embed third-party Spouts and Bolts whose source 
code is not available.
-While you can embed Spouts/Bolts in a Flink program and mix-and-match them 
with Flink operators, running whole topologies is the easiest way to get 
started and can be achieved with almost no code changes.&lt;/p&gt;
-
-&lt;p&gt;If you want to try out Flink’s Storm compatibility package, check out 
our &lt;a 
href=&quot;https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/storm_compatibility.html&quot;&gt;Documentation&lt;/a&gt;.&lt;/p&gt;
-
-&lt;hr /&gt;
-
-&lt;p&gt;&lt;sup id=&quot;fn1&quot;&gt;1. We confess, there are three lines 
changed compared to a Storm project &lt;img class=&quot;emoji&quot; 
style=&quot;width:16px;height:16px;align:absmiddle&quot; 
src=&quot;/img/blog/smirk.png&quot; /&gt;—because the example covers local 
&lt;em&gt;and&lt;/em&gt; remote execution. &lt;a href=&quot;#ref1&quot; 
title=&quot;Back to text.&quot;&gt;↩&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
-
-</description>
-<pubDate>Fri, 11 Dec 2015 11:00:00 +0100</pubDate>
-<link>https://flink.apache.org/news/2015/12/11/storm-compatibility.html</link>
-<guid isPermaLink="true">/news/2015/12/11/storm-compatibility.html</guid>
-</item>
-
 </channel>
 </rss>
diff --git a/content/blog/index.html b/content/blog/index.html
index 009ad78..af9a77b 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -209,6 +209,19 @@
     <hr>
     
     <article>
+      <h2 class="blog-title"><a 
href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: The 
integration of Pandas into PyFlink</a></h2>
+
+      <p>28 Jul 2020
+       Jincheng Sun (<a 
href="https://twitter.com/sunjincheng121";>@sunjincheng121</a>) &amp; Markos 
Sfikas (<a href="https://twitter.com/MarkSfik";>@MarkSfik</a>)</p>
+
+      <p>The Apache Flink community put great effort into integrating Pandas 
with PyFlink in the latest release, Flink 1.11. The added features include 
support for Pandas UDFs and the conversion between Pandas 
DataFrames and Flink Tables. In this article, we introduce how these 
functionalities work and how to use them with a step-by-step example.</p>
+
+      <p><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">Continue 
reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></h2>
 
       <p>28 Jul 2020
@@ -330,19 +343,6 @@ and provide a tutorial for running Streaming ETL with 
Flink on Zeppelin.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2020/06/11/community-update.html">Flink Community Update - 
June'20</a></h2>
-
-      <p>11 Jun 2020
-       Marta Paes (<a href="https://twitter.com/morsapaes";>@morsapaes</a>)</p>
-
-      <p>And suddenly it’s June. The previous month has been calm on the 
surface, but quite hectic underneath — the final testing phase for Flink 1.11 
is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink 
has made it into Google Season of Docs 2020.</p>
-
-      <p><a href="/news/2020/06/11/community-update.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -385,6 +385,16 @@ and provide a tutorial for running Streaming ETL with 
Flink on Zeppelin.</p>
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html
index 676b25e..591ef5b 100644
--- a/content/blog/page10/index.html
+++ b/content/blog/page10/index.html
@@ -196,6 +196,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink 
1.1.0</a></h2>
+
+      <p>08 Aug 2016
+      </p>
+
+      <p><div class="alert alert-success"><strong>Important</strong>: The 
Maven artifacts published with version 1.1.0 on Maven central have a Hadoop 
dependency issue. It is highly recommended to use <strong>1.1.1</strong> or 
<strong>1.1.1-hadoop1</strong> as the Flink version.</div>
+
+</p>
+
+      <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2016/05/24/stream-sql.html">Stream 
Processing for Everyone with SQL and Apache Flink</a></h2>
 
       <p>24 May 2016 by Fabian Hueske (<a 
href="https://twitter.com/";>@fhueske</a>)
@@ -325,19 +340,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache 
Flink: How to run existing Storm topologies on Flink</a></h2>
-
-      <p>11 Dec 2015 by Matthias J. Sax (<a 
href="https://twitter.com/";>@MatthiasJSax</a>)
-      </p>
-
-      <p>In this blog post, we describe Flink's compatibility package for <a 
href="https://storm.apache.org";>Apache Storm</a> that allows embedding Spouts 
(sources) and Bolts (operators) in a regular Flink streaming job. Furthermore, 
the compatibility package provides a Storm-compatible API to execute 
whole Storm topologies with (almost) no code adaptation.</p>
-
-      <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -380,6 +382,16 @@
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html
index 9f40763..27263a4 100644
--- a/content/blog/page11/index.html
+++ b/content/blog/page11/index.html
@@ -196,6 +196,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2015/12/11/storm-compatibility.html">Storm Compatibility in Apache 
Flink: How to run existing Storm topologies on Flink</a></h2>
+
+      <p>11 Dec 2015 by Matthias J. Sax (<a 
href="https://twitter.com/";>@MatthiasJSax</a>)
+      </p>
+
+      <p>In this blog post, we describe Flink's compatibility package for <a 
href="https://storm.apache.org";>Apache Storm</a> that allows embedding Spouts 
(sources) and Bolts (operators) in a regular Flink streaming job. Furthermore, 
the compatibility package provides a Storm-compatible API to execute 
whole Storm topologies with (almost) no code adaptation.</p>
+
+      <p><a href="/news/2015/12/11/storm-compatibility.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2015/12/04/Introducing-windows.html">Introducing Stream Windows in 
Apache Flink</a></h2>
 
       <p>04 Dec 2015 by Fabian Hueske (<a 
href="https://twitter.com/";>@fhueske</a>)
@@ -333,20 +346,6 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits 
and Bytes</a></h2>
-
-      <p>11 May 2015 by Fabian Hüske (<a 
href="https://twitter.com/";>@fhueske</a>)
-      </p>
-
-      <p><p>Nowadays, a lot of open-source systems for analyzing large data 
sets are implemented in Java or other JVM-based programming languages. The most 
well-known example is Apache Hadoop, but also newer frameworks such as Apache 
Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that 
JVM-based data analysis engines face is to store large amounts of data in 
memory - both for caching and for efficient processing such as sorting and 
joining of data. Managing the [...]
-<p>In this blog post we discuss how Apache Flink manages memory, talk about 
its custom data de/serialization stack, and show how it operates on binary 
data.</p></p>
-
-      <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue 
reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -389,6 +388,16 @@ vertex-centric or gather-sum-apply to Flink dataflows.</p>
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html
index ba6dd47..fb9e164 100644
--- a/content/blog/page12/index.html
+++ b/content/blog/page12/index.html
@@ -196,6 +196,20 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Juggling with Bits 
and Bytes</a></h2>
+
+      <p>11 May 2015 by Fabian Hüske (<a 
href="https://twitter.com/";>@fhueske</a>)
+      </p>
+
+      <p><p>Nowadays, a lot of open-source systems for analyzing large data 
sets are implemented in Java or other JVM-based programming languages. The most 
well-known example is Apache Hadoop, but also newer frameworks such as Apache 
Spark, Apache Drill, and also Apache Flink run on JVMs. A common challenge that 
JVM-based data analysis engines face is to store large amounts of data in 
memory - both for caching and for efficient processing such as sorting and 
joining of data. Managing the [...]
+<p>In this blog post we discuss how Apache Flink manages memory, talk about 
its custom data de/serialization stack, and show how it operates on binary 
data.</p></p>
+
+      <p><a href="/news/2015/05/11/Juggling-with-Bits-and-Bytes.html">Continue 
reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2015/04/13/release-0.9.0-milestone1.html">Announcing Flink 
0.9.0-milestone1 preview release</a></h2>
 
       <p>13 Apr 2015
@@ -340,21 +354,6 @@ and offers a new API including definition of flexible 
windows.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2>
-
-      <p>04 Nov 2014
-      </p>
-
-      <p><p>We are pleased to announce the availability of Flink 0.7.0. This 
release includes new user-facing features as well as performance and bug fixes, 
brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total 
of 34 people have contributed to this release, a big thanks to all of them!</p>
-
-</p>
-
-      <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -397,6 +396,16 @@ and offers a new API including definition of flexible 
windows.</p>
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html
index 3da2d59..1450726 100644
--- a/content/blog/page13/index.html
+++ b/content/blog/page13/index.html
@@ -196,6 +196,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2014/11/04/release-0.7.0.html">Apache Flink 0.7.0 available</a></h2>
+
+      <p>04 Nov 2014
+      </p>
+
+      <p><p>We are pleased to announce the availability of Flink 0.7.0. This 
release includes new user-facing features as well as performance and bug fixes, 
brings the Scala and Java APIs in sync, and introduces Flink Streaming. A total 
of 34 people have contributed to this release, a big thanks to all of them!</p>
+
+</p>
+
+      <p><a href="/news/2014/11/04/release-0.7.0.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2014/10/03/upcoming_events.html">Upcoming Events</a></h2>
 
       <p>03 Oct 2014
@@ -285,6 +300,16 @@ academic and open source project that Flink originates 
from.</p>
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html
index 59e48c1..2a294d5 100644
--- a/content/blog/page2/index.html
+++ b/content/blog/page2/index.html
@@ -196,6 +196,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2020/06/11/community-update.html">Flink Community Update - 
June'20</a></h2>
+
+      <p>11 Jun 2020
 Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>And suddenly it’s June. The previous month has been calm on the 
surface, but quite hectic underneath — the final testing phase for Flink 1.11 
is moving at full speed, Stateful Functions 2.1 is out in the wild and Flink 
has made it into Google Season of Docs 2020.</p>
+
+      <p><a href="/news/2020/06/11/community-update.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2020/06/09/release-statefun-2.1.0.html">Stateful Functions 2.1.0 
Release Announcement</a></h2>
 
       <p>09 Jun 2020
@@ -321,19 +334,6 @@ This release marks a big milestone: Stateful Functions 2.0 
is not only an API up
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2020/04/01/community-update.html">Flink Community Update - 
April'20</a></h2>
-
-      <p>01 Apr 2020
-       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
-      <p>While things slow down around us, the Apache Flink community is 
privileged to remain as active as ever. This blogpost combs through the past 
few months to give you an update on the state of things in Flink — from core 
releases to Stateful Functions; from some good old community stats to a new 
development blog.</p>
-
-      <p><a href="/news/2020/04/01/community-update.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -376,6 +376,16 @@ This release marks a big milestone: Stateful Functions 2.0 
is not only an API up
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html
index 3f87858..7a9feec 100644
--- a/content/blog/page3/index.html
+++ b/content/blog/page3/index.html
@@ -196,6 +196,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2020/04/01/community-update.html">Flink Community Update - 
April'20</a></h2>
+
+      <p>01 Apr 2020
+       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>While things slow down around us, the Apache Flink community is 
privileged to remain as active as ever. This blogpost combs through the past 
few months to give you an update on the state of things in Flink — from core 
releases to Stateful Functions; from some good old community stats to a new 
development blog.</p>
+
+      <p><a href="/news/2020/04/01/community-update.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/features/2020/03/27/flink-for-data-warehouse.html">Flink as Unified 
Engine for Modern Data Warehousing: Production-Ready Hive Integration</a></h2>
 
       <p>27 Mar 2020
@@ -318,21 +331,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2>
-
-      <p>11 Dec 2019
-       Hequn Cheng </p>
-
-      <p><p>The Apache Flink community released the third bugfix version of 
the Apache Flink 1.8 series.</p>
-
-</p>
-
-      <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -375,6 +373,16 @@
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html
index e9e5df6..a1d94fe 100644
--- a/content/blog/page4/index.html
+++ b/content/blog/page4/index.html
@@ -196,6 +196,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2019/12/11/release-1.8.3.html">Apache Flink 1.8.3 Released</a></h2>
+
+      <p>11 Dec 2019
+       Hequn Cheng </p>
+
+      <p><p>The Apache Flink community released the third bugfix version of 
the Apache Flink 1.8 series.</p>
+
+</p>
+
+      <p><a href="/news/2019/12/11/release-1.8.3.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2019/12/09/flink-kubernetes-kudo.html">Running Apache Flink on 
Kubernetes with KUDO</a></h2>
 
       <p>09 Dec 2019
@@ -321,19 +336,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A 
Practical Guide to Broadcast State in Apache Flink</a></h2>
-
-      <p>26 Jun 2019
-       Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p>
-
-      <p>Apache Flink has multiple types of operator state, one of which is 
called Broadcast State. In this post, we explain what Broadcast State is, and 
show an example of how it can be applied to an application that evaluates 
dynamic patterns on an event stream.</p>
-
-      <p><a href="/2019/06/26/broadcast-state.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -376,6 +378,16 @@
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html
index dac3fb9..e105256 100644
--- a/content/blog/page5/index.html
+++ b/content/blog/page5/index.html
@@ -196,6 +196,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/2019/06/26/broadcast-state.html">A 
Practical Guide to Broadcast State in Apache Flink</a></h2>
+
+      <p>26 Jun 2019
+       Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p>
+
+      <p>Apache Flink has multiple types of operator state, one of which is 
called Broadcast State. In this post, we explain what Broadcast State is, and 
show an example of how it can be applied to an application that evaluates 
dynamic patterns on an event stream.</p>
+
+      <p><a href="/2019/06/26/broadcast-state.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/2019/06/05/flink-network-stack.html">A 
Deep-Dive into Flink's Network Stack</a></h2>
 
       <p>05 Jun 2019
@@ -320,21 +333,6 @@ for more details.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2>
-
-      <p>25 Feb 2019
-      </p>
-
-      <p><p>The Apache Flink community released the fourth bugfix version of 
the Apache Flink 1.6 series.</p>
-
-</p>
-
-      <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -377,6 +375,16 @@ for more details.</p>
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html
index 4bfda30..fffc301 100644
--- a/content/blog/page6/index.html
+++ b/content/blog/page6/index.html
@@ -196,6 +196,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2019/02/25/release-1.6.4.html">Apache Flink 1.6.4 Released</a></h2>
+
+      <p>25 Feb 2019
+      </p>
+
+      <p><p>The Apache Flink community released the fourth bugfix version of 
the Apache Flink 1.6 series.</p>
+
+</p>
+
+      <p><a href="/news/2019/02/25/release-1.6.4.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2019/02/15/release-1.7.2.html">Apache Flink 1.7.2 Released</a></h2>
 
       <p>15 Feb 2019
@@ -330,21 +345,6 @@ Please check the <a 
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2>
-
-      <p>20 Sep 2018
-      </p>
-
-      <p><p>The Apache Flink community released the fourth bugfix version of 
the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -387,6 +387,16 @@ Please check the <a 
href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html
index 7fb0917..1c0e8ec 100644
--- a/content/blog/page7/index.html
+++ b/content/blog/page7/index.html
@@ -196,6 +196,21 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2>
+
+      <p>20 Sep 2018
+      </p>
+
+      <p><p>The Apache Flink community released the fourth bugfix version of 
the Apache Flink 1.5 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2>
 
       <p>21 Aug 2018
@@ -328,19 +343,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State 
in Apache Flink: An Intro to Incremental Checkpointing</a></h2>
-
-      <p>30 Jan 2018
-       Stefan Richter (<a href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) &amp; Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p>
-
-      <p>Flink 1.3.0 introduced incremental checkpointing, making it possible 
for applications with large state to generate checkpoints more efficiently.</p>
-
-      <p><a 
href="/features/2018/01/30/incremental-checkpointing.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -383,6 +385,16 @@
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html
index 4151269..908fb48 100644
--- a/content/blog/page8/index.html
+++ b/content/blog/page8/index.html
@@ -196,6 +196,19 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/features/2018/01/30/incremental-checkpointing.html">Managing Large State 
in Apache Flink: An Intro to Incremental Checkpointing</a></h2>
+
+      <p>30 Jan 2018
+       Stefan Richter (<a href="https://twitter.com/StefanRRicther">@StefanRRicther</a>) &amp; Chris Ward (<a href="https://twitter.com/chrischinch">@chrischinch</a>)</p>
+
+      <p>Flink 1.3.0 introduced incremental checkpointing, making it possible 
for applications with large state to generate checkpoints more efficiently.</p>
+
+      <p><a 
href="/features/2018/01/30/incremental-checkpointing.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2017/12/21/2017-year-in-review.html">Apache Flink in 2017: Year in 
Review</a></h2>
 
       <p>21 Dec 2017
@@ -331,20 +344,6 @@ what’s coming in Flink 1.4.0 as well as a preview of what 
the Flink community
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic 
Tables</a></h2>
-
-      <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang
-      </p>
-
-      <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs 
for stream and batch processing, meaning that a query produces the same result 
when being evaluated on streaming or static data.</p>
-<p>In this blog post we discuss the future of these APIs and introduce the 
concept of Dynamic Tables. Dynamic tables will significantly expand the scope 
of the Table API and SQL on streams and enable many more advanced use cases. We 
discuss how streams and dynamic tables relate to each other and explain the 
semantics of continuously evaluating queries on dynamic tables.</p></p>
-
-      <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -387,6 +386,16 @@ what’s coming in Flink 1.4.0 as well as a preview of what 
the Flink community
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html
index e26b9e5..cbe1955 100644
--- a/content/blog/page9/index.html
+++ b/content/blog/page9/index.html
@@ -196,6 +196,20 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a 
href="/news/2017/04/04/dynamic-tables.html">Continuous Queries on Dynamic 
Tables</a></h2>
+
+      <p>04 Apr 2017 by Fabian Hueske, Shaoxuan Wang, and Xiaowei Jiang
+      </p>
+
+      <p><p>Flink's relational APIs, the Table API and SQL, are unified APIs 
for stream and batch processing, meaning that a query produces the same result 
when being evaluated on streaming or static data.</p>
+<p>In this blog post we discuss the future of these APIs and introduce the 
concept of Dynamic Tables. Dynamic tables will significantly expand the scope 
of the Table API and SQL on streams and enable many more advanced use cases. We 
discuss how streams and dynamic tables relate to each other and explain the 
semantics of continuously evaluating queries on dynamic tables.</p></p>
+
+      <p><a href="/news/2017/04/04/dynamic-tables.html">Continue reading 
&raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a 
href="/news/2017/03/29/table-sql-api-update.html">From Streams to Tables and 
Back Again: An Update on Flink's Table & SQL API</a></h2>
 
      <p>29 Mar 2017 by Timo Walther (<a href="https://twitter.com/">@twalthr</a>)
@@ -324,21 +338,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a 
href="/news/2016/08/08/release-1.1.0.html">Announcing Apache Flink 
1.1.0</a></h2>
-
-      <p>08 Aug 2016
-      </p>
-
-      <p><div class="alert alert-success"><strong>Important</strong>: The 
Maven artifacts published with version 1.1.0 on Maven central have a Hadoop 
dependency issue. It is highly recommended to use <strong>1.1.1</strong> or 
<strong>1.1.1-hadoop1</strong> as the Flink version.</div>
-
-</p>
-
-      <p><a href="/news/2016/08/08/release-1.1.0.html">Continue reading 
&raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -381,6 +380,16 @@
       
 
       
+      <li><a href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: 
The integration of Pandas into PyFlink</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
       <li><a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></li>
 
       
diff --git a/content/img/blog/2020-07-28-pyflink-pandas/mission-of-pyFlink.gif 
b/content/img/blog/2020-07-28-pyflink-pandas/mission-of-pyFlink.gif
new file mode 100644
index 0000000..dda05ff
Binary files /dev/null and 
b/content/img/blog/2020-07-28-pyflink-pandas/mission-of-pyFlink.gif differ
diff --git 
a/content/img/blog/2020-07-28-pyflink-pandas/python-scientific-stack.png 
b/content/img/blog/2020-07-28-pyflink-pandas/python-scientific-stack.png
new file mode 100644
index 0000000..4d84179
Binary files /dev/null and 
b/content/img/blog/2020-07-28-pyflink-pandas/python-scientific-stack.png differ
diff --git a/content/img/blog/2020-07-28-pyflink-pandas/vm-communication.png 
b/content/img/blog/2020-07-28-pyflink-pandas/vm-communication.png
new file mode 100644
index 0000000..228787a
Binary files /dev/null and 
b/content/img/blog/2020-07-28-pyflink-pandas/vm-communication.png differ
diff --git a/content/index.html b/content/index.html
index f0a9067..ae18b0c 100644
--- a/content/index.html
+++ b/content/index.html
@@ -571,6 +571,9 @@
         <dt> <a href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced 
Flink Application Patterns Vol.3: Custom Window Processing</a></dt>
         <dd>In this series of blog posts you will learn about powerful Flink 
patterns for building streaming applications.</dd>
       
+        <dt> <a 
href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: The 
integration of Pandas into PyFlink</a></dt>
+        <dd>The Apache Flink community put some great effort into integrating 
Pandas with PyFlink in the latest Flink version 1.11. Some of the added 
features include support for Pandas UDF and the conversion between Pandas 
DataFrame and Table. In this article, we will introduce how these 
functionalities work and how to use them with a step-by-step example.</dd>
+      
         <dt> <a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></dt>
         <dd>Apache Flink 1.11 has released many exciting new features, 
including many developments in Flink SQL which is evolving at a fast pace. This 
article takes a closer look at how to quickly build streaming applications with 
Flink SQL from a practical point of view.</dd>
       
@@ -581,11 +584,6 @@
         <dd><p>With an ever-growing number of people working with data, it’s a 
common practice for companies to build self-service platforms with the goal of 
democratizing their access across different teams and — especially — to enable 
users from any background to be independent in their data needs. In such 
environments, metadata management becomes a crucial aspect. Without it, users 
often work blindly, spending too much time searching for datasets and their 
location, figuring out data  [...]
 
 </dd>
-      
-        <dt> <a href="/news/2020/07/21/release-1.11.1.html">Apache Flink 
1.11.1 Released</a></dt>
-        <dd><p>The Apache Flink community released the first bugfix version of 
the Apache Flink 1.11 series.</p>
-
-</dd>
     
   </dl>
 
diff --git a/content/zh/index.html b/content/zh/index.html
index cc1f2e1..c4e878b 100644
--- a/content/zh/index.html
+++ b/content/zh/index.html
@@ -568,6 +568,9 @@
         <dt> <a href="/news/2020/07/30/demo-fraud-detection-3.html">Advanced 
Flink Application Patterns Vol.3: Custom Window Processing</a></dt>
         <dd>In this series of blog posts you will learn about powerful Flink 
patterns for building streaming applications.</dd>
       
+        <dt> <a 
href="/2020/07/28/pyflink-pandas-udf-support-flink.html">PyFlink: The 
integration of Pandas into PyFlink</a></dt>
+        <dd>The Apache Flink community put some great effort into integrating 
Pandas with PyFlink in the latest Flink version 1.11. Some of the added 
features include support for Pandas UDF and the conversion between Pandas 
DataFrame and Table. In this article, we will introduce how these 
functionalities work and how to use them with a step-by-step example.</dd>
+      
         <dt> <a 
href="/2020/07/28/flink-sql-demo-building-e2e-streaming-application.html">Flink 
SQL Demo: Building an End-to-End Streaming Application</a></dt>
         <dd>Apache Flink 1.11 has released many exciting new features, 
including many developments in Flink SQL which is evolving at a fast pace. This 
article takes a closer look at how to quickly build streaming applications with 
Flink SQL from a practical point of view.</dd>
       
@@ -578,11 +581,6 @@
         <dd><p>With an ever-growing number of people working with data, it’s a 
common practice for companies to build self-service platforms with the goal of 
democratizing their access across different teams and — especially — to enable 
users from any background to be independent in their data needs. In such 
environments, metadata management becomes a crucial aspect. Without it, users 
often work blindly, spending too much time searching for datasets and their 
location, figuring out data  [...]
 
 </dd>
-      
-        <dt> <a href="/news/2020/07/21/release-1.11.1.html">Apache Flink 
1.11.1 Released</a></dt>
-        <dd><p>The Apache Flink community released the first bugfix version of 
the Apache Flink 1.11 series.</p>
-
-</dd>
     
   </dl>
 
