http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/faq.html
----------------------------------------------------------------------
diff --git a/content/faq.html b/content/faq.html
index ca6df08..03a0a65 100644
--- a/content/faq.html
+++ b/content/faq.html
@@ -180,41 +180,41 @@ under the License.
 
 <div class="page-toc">
 <ul id="markdown-toc">
-  <li><a href="#general">General</a>    <ul>
-      <li><a href="#is-flink-a-hadoop-project">Is Flink a Hadoop 
Project?</a></li>
-      <li><a href="#do-i-have-to-install-apache-hadoop-to-use-flink">Do I have 
to install Apache Hadoop to use Flink?</a></li>
+  <li><a href="#general" id="markdown-toc-general">General</a>    <ul>
+      <li><a href="#is-flink-a-hadoop-project" 
id="markdown-toc-is-flink-a-hadoop-project">Is Flink a Hadoop Project?</a></li>
+      <li><a href="#do-i-have-to-install-apache-hadoop-to-use-flink" 
id="markdown-toc-do-i-have-to-install-apache-hadoop-to-use-flink">Do I have to 
install Apache Hadoop to use Flink?</a></li>
     </ul>
   </li>
-  <li><a href="#usage">Usage</a>    <ul>
-      <li><a href="#how-do-i-assess-the-progress-of-a-flink-program">How do I 
assess the progress of a Flink program?</a></li>
-      <li><a href="#how-can-i-figure-out-why-a-program-failed">How can I 
figure out why a program failed?</a></li>
-      <li><a href="#how-do-i-debug-flink-programs">How do I debug Flink 
programs?</a></li>
-      <li><a href="#what-is-the-parallelism-how-do-i-set-it">What is the 
parallelism? How do I set it?</a></li>
+  <li><a href="#usage" id="markdown-toc-usage">Usage</a>    <ul>
+      <li><a href="#how-do-i-assess-the-progress-of-a-flink-program" 
id="markdown-toc-how-do-i-assess-the-progress-of-a-flink-program">How do I 
assess the progress of a Flink program?</a></li>
+      <li><a href="#how-can-i-figure-out-why-a-program-failed" 
id="markdown-toc-how-can-i-figure-out-why-a-program-failed">How can I figure 
out why a program failed?</a></li>
+      <li><a href="#how-do-i-debug-flink-programs" 
id="markdown-toc-how-do-i-debug-flink-programs">How do I debug Flink 
programs?</a></li>
+      <li><a href="#what-is-the-parallelism-how-do-i-set-it" 
id="markdown-toc-what-is-the-parallelism-how-do-i-set-it">What is the 
parallelism? How do I set it?</a></li>
     </ul>
   </li>
-  <li><a href="#errors">Errors</a>    <ul>
-      <li><a href="#why-am-i-getting-a-nonserializableexception-">Why am I 
getting a “NonSerializableException” ?</a></li>
-      <li><a 
href="#in-scala-api-i-get-an-error-about-implicit-values-and-evidence-parameters">In
 Scala API, I get an error about implicit values and evidence 
parameters</a></li>
-      <li><a 
href="#i-get-an-error-message-saying-that-not-enough-buffers-are-available-how-do-i-fix-this">I
 get an error message saying that not enough buffers are available. How do I 
fix this?</a></li>
-      <li><a 
href="#my-job-fails-early-with-a-javaioeofexception-what-could-be-the-cause">My 
job fails early with a java.io.EOFException. What could be the cause?</a></li>
-      <li><a 
href="#my-job-fails-with-various-exceptions-from-the-hdfshadoop-code-what-can-i-do">My
 job fails with various exceptions from the HDFS/Hadoop code. What can I 
do?</a></li>
-      <li><a 
href="#in-eclipse-i-get-compilation-errors-in-the-scala-projects">In Eclipse, I 
get compilation errors in the Scala projects</a></li>
-      <li><a 
href="#my-program-does-not-compute-the-correct-result-why-are-my-custom-key-types">My
 program does not compute the correct result. Why are my custom key 
types</a></li>
-      <li><a 
href="#i-get-a-javalanginstantiationexception-for-my-data-type-what-is-wrong">I 
get a java.lang.InstantiationException for my data type, what is wrong?</a></li>
-      <li><a 
href="#i-cant-stop-flink-with-the-provided-stop-scripts-what-can-i-do">I 
can’t stop Flink with the provided stop-scripts. What can I do?</a></li>
-      <li><a href="#i-got-an-outofmemoryexception-what-can-i-do">I got an 
OutOfMemoryException. What can I do?</a></li>
-      <li><a href="#why-do-the-taskmanager-log-files-become-so-huge">Why do 
the TaskManager log files become so huge?</a></li>
-      <li><a 
href="#the-slot-allocated-for-my-task-manager-has-been-released-what-should-i-do">The
 slot allocated for my task manager has been released. What should I 
do?</a></li>
+  <li><a href="#errors" id="markdown-toc-errors">Errors</a>    <ul>
+      <li><a href="#why-am-i-getting-a-nonserializableexception-" 
id="markdown-toc-why-am-i-getting-a-nonserializableexception-">Why am I getting 
a “NonSerializableException” ?</a></li>
+      <li><a 
href="#in-scala-api-i-get-an-error-about-implicit-values-and-evidence-parameters"
 
id="markdown-toc-in-scala-api-i-get-an-error-about-implicit-values-and-evidence-parameters">In
 Scala API, I get an error about implicit values and evidence 
parameters</a></li>
+      <li><a 
href="#i-get-an-error-message-saying-that-not-enough-buffers-are-available-how-do-i-fix-this"
 
id="markdown-toc-i-get-an-error-message-saying-that-not-enough-buffers-are-available-how-do-i-fix-this">I
 get an error message saying that not enough buffers are available. How do I 
fix this?</a></li>
+      <li><a 
href="#my-job-fails-early-with-a-javaioeofexception-what-could-be-the-cause" 
id="markdown-toc-my-job-fails-early-with-a-javaioeofexception-what-could-be-the-cause">My
 job fails early with a java.io.EOFException. What could be the cause?</a></li>
+      <li><a 
href="#my-job-fails-with-various-exceptions-from-the-hdfshadoop-code-what-can-i-do"
 
id="markdown-toc-my-job-fails-with-various-exceptions-from-the-hdfshadoop-code-what-can-i-do">My
 job fails with various exceptions from the HDFS/Hadoop code. What can I 
do?</a></li>
+      <li><a href="#in-eclipse-i-get-compilation-errors-in-the-scala-projects" 
id="markdown-toc-in-eclipse-i-get-compilation-errors-in-the-scala-projects">In 
Eclipse, I get compilation errors in the Scala projects</a></li>
+      <li><a 
href="#my-program-does-not-compute-the-correct-result-why-are-my-custom-key-types"
 
id="markdown-toc-my-program-does-not-compute-the-correct-result-why-are-my-custom-key-types">My
 program does not compute the correct result. Why are my custom key 
types</a></li>
+      <li><a 
href="#i-get-a-javalanginstantiationexception-for-my-data-type-what-is-wrong" 
id="markdown-toc-i-get-a-javalanginstantiationexception-for-my-data-type-what-is-wrong">I
 get a java.lang.InstantiationException for my data type, what is 
wrong?</a></li>
+      <li><a 
href="#i-cant-stop-flink-with-the-provided-stop-scripts-what-can-i-do" 
id="markdown-toc-i-cant-stop-flink-with-the-provided-stop-scripts-what-can-i-do">I
 can’t stop Flink with the provided stop-scripts. What can I do?</a></li>
+      <li><a href="#i-got-an-outofmemoryexception-what-can-i-do" 
id="markdown-toc-i-got-an-outofmemoryexception-what-can-i-do">I got an 
OutOfMemoryException. What can I do?</a></li>
+      <li><a href="#why-do-the-taskmanager-log-files-become-so-huge" 
id="markdown-toc-why-do-the-taskmanager-log-files-become-so-huge">Why do the 
TaskManager log files become so huge?</a></li>
+      <li><a 
href="#the-slot-allocated-for-my-task-manager-has-been-released-what-should-i-do"
 
id="markdown-toc-the-slot-allocated-for-my-task-manager-has-been-released-what-should-i-do">The
 slot allocated for my task manager has been released. What should I 
do?</a></li>
     </ul>
   </li>
-  <li><a href="#yarn-deployment">YARN Deployment</a>    <ul>
-      <li><a href="#the-yarn-session-runs-only-for-a-few-seconds">The YARN 
session runs only for a few seconds</a></li>
-      <li><a 
href="#the-yarn-session-crashes-with-a-hdfs-permission-exception-during-startup">The
 YARN session crashes with a HDFS permission exception during startup</a></li>
+  <li><a href="#yarn-deployment" id="markdown-toc-yarn-deployment">YARN 
Deployment</a>    <ul>
+      <li><a href="#the-yarn-session-runs-only-for-a-few-seconds" 
id="markdown-toc-the-yarn-session-runs-only-for-a-few-seconds">The YARN session 
runs only for a few seconds</a></li>
+      <li><a 
href="#the-yarn-session-crashes-with-a-hdfs-permission-exception-during-startup"
 
id="markdown-toc-the-yarn-session-crashes-with-a-hdfs-permission-exception-during-startup">The
 YARN session crashes with a HDFS permission exception during startup</a></li>
     </ul>
   </li>
-  <li><a href="#features">Features</a>    <ul>
-      <li><a href="#what-kind-of-fault-tolerance-does-flink-provide">What kind 
of fault-tolerance does Flink provide?</a></li>
-      <li><a 
href="#are-hadoop-like-utilities-such-as-counters-and-the-distributedcache-supported">Are
 Hadoop-like utilities, such as Counters and the DistributedCache 
supported?</a></li>
+  <li><a href="#features" id="markdown-toc-features">Features</a>    <ul>
+      <li><a href="#what-kind-of-fault-tolerance-does-flink-provide" 
id="markdown-toc-what-kind-of-fault-tolerance-does-flink-provide">What kind of 
fault-tolerance does Flink provide?</a></li>
+      <li><a 
href="#are-hadoop-like-utilities-such-as-counters-and-the-distributedcache-supported"
 
id="markdown-toc-are-hadoop-like-utilities-such-as-counters-and-the-distributedcache-supported">Are
 Hadoop-like utilities, such as Counters and the DistributedCache 
supported?</a></li>
     </ul>
   </li>
 </ul>
@@ -444,7 +444,7 @@ cluster.sh</code>). You can kill their processes on 
Linux/Mac as follows:</p>
 <ul>
   <li>Determine the process id (pid) of the JobManager / TaskManager process. 
You
can use the <code>jps</code> command on Linux (if you have OpenJDK installed) 
or command
-<code>ps -ef | grep java</code> to find all Java processes. </li>
+<code>ps -ef | grep java</code> to find all Java processes.</li>
   <li>Kill the process with <code>kill -9 &lt;pid&gt;</code>, where 
<code>pid</code> is the process id of the
 affected JobManager or TaskManager process.</li>
 </ul>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/how-to-contribute.html
----------------------------------------------------------------------
diff --git a/content/how-to-contribute.html b/content/how-to-contribute.html
index 6701d75..9b681e9 100644
--- a/content/how-to-contribute.html
+++ b/content/how-to-contribute.html
@@ -161,17 +161,17 @@
 
 <div class="page-toc">
 <ul id="markdown-toc">
-  <li><a href="#ask-questions">Ask questions!</a></li>
-  <li><a href="#file-a-bug-report">File a bug report</a></li>
-  <li><a href="#propose-an-improvement-or-a-new-feature">Propose an 
improvement or a new feature</a></li>
-  <li><a href="#help-others-and-join-the-discussions">Help others and join the 
discussions</a></li>
-  <li><a href="#test-a-release-candidate">Test a release candidate</a></li>
-  <li><a href="#contribute-code">Contribute code</a></li>
-  <li><a href="#contribute-documentation">Contribute documentation</a></li>
-  <li><a href="#improve-the-website">Improve the website</a></li>
-  <li><a href="#more-ways-to-contribute">More ways to contribute…</a></li>
-  <li><a href="#submit-a-contributor-license-agreement">Submit a Contributor 
License Agreement</a></li>
-  <li><a href="#how-to-become-a-committer">How to become a committer</a></li>
+  <li><a href="#ask-questions" id="markdown-toc-ask-questions">Ask 
questions!</a></li>
+  <li><a href="#file-a-bug-report" id="markdown-toc-file-a-bug-report">File a 
bug report</a></li>
+  <li><a href="#propose-an-improvement-or-a-new-feature" 
id="markdown-toc-propose-an-improvement-or-a-new-feature">Propose an 
improvement or a new feature</a></li>
+  <li><a href="#help-others-and-join-the-discussions" 
id="markdown-toc-help-others-and-join-the-discussions">Help others and join the 
discussions</a></li>
+  <li><a href="#test-a-release-candidate" 
id="markdown-toc-test-a-release-candidate">Test a release candidate</a></li>
+  <li><a href="#contribute-code" id="markdown-toc-contribute-code">Contribute 
code</a></li>
+  <li><a href="#contribute-documentation" 
id="markdown-toc-contribute-documentation">Contribute documentation</a></li>
+  <li><a href="#improve-the-website" 
id="markdown-toc-improve-the-website">Improve the website</a></li>
+  <li><a href="#more-ways-to-contribute" 
id="markdown-toc-more-ways-to-contribute">More ways to contribute…</a></li>
+  <li><a href="#submit-a-contributor-license-agreement" 
id="markdown-toc-submit-a-contributor-license-agreement">Submit a Contributor 
License Agreement</a></li>
+  <li><a href="#how-to-become-a-committer" 
id="markdown-toc-how-to-become-a-committer">How to become a committer</a></li>
 </ul>
 
 </div>
@@ -198,7 +198,7 @@
  <li>It allows for constructive discussions that might arise around this 
issue.</li>
 </ul>
 
-<p>Detailed information is also required, if you plan to contribute the 
improvement or feature you proposed yourself. Please read the <a 
href="/contribute-code.html">Contribute code</a> guide in this case as well. 
</p>
+<p>Detailed information is also required, if you plan to contribute the 
improvement or feature you proposed yourself. Please read the <a 
href="/contribute-code.html">Contribute code</a> guide in this case as well.</p>
 
 <hr />
 
@@ -225,7 +225,7 @@
   <li>Going back to step 1 if the release candidate had issues otherwise we 
publish the release.</li>
 </ol>
 
-<p>Our wiki contains a page that summarizes the <a 
href="https://cwiki.apache.org/confluence/display/FLINK/Releasing";>test 
procedure for a release</a>. Release testing is a big effort if done by a small 
group of people but can be easily scaled out to more people. The Flink 
community encourages everybody to participate in the testing of a release 
candidate. By testing a release candidate, you can ensure that the next Flink 
release is working properly for your setup and help to improve the quality of 
releases. </p>
+<p>Our wiki contains a page that summarizes the <a 
href="https://cwiki.apache.org/confluence/display/FLINK/Releasing";>test 
procedure for a release</a>. Release testing is a big effort if done by a small 
group of people but can be easily scaled out to more people. The Flink 
community encourages everybody to participate in the testing of a release 
candidate. By testing a release candidate, you can ensure that the next Flink 
release is working properly for your setup and help to improve the quality of 
releases.</p>
 
 <hr />
 
@@ -239,7 +239,7 @@
 
 <h3 class="no_toc" id="looking-for-an-issue-to-work-on">Looking for an issue 
to work on?</h3>
 
-<p>We maintain a list of all known bugs, proposed improvements and suggested 
features in <a 
href="https://issues.apache.org/jira/browse/FLINK/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel";>Flink’s
 JIRA</a>. Issues that we believe are good tasks for new contributors are 
tagged with a special “starter” tag. Those tasks are supposed to be rather 
easy to solve and will help you to become familiar with the project and the 
contribution process. </p>
+<p>We maintain a list of all known bugs, proposed improvements and suggested 
features in <a 
href="https://issues.apache.org/jira/browse/FLINK/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel";>Flink’s
 JIRA</a>. Issues that we believe are good tasks for new contributors are 
tagged with a special “starter” tag. Those tasks are supposed to be rather 
easy to solve and will help you to become familiar with the project and the 
contribution process.</p>
 
 <p>Please have a look at the list of <a 
href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20starter%20ORDER%20BY%20priority%20DESC";>starter
 issues</a>, if you are looking for an issue to work on. You can of course also 
choose <a 
href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC";>any
 other issue</a> to work on. Feel free to ask questions about issues that you 
would be interested in working on.</p>
 
@@ -304,7 +304,7 @@
 
 <h2 id="how-to-become-a-committer">How to become a committer</h2>
 
-<p>Committers are community members that have write access to the project’s 
repositories, i.e., they can modify the code, documentation, and website by 
themselves and also accept other contributions. </p>
+<p>Committers are community members that have write access to the project’s 
repositories, i.e., they can modify the code, documentation, and website by 
themselves and also accept other contributions.</p>
 
 <p>There is no strict protocol for becoming a committer. Candidates for new 
committers are typically people that are active contributors and community 
members.</p>
 
@@ -312,7 +312,7 @@
 
 <p>Of course, contributing code and documentation to the project is important 
as well. A good way to start is contributing improvements, new features, or bug 
fixes. You need to show that you take responsibility for the code that you 
contribute, add tests and documentation, and help maintaining it.</p>
 
-<p>Candidates for new committers are suggested by current committers or PMC 
members, and voted upon by the PMC. </p>
+<p>Candidates for new committers are suggested by current committers or PMC 
members, and voted upon by the PMC.</p>
 
 <p>If you would like to become a committer, you should engage with the 
community and start contributing to Apache Flink in any of the above ways. You 
might also want to talk to other committers and ask for their advice and 
guidance.</p>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/improve-website.html
----------------------------------------------------------------------
diff --git a/content/improve-website.html b/content/improve-website.html
index e80fefb..e35a199 100644
--- a/content/improve-website.html
+++ b/content/improve-website.html
@@ -169,11 +169,11 @@
 
 <div class="page-toc">
 <ul id="markdown-toc">
-  <li><a href="#obtain-the-website-sources">Obtain the website sources</a></li>
-  <li><a href="#directory-structure-and-files">Directory structure and 
files</a></li>
-  <li><a href="#update-or-extend-the-documentation">Update or extend the 
documentation</a></li>
-  <li><a href="#submit-your-contribution">Submit your contribution</a></li>
-  <li><a href="#committer-section">Committer section</a></li>
+  <li><a href="#obtain-the-website-sources" 
id="markdown-toc-obtain-the-website-sources">Obtain the website sources</a></li>
+  <li><a href="#directory-structure-and-files" 
id="markdown-toc-directory-structure-and-files">Directory structure and 
files</a></li>
+  <li><a href="#update-or-extend-the-documentation" 
id="markdown-toc-update-or-extend-the-documentation">Update or extend the 
documentation</a></li>
+  <li><a href="#submit-your-contribution" 
id="markdown-toc-submit-your-contribution">Submit your contribution</a></li>
+  <li><a href="#committer-section" 
id="markdown-toc-committer-section">Committer section</a></li>
 </ul>
 
 </div>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/material.html
----------------------------------------------------------------------
diff --git a/content/material.html b/content/material.html
index 6c4f4cf..5dc923c 100644
--- a/content/material.html
+++ b/content/material.html
@@ -159,14 +159,14 @@
 
 <div class="page-toc">
 <ul id="markdown-toc">
-  <li><a href="#apache-flink-logos">Apache Flink Logos</a>    <ul>
-      <li><a href="#portable-network-graphics-png">Portable Network Graphics 
(PNG)</a></li>
-      <li><a href="#scalable-vector-graphics-svg">Scalable Vector Graphics 
(SVG)</a></li>
-      <li><a href="#photoshop-psd">Photoshop (PSD)</a></li>
+  <li><a href="#apache-flink-logos" 
id="markdown-toc-apache-flink-logos">Apache Flink Logos</a>    <ul>
+      <li><a href="#portable-network-graphics-png" 
id="markdown-toc-portable-network-graphics-png">Portable Network Graphics 
(PNG)</a></li>
+      <li><a href="#scalable-vector-graphics-svg" 
id="markdown-toc-scalable-vector-graphics-svg">Scalable Vector Graphics 
(SVG)</a></li>
+      <li><a href="#photoshop-psd" id="markdown-toc-photoshop-psd">Photoshop 
(PSD)</a></li>
     </ul>
   </li>
-  <li><a href="#color-scheme">Color Scheme</a></li>
-  <li><a href="#slides">Slides</a></li>
+  <li><a href="#color-scheme" id="markdown-toc-color-scheme">Color 
Scheme</a></li>
+  <li><a href="#slides" id="markdown-toc-slides">Slides</a></li>
 </ul>
 
 </div>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2014/01/13/stratosphere-release-0.4.html
----------------------------------------------------------------------
diff --git a/content/news/2014/01/13/stratosphere-release-0.4.html 
b/content/news/2014/01/13/stratosphere-release-0.4.html
index f6e28bd..028f019 100644
--- a/content/news/2014/01/13/stratosphere-release-0.4.html
+++ b/content/news/2014/01/13/stratosphere-release-0.4.html
@@ -159,7 +159,7 @@
       <article>
         <p>13 Jan 2014</p>
 
-<p>We are pleased to announce that version 0.4 of the Stratosphere system has 
been released. </p>
+<p>We are pleased to announce that version 0.4 of the Stratosphere system has 
been released.</p>
 
 <p>Our team has been working hard during the last few months to create an 
improved and stable Stratosphere version. The new version comes with many new 
features, usability and performance improvements in all levels, including a new 
Scala API for the concise specification of programs, a Pregel-like API, support 
for Yarn clusters, and major performance improvements. The system features now 
first-class support for iterative programs and thus covers traditional 
analytical use cases as well as data mining and graph processing use cases with 
great performance.</p>
 
@@ -187,7 +187,7 @@ Follow <a href="/docs/0.4/setup/yarn.html">our guide</a> on 
how to start a Strat
 <p>The high-level language Meteor now natively serializes JSON trees for 
greater performance and offers additional operators and file formats. We 
greatly empowered the user to write crispier scripts by adding second-order 
functions, multi-output operators, and other syntactical sugar. For developers 
of Meteor packages, the API is much more comprehensive and allows to define 
custom data types that can be easily embedded in JSON trees through ad-hoc byte 
code generation.</p>
 
 <h3 id="spargel-pregel-inspired-graph-processing">Spargel: Pregel Inspired 
Graph Processing</h3>
-<p>Spargel is a vertex-centric API similar to the interface proposed in 
Google’s Pregel paper and implemented in Apache Giraph. Spargel is 
implemented in 500 lines of code (including comments) on top of 
Stratosphere’s delta iterations feature. This confirms the flexibility of 
Stratosphere’s architecture. </p>
+<p>Spargel is a vertex-centric API similar to the interface proposed in 
Google’s Pregel paper and implemented in Apache Giraph. Spargel is 
implemented in 500 lines of code (including comments) on top of 
Stratosphere’s delta iterations feature. This confirms the flexibility of 
Stratosphere’s architecture.</p>
 
 <h3 id="web-frontend">Web Frontend</h3>
 <p>Using the new web frontend, you can monitor the progress of Stratosphere 
jobs. For finished jobs, the frontend shows a breakdown of the execution times 
for each operator. The webclient also visualizes the execution strategies 
chosen by the optimizer.</p>
@@ -215,7 +215,7 @@ Follow <a href="/docs/0.4/setup/yarn.html">our guide</a> on 
how to start a Strat
 </ul>
 
 <h3 id="download-and-get-started-with-stratosphere-v04">Download and get 
started with Stratosphere v0.4</h3>
-<p>There are several options for getting started with Stratosphere. </p>
+<p>There are several options for getting started with Stratosphere.</p>
 
 <ul>
   <li>Download it on the <a href="/downloads">download page</a></li>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2014/02/18/amazon-elastic-mapreduce-cloud-yarn.html
----------------------------------------------------------------------
diff --git a/content/news/2014/02/18/amazon-elastic-mapreduce-cloud-yarn.html 
b/content/news/2014/02/18/amazon-elastic-mapreduce-cloud-yarn.html
index b30d553..eca3ce2 100644
--- a/content/news/2014/02/18/amazon-elastic-mapreduce-cloud-yarn.html
+++ b/content/news/2014/02/18/amazon-elastic-mapreduce-cloud-yarn.html
@@ -229,7 +229,7 @@
 ssh had...@ec2-54-213-61-105.us-west-2.compute.amazonaws.com -i 
~/Downloads/work-laptop.pem</code></pre></div>
 
 <p>(Windows users have to follow <a 
href="http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-connect-master-node-ssh.html";>these
 instructions</a> to SSH into the machine running the master.) 
&lt;/br&gt;&lt;/br&gt;
-Once connected to the master, download and start Stratosphere for YARN: </p>
+Once connected to the master, download and start Stratosphere for YARN:</p>
 <ul>
        <li>Download and extract Stratosphere-YARN</li>
 
@@ -252,11 +252,11 @@ The arguments have the following meaning
        </ul>
 </ul>
 
-<p>Once the output has changed from </p>
+<p>Once the output has changed from</p>
 
 <div class="highlight"><pre><code class="language-bash" 
data-lang="bash">JobManager is now running on N/A:6123</code></pre></div>
 
-<p>to </p>
+<p>to</p>
 
 <div class="highlight"><pre><code class="language-bash" 
data-lang="bash">JobManager is now running on 
ip-172-31-13-68.us-west-2.compute.internal:6123</code></pre></div>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2014/11/04/release-0.7.0.html
----------------------------------------------------------------------
diff --git a/content/news/2014/11/04/release-0.7.0.html 
b/content/news/2014/11/04/release-0.7.0.html
index 33825e7..9d97cb6 100644
--- a/content/news/2014/11/04/release-0.7.0.html
+++ b/content/news/2014/11/04/release-0.7.0.html
@@ -179,7 +179,7 @@
 
 <p><strong>Record API deprecated:</strong> The (old) Stratosphere Record API 
has been marked as deprecated and is planned for removal in the 0.9.0 
release.</p>
 
-<p><strong>BLOB service:</strong> This release contains a new service to 
distribute jar files and other binary data among the JobManager, TaskManagers 
and the client. </p>
+<p><strong>BLOB service:</strong> This release contains a new service to 
distribute jar files and other binary data among the JobManager, TaskManagers 
and the client.</p>
 
 <p><strong>Intermediate data sets:</strong> A major rewrite of the system 
internals introduces intermediate data sets as first class citizens. The 
internal state machine that tracks the distributed tasks has also been 
completely rewritten for scalability. While this is not visible as a 
user-facing feature yet, it is the foundation for several upcoming exciting 
features.</p>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2014/11/18/hadoop-compatibility.html
----------------------------------------------------------------------
diff --git a/content/news/2014/11/18/hadoop-compatibility.html 
b/content/news/2014/11/18/hadoop-compatibility.html
index 0db27fd..067d456 100644
--- a/content/news/2014/11/18/hadoop-compatibility.html
+++ b/content/news/2014/11/18/hadoop-compatibility.html
@@ -167,7 +167,7 @@
 <img src="/img/blog/hcompat-logos.png" style="width:30%;margin:15px" />
 </center>
 
-<p>To close this gap, Flink provides a Hadoop Compatibility package to wrap 
functions implemented against Hadoop’s MapReduce interfaces and embed them in 
Flink programs. This package was developed as part of a <a 
href="https://developers.google.com/open-source/soc/";>Google Summer of Code</a> 
2014 project. </p>
+<p>To close this gap, Flink provides a Hadoop Compatibility package to wrap 
functions implemented against Hadoop’s MapReduce interfaces and embed them in 
Flink programs. This package was developed as part of a <a 
href="https://developers.google.com/open-source/soc/";>Google Summer of Code</a> 
2014 project.</p>
 
 <p>With the Hadoop Compatibility package, you can reuse all your Hadoop</p>
 
@@ -180,7 +180,7 @@
 
 <p>in Flink programs without changing a line of code. Moreover, Flink also 
natively supports all Hadoop data types (<code>Writables</code> and 
<code>WritableComparable</code>).</p>
 
-<p>The following code snippet shows a simple Flink WordCount program that 
solely uses Hadoop data types, InputFormat, OutputFormat, Mapper, and Reducer 
functions. </p>
+<p>The following code snippet shows a simple Flink WordCount program that 
solely uses Hadoop data types, InputFormat, OutputFormat, Mapper, and Reducer 
functions.</p>
 
 <div class="highlight"><pre><code class="language-java"><span class="c1">// 
Definition of Hadoop Mapper function</span>
 <span class="kd">public</span> <span class="kd">class</span> <span 
class="nc">Tokenizer</span> <span class="kd">implements</span> <span 
class="n">Mapper</span><span class="o">&lt;</span><span 
class="n">LongWritable</span><span class="o">,</span> <span 
class="n">Text</span><span class="o">,</span> <span class="n">Text</span><span 
class="o">,</span> <span class="n">LongWritable</span><span 
class="o">&gt;</span> <span class="o">{</span> <span class="o">...</span> <span 
class="o">}</span>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/01/21/release-0.8.html
----------------------------------------------------------------------
diff --git a/content/news/2015/01/21/release-0.8.html 
b/content/news/2015/01/21/release-0.8.html
index abcd1dc..ee5c54a 100644
--- a/content/news/2015/01/21/release-0.8.html
+++ b/content/news/2015/01/21/release-0.8.html
@@ -210,7 +210,7 @@
   <li>Stefan Bunk</li>
   <li>Paris Carbone</li>
   <li>Ufuk Celebi</li>
-  <li>Nils Engelbach </li>
+  <li>Nils Engelbach</li>
   <li>Stephan Ewen</li>
   <li>Gyula Fora</li>
   <li>Gabor Hermann</li>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/02/04/january-in-flink.html
----------------------------------------------------------------------
diff --git a/content/news/2015/02/04/january-in-flink.html 
b/content/news/2015/02/04/january-in-flink.html
index fb784a9..96b73f6 100644
--- a/content/news/2015/02/04/january-in-flink.html
+++ b/content/news/2015/02/04/january-in-flink.html
@@ -191,7 +191,7 @@
 
 <h3 id="using-off-heap-memoryhttpsgithubcomapacheflinkpull290"><a 
href="https://github.com/apache/flink/pull/290";>Using off-heap memory</a></h3>
 
-<p>This pull request enables Flink to use off-heap memory for its internal 
memory uses (sort, hash, caching of intermediate data sets). </p>
+<p>This pull request enables Flink to use off-heap memory for its internal 
memory uses (sort, hash, caching of intermediate data sets).</p>
 
 <h3 id="gelly-flinks-graph-apihttpsgithubcomapacheflinkpull335"><a 
href="https://github.com/apache/flink/pull/335";>Gelly, Flink’s Graph 
API</a></h3>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/02/09/streaming-example.html
----------------------------------------------------------------------
diff --git a/content/news/2015/02/09/streaming-example.html 
b/content/news/2015/02/09/streaming-example.html
index e60e63e..3b3fd55 100644
--- a/content/news/2015/02/09/streaming-example.html
+++ b/content/news/2015/02/09/streaming-example.html
@@ -195,7 +195,7 @@ found <a 
href="https://github.com/mbalassi/flink/blob/stockprices/flink-staging/
   <li>Read a socket stream of stock prices</li>
   <li>Parse the text in the stream to create a stream of 
<code>StockPrice</code> objects</li>
   <li>Add four other sources tagged with the stock symbol.</li>
-  <li>Finally, merge the streams to create a unified stream. </li>
+  <li>Finally, merge the streams to create a unified stream.</li>
 </ol>
 
 <p><img alt="Reading from multiple inputs" 
src="/img/blog/blog_multi_input.png" width="70%" class="img-responsive 
center-block" /></p>
@@ -667,7 +667,7 @@ number of mentions of a given stock in the Twitter stream. 
As both of
 these data streams are potentially infinite, we apply the join on a
 30-second window.</p>
 
-<p><img alt="Streaming joins" src="/img/blog/blog_stream_join.png" width="60%" 
class="img-responsive center-block" /> </p>
+<p><img alt="Streaming joins" src="/img/blog/blog_stream_join.png" width="60%" 
class="img-responsive center-block" /></p>
 
 <div class="codetabs">
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
----------------------------------------------------------------------
diff --git 
a/content/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html 
b/content/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
index 93566d6..5f4c5af 100644
--- a/content/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
+++ b/content/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
@@ -166,7 +166,7 @@
 <p>In this blog post, we cut through Apache Flink’s layered architecture and 
take a look at its internals with a focus on how it handles joins. 
Specifically, I will</p>
 
 <ul>
-  <li>show how easy it is to join data sets using Flink’s fluent APIs, </li>
+  <li>show how easy it is to join data sets using Flink’s fluent APIs,</li>
  <li>discuss basic distributed join strategies, Flink’s join 
implementations, and its memory management,</li>
  <li>talk about Flink’s optimizer that automatically chooses join 
strategies,</li>
   <li>show some performance numbers for joining data sets of different sizes, 
and finally</li>
@@ -177,7 +177,7 @@
 
 <h3 id="how-do-i-join-with-flink">How do I join with Flink?</h3>
 
-<p>Flink provides fluent APIs in Java and Scala to write data flow programs. 
Flink’s APIs are centered around parallel data collections which are called 
data sets. data sets are processed by applying Transformations that compute new 
data sets. Flink’s transformations include Map and Reduce as known from 
MapReduce <a href="http://research.google.com/archive/mapreduce.html";>[1]</a> 
but also operators for joining, co-grouping, and iterative processing. The 
documentation gives an overview of all available transformations <a 
href="http://ci.apache.org/projects/flink/flink-docs-release-0.8/dataset_transformations.html";>[2]</a>.
 </p>
+<p>Flink provides fluent APIs in Java and Scala to write data flow programs. 
Flink’s APIs are centered around parallel data collections which are called 
data sets. data sets are processed by applying Transformations that compute new 
data sets. Flink’s transformations include Map and Reduce as known from 
MapReduce <a href="http://research.google.com/archive/mapreduce.html";>[1]</a> 
but also operators for joining, co-grouping, and iterative processing. The 
documentation gives an overview of all available transformations <a 
href="http://ci.apache.org/projects/flink/flink-docs-release-0.8/dataset_transformations.html";>[2]</a>.</p>
 
 <p>Joining two Scala case class data sets is very easy as the following 
example shows:</p>
 
@@ -214,7 +214,7 @@
 
 <ol>
   <li>The data of both inputs is distributed across all parallel instances 
that participate in the join and</li>
-  <li>each parallel instance performs a standard stand-alone join algorithm on 
its local partition of the overall data. </li>
+  <li>each parallel instance performs a standard stand-alone join algorithm on 
its local partition of the overall data.</li>
 </ol>
 
 <p>The distribution of data across parallel instances must ensure that each 
valid join pair can be locally built by exactly one instance. For both steps, 
there are multiple valid strategies that can be independently picked and which 
are favorable in different situations. In Flink terminology, the first phase is 
called Ship Strategy and the second phase Local Strategy. In the following I 
will describe Flink’s ship and local strategies to join two data sets 
<em>R</em> and <em>S</em>.</p>
@@ -233,7 +233,7 @@
 <img src="/img/blog/joins-repartition.png" style="width:90%;margin:15px" />
 </center>
 
-<p>The Broadcast-Forward strategy sends one complete data set (R) to each 
parallel instance that holds a partition of the other data set (S), i.e., each 
parallel instance receives the full data set R. Data set S remains local and is 
not shipped at all. The cost of the BF strategy depends on the size of R and 
the number of parallel instances it is shipped to. The size of S does not 
matter because S is not moved. The figure below illustrates how both ship 
strategies work. </p>
+<p>The Broadcast-Forward strategy sends one complete data set (R) to each 
parallel instance that holds a partition of the other data set (S), i.e., each 
parallel instance receives the full data set R. Data set S remains local and is 
not shipped at all. The cost of the BF strategy depends on the size of R and 
the number of parallel instances it is shipped to. The size of S does not 
matter because S is not moved. The figure below illustrates how both ship 
strategies work.</p>
 
 <center>
 <img src="/img/blog/joins-broadcast.png" style="width:90%;margin:15px" />
@@ -242,7 +242,7 @@
 <p>The Repartition-Repartition and Broadcast-Forward ship strategies establish 
suitable data distributions to execute a distributed join. Depending on the 
operations that are applied before the join, one or even both inputs of a join 
are already distributed in a suitable way across parallel instances. In this 
case, Flink will reuse such distributions and only ship one or no input at 
all.</p>
 
 <h4 id="flinks-memory-management">Flink’s Memory Management</h4>
-<p>Before delving into the details of Flink’s local join algorithms, I will 
briefly discuss Flink’s internal memory management. Data processing 
algorithms such as joining, grouping, and sorting need to hold portions of 
their input data in memory. While such algorithms perform best if there is 
enough memory available to hold all data, it is crucial to gracefully handle 
situations where the data size exceeds memory. Such situations are especially 
tricky in JVM-based systems such as Flink because the system needs to reliably 
recognize that it is short on memory. Failure to detect such situations can 
result in an <code>OutOfMemoryException</code> and kill the JVM. </p>
+<p>Before delving into the details of Flink’s local join algorithms, I will 
briefly discuss Flink’s internal memory management. Data processing 
algorithms such as joining, grouping, and sorting need to hold portions of 
their input data in memory. While such algorithms perform best if there is 
enough memory available to hold all data, it is crucial to gracefully handle 
situations where the data size exceeds memory. Such situations are especially 
tricky in JVM-based systems such as Flink because the system needs to reliably 
recognize that it is short on memory. Failure to detect such situations can 
result in an <code>OutOfMemoryException</code> and kill the JVM.</p>
 
 <p>Flink handles this challenge by actively managing its memory. When a worker 
node (TaskManager) is started, it allocates a fixed portion (70% by default) of 
the JVM’s heap memory that is available after initialization as 32KB byte 
arrays. These byte arrays are distributed as working memory to all algorithms 
that need to hold significant portions of data in memory. The algorithms 
receive their input data as Java data objects and serialize them into their 
working memory.</p>
 
@@ -259,7 +259,7 @@
 <p>After the data has been distributed across all parallel join instances 
using either a Repartition-Repartition or Broadcast-Forward ship strategy, each 
instance runs a local join algorithm to join the elements of its local 
partition. Flink’s runtime features two common join strategies to perform 
these local joins:</p>
 
 <ul>
-  <li>the <em>Sort-Merge-Join</em> strategy (SM) and </li>
+  <li>the <em>Sort-Merge-Join</em> strategy (SM) and</li>
   <li>the <em>Hybrid-Hash-Join</em> strategy (HH).</li>
 </ul>
 
@@ -304,13 +304,13 @@
 <ul>
   <li>1GB     : 1000GB</li>
   <li>10GB    : 1000GB</li>
-  <li>100GB   : 1000GB </li>
+  <li>100GB   : 1000GB</li>
   <li>1000GB  : 1000GB</li>
 </ul>
 
 <p>The Broadcast-Forward strategy is only executed for up to 10GB. Building a 
hash table from 100GB broadcasted data in 5GB working memory would result in 
spilling approximately 95GB (build input) + 950GB (probe input) in each parallel 
thread and require more than 8TB local disk storage on each machine.</p>
 
-<p>As in the single-core benchmark, we run 1:N joins, generate the data 
on-the-fly, and immediately discard the result after the join. We run the 
benchmark on 10 n1-highmem-8 Google Compute Engine instances. Each instance is 
equipped with 8 cores, 52GB RAM, 40GB of which are configured as working memory 
(5GB per core), and one local SSD for spilling to disk. All benchmarks are 
performed using the same configuration, i.e., no fine tuning for the respective 
data sizes is done. The programs are executed with a parallelism of 80. </p>
+<p>As in the single-core benchmark, we run 1:N joins, generate the data 
on-the-fly, and immediately discard the result after the join. We run the 
benchmark on 10 n1-highmem-8 Google Compute Engine instances. Each instance is 
equipped with 8 cores, 52GB RAM, 40GB of which are configured as working memory 
(5GB per core), and one local SSD for spilling to disk. All benchmarks are 
performed using the same configuration, i.e., no fine tuning for the respective 
data sizes is done. The programs are executed with a parallelism of 80.</p>
 
 <center>
 <img src="/img/blog/joins-dist-perf.png" style="width:70%;margin:15px" />
@@ -327,7 +327,7 @@
 <ul>
  <li>Flink’s fluent Scala and Java APIs make joins and other data 
transformations easy as cake.</li>
   <li>The optimizer does the hard choices for you, but gives you control in 
case you know better.</li>
-  <li>Flink’s join implementations perform very well in-memory and 
gracefully degrade when going to disk. </li>
+  <li>Flink’s join implementations perform very well in-memory and 
gracefully degrade when going to disk.</li>
  <li>Due to Flink’s robust memory management, there is no need for job- or 
data-specific memory tuning to avoid a nasty <code>OutOfMemoryException</code>. 
It just runs out-of-the-box.</li>
 </ul>
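
As a rough editorial illustration of the last two points in the summary above (not taken from the diffed post itself), the sketch below uses the Java DataSet API's join() together with the joinWithTiny() size hint, which steers the optimizer towards broadcasting the small input; the class name, data sets, and values are invented for the example, and on older Flink releases an explicit env.execute() call is additionally required after print().

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class JoinHintSketch {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Two tiny example inputs, keyed by an integer id in field 0.
        DataSet<Tuple2<Integer, String>> users = env.fromElements(
                new Tuple2<>(1, "Alice"), new Tuple2<>(2, "Bob"));
        DataSet<Tuple2<Integer, Double>> visits = env.fromElements(
                new Tuple2<>(1, 0.5), new Tuple2<>(2, 0.7), new Tuple2<>(1, 0.1));

        // Default join: the optimizer chooses the ship and local strategy.
        DataSet<Tuple2<Tuple2<Integer, String>, Tuple2<Integer, Double>>> joined =
                users.join(visits).where(0).equalTo(0);

        // Size hint: joinWithTiny marks the argument data set as small, which
        // favors the Broadcast-Forward ship strategy (joinWithHuge is the
        // opposite hint and favors repartitioning).
        DataSet<Tuple2<Tuple2<Integer, Double>, Tuple2<Integer, String>>> hinted =
                visits.joinWithTiny(users).where(0).equalTo(0);

        joined.print();
        hinted.print();
    }
}
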
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/05/11/Juggling-with-Bits-and-Bytes.html
----------------------------------------------------------------------
diff --git a/content/news/2015/05/11/Juggling-with-Bits-and-Bytes.html 
b/content/news/2015/05/11/Juggling-with-Bits-and-Bytes.html
index 6c442e5..c018d89 100644
--- a/content/news/2015/05/11/Juggling-with-Bits-and-Bytes.html
+++ b/content/news/2015/05/11/Juggling-with-Bits-and-Bytes.html
@@ -178,7 +178,7 @@ However, this approach has a few notable drawbacks. First 
of all it is not trivi
 <img src="/img/blog/memory-mgmt.png" style="width:90%;margin:15px" />
 </center>
 
-<p>Flink’s style of active memory management and operating on binary data 
has several benefits: </p>
+<p>Flink’s style of active memory management and operating on binary data 
has several benefits:</p>
 
 <ol>
   <li><strong>Memory-safe execution &amp; efficient out-of-core 
algorithms.</strong> Due to the fixed amount of allocated memory segments, it 
is trivial to monitor remaining memory resources. In case of memory shortage, 
processing operators can efficiently write larger batches of memory segments to 
disk and later read them back. Consequently, <code>OutOfMemoryErrors</code> are 
effectively prevented.</li>
@@ -187,13 +187,13 @@ However, this approach has a few notable drawbacks. First 
of all it is not trivi
   <li><strong>Efficient binary operations &amp; cache sensitivity.</strong> 
Binary data can be efficiently compared and operated on given a suitable binary 
representation. Furthermore, the binary representations can put related values, 
as well as hash codes, keys, and pointers, adjacently into memory. This gives 
data structures with usually more cache efficient access patterns.</li>
 </ol>
 
-<p>These properties of active memory management are very desirable in a data 
processing system for large-scale data analytics but have a significant price 
tag attached. Active memory management and operating on binary data is not 
trivial to implement, i.e., using <code>java.util.HashMap</code> is much easier 
than implementing a spillable hash-table backed by byte arrays and a custom 
serialization stack. Of course Apache Flink is not the only JVM-based data 
processing system that operates on serialized binary data. Projects such as <a 
href="http://drill.apache.org/";>Apache Drill</a>, <a 
href="http://ignite.incubator.apache.org/";>Apache Ignite (incubating)</a> or <a 
href="http://projectgeode.org/";>Apache Geode (incubating)</a> apply similar 
techniques and it was recently announced that also <a 
href="http://spark.apache.org/";>Apache Spark</a> will evolve into this 
direction with <a 
href="https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html";>
 Project Tungsten</a>. </p>
+<p>These properties of active memory management are very desirable in a data 
processing system for large-scale data analytics but have a significant price 
tag attached. Active memory management and operating on binary data is not 
trivial to implement, i.e., using <code>java.util.HashMap</code> is much easier 
than implementing a spillable hash-table backed by byte arrays and a custom 
serialization stack. Of course Apache Flink is not the only JVM-based data 
processing system that operates on serialized binary data. Projects such as <a 
href="http://drill.apache.org/";>Apache Drill</a>, <a 
href="http://ignite.incubator.apache.org/";>Apache Ignite (incubating)</a> or <a 
href="http://projectgeode.org/";>Apache Geode (incubating)</a> apply similar 
techniques and it was recently announced that also <a 
href="http://spark.apache.org/";>Apache Spark</a> will evolve into this 
direction with <a 
href="https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html";>
 Project Tungsten</a>.</p>
 
 <p>In the following we discuss in detail how Flink allocates memory, 
de/serializes objects, and operates on binary data. We will also show some 
performance numbers comparing processing objects on the heap and operating on 
binary data.</p>
 
 <h2 id="how-does-flink-allocate-memory">How does Flink allocate memory?</h2>
 
-<p>A Flink worker, called TaskManager, is composed of several internal 
components such as an actor system for coordination with the Flink master, an 
IOManager that takes care of spilling data to disk and reading it back, and a 
MemoryManager that coordinates memory usage. In the context of this blog post, 
the MemoryManager is of most interest. </p>
+<p>A Flink worker, called TaskManager, is composed of several internal 
components such as an actor system for coordination with the Flink master, an 
IOManager that takes care of spilling data to disk and reading it back, and a 
MemoryManager that coordinates memory usage. In the context of this blog post, 
the MemoryManager is of most interest.</p>
 
 <p>The MemoryManager takes care of allocating, accounting, and distributing 
MemorySegments to data processing operators such as sort and join operators. A 
<a 
href="https://github.com/apache/flink/blob/release-0.9.0-milestone-1/flink-core/src/main/java/org/apache/flink/core/memory/MemorySegment.java";>MemorySegment</a>
 is Flink’s distribution unit of memory and is backed by a regular Java byte 
array (size is 32 KB by default). A MemorySegment provides very efficient write 
and read access to its backed byte array using Java’s unsafe methods. You can 
think of a MemorySegment as a custom-tailored version of Java’s NIO 
ByteBuffer. In order to operate on multiple MemorySegments like on a larger 
chunk of consecutive memory, Flink uses logical views that implement Java’s 
<code>java.io.DataOutput</code> and <code>java.io.DataInput</code> 
interfaces.</p>
 
@@ -205,7 +205,7 @@ However, this approach has a few notable drawbacks. First 
of all it is not trivi
 
 <h2 id="how-does-flink-serialize-objects">How does Flink serialize 
objects?</h2>
 
-<p>The Java ecosystem offers several libraries to convert objects into a 
binary representation and back. Common alternatives are standard Java 
serialization, <a href="https://github.com/EsotericSoftware/kryo";>Kryo</a>, <a 
href="http://avro.apache.org/";>Apache Avro</a>, <a 
href="http://thrift.apache.org/";>Apache Thrift</a>, or Google’s <a 
href="https://github.com/google/protobuf";>Protobuf</a>. Flink includes its own 
custom serialization framework in order to control the binary representation of 
data. This is important because operating on binary data such as comparing or 
even manipulating binary data requires exact knowledge of the serialization 
layout. Further, configuring the serialization layout with respect to 
operations that are performed on binary data can yield a significant 
performance boost. Flink’s serialization stack also leverages the fact that 
the type of the objects which are going through de/serialization are exactly 
known before a program is executed. </p>
+<p>The Java ecosystem offers several libraries to convert objects into a 
binary representation and back. Common alternatives are standard Java 
serialization, <a href="https://github.com/EsotericSoftware/kryo";>Kryo</a>, <a 
href="http://avro.apache.org/";>Apache Avro</a>, <a 
href="http://thrift.apache.org/";>Apache Thrift</a>, or Google’s <a 
href="https://github.com/google/protobuf";>Protobuf</a>. Flink includes its own 
custom serialization framework in order to control the binary representation of 
data. This is important because operating on binary data such as comparing or 
even manipulating binary data requires exact knowledge of the serialization 
layout. Further, configuring the serialization layout with respect to 
operations that are performed on binary data can yield a significant 
performance boost. Flink’s serialization stack also leverages the fact that 
the type of the objects which are going through de/serialization are exactly 
known before a program is executed.</p>
 
 <p>Flink programs can process data represented as arbitrary Java or Scala 
objects. Before a program is optimized, the data types at each processing step 
of the program’s data flow need to be identified. For Java programs, Flink 
features a reflection-based type extraction component to analyze the return 
types of user-defined functions. Scala programs are analyzed with help of the 
Scala compiler. Flink represents each data type with a <a 
href="https://github.com/apache/flink/blob/release-0.9.0-milestone-1/flink-core/src/main/java/org/apache/flink/api/common/typeinfo/TypeInformation.java";>TypeInformation</a>.
 Flink has TypeInformations for several kinds of data types, including:</p>
 
@@ -215,11 +215,11 @@ However, this approach has a few notable drawbacks. First 
of all it is not trivi
  <li>WritableTypeInfo: Any implementation of Hadoop’s Writable 
interface.</li>
   <li>TupleTypeInfo: Any Flink tuple (Tuple1 to Tuple25). Flink tuples are 
Java representations for fixed-length tuples with typed fields.</li>
   <li>CaseClassTypeInfo: Any Scala CaseClass (including Scala tuples).</li>
-  <li>PojoTypeInfo: Any POJO (Java or Scala), i.e., an object with all fields 
either being public or accessible through getters and setters that follow the 
common naming conventions. </li>
+  <li>PojoTypeInfo: Any POJO (Java or Scala), i.e., an object with all fields 
either being public or accessible through getters and setters that follow the 
common naming conventions.</li>
   <li>GenericTypeInfo: Any data type that cannot be identified as another 
type.</li>
 </ul>
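
As a small editorial sketch of the reflection-based extraction and the type classification described above (not part of the diffed post): the WordWithCount POJO is invented for illustration, and the sketch assumes TypeExtractor.getForClass from flink-java.

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;

public class TypeInfoSketch {

    // Invented POJO: public fields and a public no-argument constructor,
    // so it should be classified as a Pojo type rather than a generic type.
    public static class WordWithCount {
        public String word;
        public long count;

        public WordWithCount() {}
    }

    public static void main(String[] args) {
        TypeInformation<WordWithCount> pojoInfo =
                TypeExtractor.getForClass(WordWithCount.class);
        TypeInformation<String> basicInfo =
                TypeExtractor.getForClass(String.class);

        // Printing the TypeInformation shows how Flink classified each type
        // (a Pojo type for WordWithCount, a basic type for String).
        System.out.println(pojoInfo);
        System.out.println(basicInfo);
    }
}
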
 
-<p>Each TypeInformation provides a serializer for the data type it represents. 
For example, a BasicTypeInfo returns a serializer that writes the respective 
primitive type, the serializer of a WritableTypeInfo delegates de/serialization 
to the write() and readFields() methods of the object implementing Hadoop’s 
Writable interface, and a GenericTypeInfo returns a serializer that delegates 
serialization to Kryo. Object serialization to a DataOutput which is backed by 
Flink MemorySegments goes automatically through Java’s efficient unsafe 
operations. For data types that can be used as keys, i.e., compared and hashed, 
the TypeInformation provides TypeComparators. TypeComparators compare and hash 
objects and can - depending on the concrete data type - also efficiently 
compare binary representations and extract fixed-length binary key prefixes. 
</p>
+<p>Each TypeInformation provides a serializer for the data type it represents. 
For example, a BasicTypeInfo returns a serializer that writes the respective 
primitive type, the serializer of a WritableTypeInfo delegates de/serialization 
to the write() and readFields() methods of the object implementing Hadoop’s 
Writable interface, and a GenericTypeInfo returns a serializer that delegates 
serialization to Kryo. Object serialization to a DataOutput which is backed by 
Flink MemorySegments goes automatically through Java’s efficient unsafe 
operations. For data types that can be used as keys, i.e., compared and hashed, 
the TypeInformation provides TypeComparators. TypeComparators compare and hash 
objects and can - depending on the concrete data type - also efficiently 
compare binary representations and extract fixed-length binary key prefixes.</p>
 
 <p>Tuple, Pojo, and CaseClass types are composite types, i.e., containers for 
one or more possibly nested data types. As such, their serializers and 
comparators are also composite and delegate the serialization and comparison of 
their member data types to the respective serializers and comparators. The 
following figure illustrates the serialization of a (nested) 
<code>Tuple3&lt;Integer, Double, Person&gt;</code> object where 
<code>Person</code> is a POJO and defined as follows:</p>
 
@@ -232,13 +232,13 @@ However, this approach has a few notable drawbacks. First 
of all it is not trivi
 <img src="/img/blog/data-serialization.png" style="width:80%;margin:15px" />
 </center>
 
-<p>Flink’s type system can be easily extended by providing custom 
TypeInformations, Serializers, and Comparators to improve the performance of 
serializing and comparing custom data types. </p>
+<p>Flink’s type system can be easily extended by providing custom 
TypeInformations, Serializers, and Comparators to improve the performance of 
serializing and comparing custom data types.</p>
 
 <h2 id="how-does-flink-operate-on-binary-data">How does Flink operate on 
binary data?</h2>
 
 <p>Similar to many other data processing APIs (including SQL), Flink’s APIs 
provide transformations to group, sort, and join data sets. These 
transformations operate on potentially very large data sets. Relational 
database systems feature very efficient algorithms for these purposes since 
several decades including external merge-sort, merge-join, and hybrid 
hash-join. Flink builds on this technology, but generalizes it to handle 
arbitrary objects using its custom serialization and comparison stack. In the 
following, we show how Flink operates with binary data by the example of 
Flink’s in-memory sort algorithm.</p>
 
-<p>Flink assigns a memory budget to its data processing operators. Upon 
initialization, a sort algorithm requests its memory budget from the 
MemoryManager and receives a corresponding set of MemorySegments. The set of 
MemorySegments becomes the memory pool of a so-called sort buffer which 
collects the data that is be sorted. The following figure illustrates how data 
objects are serialized into the sort buffer. </p>
+<p>Flink assigns a memory budget to its data processing operators. Upon 
initialization, a sort algorithm requests its memory budget from the 
MemoryManager and receives a corresponding set of MemorySegments. The set of 
MemorySegments becomes the memory pool of a so-called sort buffer which 
collects the data that is to be sorted. The following figure illustrates how data 
objects are serialized into the sort buffer.</p>
 
 <center>
 <img src="/img/blog/sorting-binary-data-1.png" style="width:90%;margin:15px" />
@@ -251,7 +251,7 @@ The following figure shows how two objects are compared.</p>
 <img src="/img/blog/sorting-binary-data-2.png" style="width:80%;margin:15px" />
 </center>
 
-<p>The sort buffer compares two elements by comparing their binary fix-length 
sort keys. The comparison is successful if either done on a full key (not a 
prefix key) or if the binary prefix keys are not equal. If the prefix keys are 
equal (or the sort key data type does not provide a binary prefix key), the 
sort buffer follows the pointers to the actual object data, deserializes both 
objects and compares the objects. Depending on the result of the comparison, 
the sort algorithm decides whether to swap the compared elements or not. The 
sort buffer swaps two elements by moving their fix-length keys and pointers. 
The actual data is not moved. Once the sort algorithm finishes, the pointers in 
the sort buffer are correctly ordered. The following figure shows how the 
sorted data is returned from the sort buffer. </p>
+<p>The sort buffer compares two elements by comparing their binary 
fixed-length sort keys. The comparison is conclusive if it is either done on a 
full key (not a prefix key) or if the binary prefix keys are not equal. If the 
prefix keys are equal (or the sort key data type does not provide a binary 
prefix key), the sort buffer follows the pointers to the actual object data, 
deserializes both objects, and compares them. Depending on the result of the 
comparison, the sort algorithm decides whether to swap the compared elements 
or not. The sort buffer swaps two elements by moving their fixed-length keys 
and pointers. The actual data is not moved. Once the sort algorithm finishes, 
the pointers in the sort buffer are correctly ordered. The following figure 
shows how the sorted data is returned from the sort buffer.</p>
 
 <center>
 <img src="/img/blog/sorting-binary-data-3.png" style="width:80%;margin:15px" />
@@ -269,7 +269,7 @@ The following figure shows how two objects are compared.</p>
   <li><strong>Kryo-serialized.</strong> The tuple fields are serialized into a 
sort buffer of 600 MB size using Kryo serialization and sorted without binary 
sort keys. This means that each pair-wise comparison requires two objects to be 
deserialized.</li>
 </ol>
 
-<p>All sort methods are implemented using a single thread. The reported times 
are averaged over ten runs. After each run, we call <code>System.gc()</code> to 
request a garbage collection run which does not go into measured execution 
time. The following figure shows the time to store the input data in memory, 
sort it, and read it back as objects. </p>
+<p>All sort methods are implemented using a single thread. The reported times 
are averaged over ten runs. After each run, we call <code>System.gc()</code> to 
request a garbage collection run, which is not included in the measured 
execution time. The following figure shows the time to store the input data in 
memory, sort it, and read it back as objects.</p>
 
 <center>
 <img src="/img/blog/sort-benchmark.png" style="width:90%;margin:15px" />
@@ -327,13 +327,13 @@ The following figure shows how two objects are 
compared.</p>
 
 <p><br /></p>
 
-<p>To summarize, the experiments verify the previously stated benefits of 
operating on binary data. </p>
+<p>To summarize, the experiments verify the previously stated benefits of 
operating on binary data.</p>
 
 <h2 id="were-not-done-yet">We’re not done yet!</h2>
 
-<p>Apache Flink features quite a bit of advanced techniques to safely and 
efficiently process huge amounts of data with limited memory resources. 
However, there are a few points that could make Flink even more efficient. The 
Flink community is working on moving the managed memory to off-heap memory. 
This will allow for smaller JVMs, lower garbage collection overhead, and also 
easier system configuration. With Flink’s Table API, the semantics of all 
operations such as aggregations and projections are known (in contrast to 
black-box user-defined functions). Hence we can generate code for Table API 
operations that directly operates on binary data. Further improvements include 
serialization layouts which are tailored towards the operations that are 
applied on the binary data and code generation for serializers and comparators. 
</p>
+<p>Apache Flink features quite a few advanced techniques to safely and 
efficiently process huge amounts of data with limited memory resources. 
However, there are a few points that could make Flink even more efficient. The 
Flink community is working on moving the managed memory to off-heap memory. 
This will allow for smaller JVMs, lower garbage collection overhead, and also 
easier system configuration. With Flink’s Table API, the semantics of all 
operations such as aggregations and projections are known (in contrast to 
black-box user-defined functions). Hence, we can generate code for Table API 
operations that operates directly on binary data. Further improvements include 
serialization layouts that are tailored towards the operations applied on the 
binary data, as well as code generation for serializers and comparators.</p>
 
-<p>The groundwork (and a lot more) for operating on binary data is done but 
there is still some room for making Flink even better and faster. If you are 
crazy about performance and like to juggle with lot of bits and bytes, join the 
Flink community! </p>
+<p>The groundwork (and a lot more) for operating on binary data is done, but 
there is still some room for making Flink even better and faster. If you are 
crazy about performance and like to juggle with lots of bits and bytes, join 
the Flink community!</p>
 
 <h2 id="tldr-give-me-three-things-to-remember">TL;DR; Give me three things to 
remember!</h2>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/05/14/Community-update-April.html
----------------------------------------------------------------------
diff --git a/content/news/2015/05/14/Community-update-April.html 
b/content/news/2015/05/14/Community-update-April.html
index 2e0210d..1523a02 100644
--- a/content/news/2015/05/14/Community-update-April.html
+++ b/content/news/2015/05/14/Community-update-April.html
@@ -159,7 +159,7 @@
       <article>
         <p>14 May 2015 by Kostas Tzoumas (<a 
href="https://twitter.com/kostas_tzoumas";>@kostas_tzoumas</a>)</p>
 
-<p>April was an packed month for Apache Flink. </p>
+<p>April was a packed month for Apache Flink.</p>
 
 <h2 id="flink-090-milestone1-release">Flink 0.9.0-milestone1 release</h2>
 
@@ -175,7 +175,7 @@
 
 <h2 id="flink-on-the-web">Flink on the web</h2>
 
-<p>Fabian Hueske gave an <a 
href="http://www.infoq.com/news/2015/04/hueske-apache-flink?utm_campaign=infoq_content&amp;utm_source=infoq&amp;utm_medium=feed&amp;utm_term=global";>interview
 at InfoQ</a> on Apache Flink. </p>
+<p>Fabian Hueske gave an <a 
href="http://www.infoq.com/news/2015/04/hueske-apache-flink?utm_campaign=infoq_content&amp;utm_source=infoq&amp;utm_medium=feed&amp;utm_term=global";>interview
 at InfoQ</a> on Apache Flink.</p>
 
 <h2 id="upcoming-events">Upcoming events</h2>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/08/24/introducing-flink-gelly.html
----------------------------------------------------------------------
diff --git a/content/news/2015/08/24/introducing-flink-gelly.html 
b/content/news/2015/08/24/introducing-flink-gelly.html
index 2bc2f8a..2e05a24 100644
--- a/content/news/2015/08/24/introducing-flink-gelly.html
+++ b/content/news/2015/08/24/introducing-flink-gelly.html
@@ -225,21 +225,21 @@ and mutations as well as neighborhood aggregations.</p>
 
 <h4 id="common-graph-metrics">Common Graph Metrics</h4>
 <p>These methods can be used to retrieve several graph metrics and properties, 
such as the number
-of vertices, edges and the node degrees. </p>
+of vertices and edges, and the node degrees.</p>
 
 <h4 id="transformations">Transformations</h4>
 <p>The transformation methods enable several Graph operations, using 
high-level functions similar to
 the ones provided by the batch processing API. These transformations can be 
applied one after the
-other, yielding a new Graph after each step, in a fashion similar to operators 
on DataSets: </p>
+other, yielding a new Graph after each step, in a fashion similar to operators 
on DataSets:</p>
 
 <div class="highlight"><pre><code class="language-java"><span 
class="n">inputGraph</span><span class="o">.</span><span 
class="na">getUndirected</span><span class="o">().</span><span 
class="na">mapEdges</span><span class="o">(</span><span class="k">new</span> 
<span class="nf">CustomEdgeMapper</span><span 
class="o">());</span></code></pre></div>
 
 <p>Transformations can be applied on:</p>
 
 <ol>
-  <li><strong>Vertices</strong>: <code>mapVertices</code>, 
<code>joinWithVertices</code>, <code>filterOnVertices</code>, 
<code>addVertex</code>, …  </li>
-  <li><strong>Edges</strong>: <code>mapEdges</code>, 
<code>filterOnEdges</code>, <code>removeEdge</code>, …   </li>
-  <li><strong>Triplets</strong> (source vertex, target vertex, edge): 
<code>getTriplets</code>  </li>
+  <li><strong>Vertices</strong>: <code>mapVertices</code>, 
<code>joinWithVertices</code>, <code>filterOnVertices</code>, 
<code>addVertex</code>, …</li>
+  <li><strong>Edges</strong>: <code>mapEdges</code>, 
<code>filterOnEdges</code>, <code>removeEdge</code>, …</li>
+  <li><strong>Triplets</strong> (source vertex, target vertex, edge): 
<code>getTriplets</code></li>
 </ol>
 
 <h4 id="neighborhood-aggregations">Neighborhood Aggregations</h4>
@@ -373,7 +373,7 @@ vertex values do not need to be recomputed during an 
iteration.</p>
 <p>Let us reconsider the Single Source Shortest Paths algorithm. In each 
iteration, a vertex:</p>
 
 <ol>
-  <li><strong>Gather</strong> retrieves distances from its neighbors summed up 
with the corresponding edge values; </li>
+  <li><strong>Gather</strong> retrieves distances from its neighbors summed up 
with the corresponding edge values;</li>
   <li><strong>Sum</strong> compares the newly obtained distances in order to 
extract the minimum;</li>
   <li><strong>Apply</strong> and finally adopts the minimum distance computed 
in the sum step,
 provided that it is lower than its current value. If a vertex’s value does 
not change during
@@ -432,7 +432,7 @@ plays that each song has. We then filter out the list of 
songs the users do not
 playlist. Then we compute the top songs per user (i.e. the songs a user 
listened to the most).
 Finally, as a separate use-case on the same data set, we create a user-user 
similarity graph based
 on the common songs and use this resulting graph to detect communities by 
calling Gelly’s Label Propagation
-library method. </p>
+library method.</p>
 
 <p>For running the example implementation, please use the 0.10-SNAPSHOT 
version of Flink as a
 dependency. The full example code base can be found <a 
href="https://github.com/apache/flink/blob/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/MusicProfiles.java";>here</a>.
 The public data set used for testing
@@ -522,10 +522,10 @@ in the figure below.</p>
 
 <p>To form the user-user graph in Flink, we will simply take the edges from 
the user-song graph
 (left-hand side of the image), group them by song-id, and then add all the 
users (source vertex ids)
-to an ArrayList. </p>
+to an ArrayList.</p>
 
 <p>We then match users who listened to the same song two by two, creating a 
new edge to mark their
-common interest (right-hand side of the image). </p>
+common interest (right-hand side of the image).</p>
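
<p>As a rough sketch of these two steps with the plain DataSet API (simplified to 
<code>Tuple2</code> records instead of Gelly edges, and not the actual MusicProfiles 
code), grouping the user-song edges by song id and emitting one user-user edge per 
pair of listeners could look like this:</p>

<div class="highlight"><pre><code class="language-java">import org.apache.flink.api.common.functions.GroupReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

import java.util.ArrayList;
import java.util.List;

public class UserUserEdgesSketch {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // (user, song) edges of the user-song graph, simplified to Tuple2 records
        DataSet&lt;Tuple2&lt;String, String&gt;&gt; userSongEdges = env.fromElements(
                new Tuple2&lt;&gt;("alice", "song-1"),
                new Tuple2&lt;&gt;("bob", "song-1"),
                new Tuple2&lt;&gt;("carol", "song-2"));

        // group by song id, collect the listeners of each song into a list, and
        // emit one user-user edge for every pair of users sharing a song
        DataSet&lt;Tuple2&lt;String, String&gt;&gt; userUserEdges = userSongEdges
                .groupBy(1)
                .reduceGroup(new GroupReduceFunction&lt;Tuple2&lt;String, String&gt;, Tuple2&lt;String, String&gt;&gt;() {
                    @Override
                    public void reduce(Iterable&lt;Tuple2&lt;String, String&gt;&gt; edges,
                                       Collector&lt;Tuple2&lt;String, String&gt;&gt; out) {
                        List&lt;String&gt; users = new ArrayList&lt;&gt;();
                        for (Tuple2&lt;String, String&gt; edge : edges) {
                            users.add(edge.f0);
                        }
                        for (int i = 0; i &lt; users.size(); i++) {
                            for (int j = i + 1; j &lt; users.size(); j++) {
                                out.collect(new Tuple2&lt;&gt;(users.get(i), users.get(j)));
                            }
                        }
                    }
                });

        userUserEdges.print();
    }
}
</code></pre></div>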
 
 <p>Afterwards, we perform a <code>distinct()</code> operation to avoid 
creating duplicate data.
 Considering that we now have the DataSet of edges that represent shared 
interest, creating a graph is as
@@ -564,7 +564,7 @@ formed. To do so, we first initialize each vertex with a 
numeric label using the
 the id of a vertex with the first element of the tuple, afterwards applying a 
map function.
 Finally, we call the <code>run()</code> method with the LabelPropagation 
library method passed
 as a parameter. In the end, the vertices will be updated to contain the most 
frequent label
-among their neighbors. </p>
+among their neighbors.</p>
 
 <div class="highlight"><pre><code class="language-java"><span class="c1">// 
detect user communities using label propagation</span>
 <span class="c1">// initialize each vertex with a unique numeric label</span>
@@ -594,10 +594,10 @@ among their neighbors. </p>
 <p>Currently, Gelly matches the basic functionalities provided by most 
state-of-the-art graph
 processing systems. Our vision is to turn Gelly into more than “yet another 
library for running
 PageRank-like algorithms” by supporting generic iterations, implementing 
graph partitioning,
-providing bipartite graph support and by offering numerous other features. </p>
+providing bipartite graph support, and offering numerous other features.</p>
 
 <p>We are also enriching Flink Gelly with a set of operators suitable for 
highly skewed graphs
-as well as a Graph API built on Flink Streaming. </p>
+as well as a Graph API built on Flink Streaming.</p>
 
 <p>In the near future, we would like to see how Gelly can be integrated with 
graph visualization
 tools, graph database systems and sampling techniques.</p>

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/09/16/off-heap-memory.html
----------------------------------------------------------------------
diff --git a/content/news/2015/09/16/off-heap-memory.html 
b/content/news/2015/09/16/off-heap-memory.html
index 50f4016..c1b1efc 100644
--- a/content/news/2015/09/16/off-heap-memory.html
+++ b/content/news/2015/09/16/off-heap-memory.html
@@ -205,7 +205,7 @@
 
 <h2 id="the-off-heap-memory-implementation">The off-heap Memory 
Implementation</h2>
 
-<p>Given that all memory intensive internal algorithms are already implemented 
against the <code>MemorySegment</code>, our implementation to switch to 
off-heap memory is actually trivial. You can compare it to replacing all 
<code>ByteBuffer.allocate(numBytes)</code> calls with 
<code>ByteBuffer.allocateDirect(numBytes)</code>. In Flink’s case it meant 
that we made the <code>MemorySegment</code> abstract and added the 
<code>HeapMemorySegment</code> and <code>OffHeapMemorySegment</code> 
subclasses. The <code>OffHeapMemorySegment</code> takes the off-heap memory 
pointer from a <code>java.nio.DirectByteBuffer</code> and implements its 
specialized access methods using <code>sun.misc.Unsafe</code>. We also made a 
few adjustments to the startup scripts and the deployment code to make sure 
that the JVM is permitted enough off-heap memory (direct memory, 
<em>-XX:MaxDirectMemorySize</em>). </p>
+<p>Given that all memory-intensive internal algorithms are already implemented 
against the <code>MemorySegment</code>, our switch to off-heap memory was 
actually trivial to implement. You can compare it to replacing all 
<code>ByteBuffer.allocate(numBytes)</code> calls with 
<code>ByteBuffer.allocateDirect(numBytes)</code>. In Flink’s case, it meant 
that we made the <code>MemorySegment</code> abstract and added the 
<code>HeapMemorySegment</code> and <code>OffHeapMemorySegment</code> 
subclasses. The <code>OffHeapMemorySegment</code> takes the off-heap memory 
pointer from a <code>java.nio.DirectByteBuffer</code> and implements its 
specialized access methods using <code>sun.misc.Unsafe</code>. We also made a 
few adjustments to the startup scripts and the deployment code to make sure 
that the JVM is permitted enough off-heap memory (direct memory, 
<em>-XX:MaxDirectMemorySize</em>).</p>
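
<p>As a minimal standalone illustration of the difference between heap and direct 
allocation that the two subclasses build on (plain <code>java.nio</code>, not Flink 
code):</p>

<div class="highlight"><pre><code class="language-java">import java.nio.ByteBuffer;

public class DirectMemoryExample {

    public static void main(String[] args) {
        // heap memory: backed by a byte[] inside the Java heap
        ByteBuffer heap = ByteBuffer.allocate(64 * 1024);

        // off-heap (direct) memory: allocated outside the heap and counted
        // against the limit set via -XX:MaxDirectMemorySize
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024);

        heap.putLong(0, 42L);
        offHeap.putLong(0, 42L);

        System.out.println(heap.isDirect());    // false
        System.out.println(offHeap.isDirect()); // true
    }
}
</code></pre></div>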
 
 <p>In practice we had to go one step further, to make the implementation 
perform well. While the <code>ByteBuffer</code> is used in I/O code paths to 
compose headers and move bulk memory into place, the MemorySegment is part of 
the innermost loops of many algorithms (sorting, hash tables, …). That means 
that the access methods have to be as fast as possible.</p>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/11/16/release-0.10.0.html
----------------------------------------------------------------------
diff --git a/content/news/2015/11/16/release-0.10.0.html 
b/content/news/2015/11/16/release-0.10.0.html
index 7afd6e6..1bf2ff7 100644
--- a/content/news/2015/11/16/release-0.10.0.html
+++ b/content/news/2015/11/16/release-0.10.0.html
@@ -161,7 +161,7 @@
 
 <p>The Apache Flink community is pleased to announce the availability of the 
0.10.0 release. The community put significant effort into improving and 
extending Apache Flink since the last release, focusing on data stream 
processing and operational features. About 80 contributors provided bug fixes, 
improvements, and new features such that in total more than 400 JIRA issues 
could be resolved.</p>
 
-<p>For Flink 0.10.0, the focus of the community was to graduate the DataStream 
API from beta and to evolve Apache Flink into a production-ready stream data 
processor with a competitive feature set. These efforts resulted in support for 
event-time and out-of-order streams, exactly-once guarantees in the case of 
failures, a very flexible windowing mechanism, sophisticated operator state 
management, and a highly-available cluster operation mode. Flink 0.10.0 also 
brings a new monitoring dashboard with real-time system and job monitoring 
capabilities. Both batch and streaming modes of Flink benefit from the new high 
availability and improved monitoring features. Needless to say that Flink 
0.10.0 includes many more features, improvements, and bug fixes. </p>
+<p>For Flink 0.10.0, the focus of the community was to graduate the DataStream 
API from beta and to evolve Apache Flink into a production-ready stream data 
processor with a competitive feature set. These efforts resulted in support for 
event-time and out-of-order streams, exactly-once guarantees in the case of 
failures, a very flexible windowing mechanism, sophisticated operator state 
management, and a highly available cluster operation mode. Flink 0.10.0 also 
brings a new monitoring dashboard with real-time system and job monitoring 
capabilities. Both batch and streaming modes of Flink benefit from the new high 
availability and improved monitoring features. Needless to say, Flink 0.10.0 
includes many more features, improvements, and bug fixes.</p>
 
 <p>We encourage everyone to <a href="/downloads.html">download the release</a> 
and <a 
href="https://ci.apache.org/projects/flink/flink-docs-release-0.10/";>check out 
the documentation</a>. Feedback through the Flink <a 
href="/community.html#mailing-lists">mailing lists</a> is, as always, very 
welcome!</p>
 

http://git-wip-us.apache.org/repos/asf/flink-web/blob/b68e1f48/content/news/2015/12/04/Introducing-windows.html
----------------------------------------------------------------------
diff --git a/content/news/2015/12/04/Introducing-windows.html 
b/content/news/2015/12/04/Introducing-windows.html
index 23536b4..d246e4a 100644
--- a/content/news/2015/12/04/Introducing-windows.html
+++ b/content/news/2015/12/04/Introducing-windows.html
@@ -159,7 +159,7 @@
       <article>
         <p>04 Dec 2015 by Fabian Hueske (<a 
href="https://twitter.com/fhueske";>@fhueske</a>)</p>
 
-<p>The data analysis space is witnessing an evolution from batch to stream 
processing for many use cases. Although batch can be handled as a special case 
of stream processing, analyzing never-ending streaming data often requires a 
shift in the mindset and comes with its own terminology (for example, 
“windowing” and “at-least-once”/”exactly-once” processing). This 
shift and the new terminology can be quite confusing for people being new to 
the space of stream processing. Apache Flink is a production-ready stream 
processor with an easy-to-use yet very expressive API to define advanced stream 
analysis programs. Flink’s API features very flexible window definitions on 
data streams which let it stand out among other open source stream processors. 
</p>
+<p>The data analysis space is witnessing an evolution from batch to stream 
processing for many use cases. Although batch can be handled as a special case 
of stream processing, analyzing never-ending streaming data often requires a 
shift in mindset and comes with its own terminology (for example, 
“windowing” and “at-least-once”/”exactly-once” processing). This 
shift and the new terminology can be quite confusing for people who are new to 
the space of stream processing. Apache Flink is a production-ready stream 
processor with an easy-to-use yet very expressive API to define advanced stream 
analysis programs. Flink’s API features very flexible window definitions on 
data streams which let it stand out among other open source stream 
processors.</p>
 
 <p>In this blog post, we discuss the concept of windows for stream processing, 
present Flink’s built-in windows, and explain its support for custom 
windowing semantics.</p>
 
@@ -222,17 +222,17 @@
 
 <p>There is one aspect that we haven’t discussed yet, namely the exact 
meaning of “<em>collects elements for one minute</em>” which boils down to 
the question, “<em>How does the stream processor interpret time?</em>”.</p>
 
-<p>Apache Flink features three different notions of time, namely 
<em>processing time</em>, <em>event time</em>, and <em>ingestion time</em>. </p>
+<p>Apache Flink features three different notions of time, namely 
<em>processing time</em>, <em>event time</em>, and <em>ingestion time</em>.</p>
 
 <ol>
-  <li>In <strong>processing time</strong>, windows are defined with respect to 
the wall clock of the machine that builds and processes a window, i.e., a one 
minute processing time window collects elements for exactly one minute. </li>
-  <li>In <strong>event time</strong>, windows are defined with respect to 
timestamps that are attached to each event record. This is common for many 
types of events, such as log entries, sensor data, etc, where the timestamp 
usually represents the time at which the event occurred. Event time has several 
benefits over processing time. First of all, it decouples the program semantics 
from the actual serving speed of the source and the processing performance of 
system. Hence you can process historic data, which is served at maximum speed, 
and continuously produced data with the same program. It also prevents 
semantically incorrect results in case of backpressure or delays due to failure 
recovery. Second, event time windows compute correct results, even if events 
arrive out-of-order of their timestamp which is common if a data stream gathers 
events from distributed sources. </li>
+  <li>In <strong>processing time</strong>, windows are defined with respect to 
the wall clock of the machine that builds and processes a window, i.e., a one 
minute processing time window collects elements for exactly one minute.</li>
+  <li>In <strong>event time</strong>, windows are defined with respect to 
timestamps that are attached to each event record. This is common for many 
types of events, such as log entries, sensor data, etc., where the timestamp 
usually represents the time at which the event occurred. Event time has several 
benefits over processing time. First of all, it decouples the program semantics 
from the actual serving speed of the source and the processing performance of 
the system. Hence, you can process historic data, which is served at maximum 
speed, and continuously produced data with the same program. It also prevents 
semantically incorrect results in case of backpressure or delays due to failure 
recovery. Second, event time windows compute correct results even if events 
arrive out of order with respect to their timestamps, which is common if a data 
stream gathers events from distributed sources.</li>
   <li><strong>Ingestion time</strong> is a hybrid of processing and event 
time. It assigns wall clock timestamps to records as soon as they arrive in the 
system (at the source) and continues processing with event time semantics based 
on the attached timestamps.</li>
 </ol>
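
<p>For reference, selecting one of these notions in the DataStream API is a 
one-liner. The sketch below uses the class names of later Flink releases; the 
exact API in 0.10 differs slightly.</p>

<div class="highlight"><pre><code class="language-java">import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TimeCharacteristicExample {

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // interpret time based on the timestamps attached to the events
        // (alternatives: TimeCharacteristic.ProcessingTime and IngestionTime);
        // event time additionally requires assigning timestamps and watermarks
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    }
}
</code></pre></div>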
 
 <h2 id="count-windows">Count Windows</h2>
 
-<p>Apache Flink also features count windows. A tumbling count window of 100 
will collect 100 events in a window and evaluate the window when the 100th 
element has been added. </p>
+<p>Apache Flink also features count windows. A tumbling count window of 100 
will collect 100 events in a window and evaluate the window when the 100th 
element has been added.</p>
 
 <p>In Flink’s DataStream API, tumbling and sliding count windows are defined 
as follows:</p>
 
@@ -255,7 +255,7 @@
 
 <h2 id="dissecting-flinks-windowing-mechanics">Dissecting Flink’s windowing 
mechanics</h2>
 
-<p>Flink’s built-in time and count windows cover a wide range of common 
window use cases. However, there are of course applications that require custom 
windowing logic that cannot be addressed by Flink’s built-in windows. In 
order to support also applications that need very specific windowing semantics, 
the DataStream API exposes interfaces for the internals of its windowing 
mechanics. These interfaces give very fine-grained control about the way that 
windows are built and evaluated. </p>
+<p>Flink’s built-in time and count windows cover a wide range of common 
window use cases. However, there are of course applications that require custom 
windowing logic that cannot be addressed by Flink’s built-in windows. In 
order to also support applications that need very specific windowing semantics, 
the DataStream API exposes interfaces for the internals of its windowing 
mechanics. These interfaces give very fine-grained control over the way that 
windows are built and evaluated.</p>
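
<p>As a hedged sketch of what that fine-grained control looks like: a window 
definition can combine an explicit assigner with a custom trigger and evictor. 
The class names below follow later Flink releases and are illustrative, not 
taken from this post.</p>

<div class="highlight"><pre><code class="language-java">import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.evictors.CountEvictor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.CountTrigger;

public class CustomWindowingSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream&lt;Tuple2&lt;String, Integer&gt;&gt; input = env.fromElements(
                new Tuple2&lt;&gt;("sensor-1", 5),
                new Tuple2&lt;&gt;("sensor-1", 7));

        // one-minute processing-time windows, but fired every 2 elements and
        // keeping at most the last 2 elements of the window for evaluation
        input.keyBy(0)
             .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
             .trigger(CountTrigger.of(2))
             .evictor(CountEvictor.of(2))
             .sum(1)
             .print();

        env.execute("custom windowing sketch");
    }
}
</code></pre></div>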
 
 <p>The following figure depicts Flink’s windowing mechanism and introduces 
the components involved.</p>
 
