[flink-web] 01/03: Rebuild website

2019-09-13 Thread fhueske
This is an automated email from the ASF dual-hosted git repository.

fhueske pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit 36c49b45c70261cdd53752f625641d1f540ca1f0
Author: Fabian Hueske 
AuthorDate: Fri Sep 13 09:48:30 2019 +0200

Rebuild website
---
 content/blog/feed.xml     | 103 +++---
 content/downloads.html    |   2 +-
 content/zh/downloads.html |   2 +-
 3 files changed, 99 insertions(+), 8 deletions(-)

diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index a3f9991..2113d6f 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -7,6 +7,97 @@
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" />
 
 
Apache Flink 1.8.2 Released

The Apache Flink community released the second bugfix version of the Apache Flink 1.8 series.

This release includes 23 fixes and minor improvements for Flink 1.8.1. The list below details all fixes and improvements.

We highly recommend that all users upgrade to Flink 1.8.2.

Updated Maven dependencies:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-java</artifactId>
  <version>1.8.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.8.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.11</artifactId>
  <version>1.8.2</version>
</dependency>

You can find the binaries on the updated Downloads page.

List of resolved issues:

Bug
  • [FLINK-13941] - Prevent data-loss by not cleaning up small part files from S3.
  • [FLINK-9526] - BucketingSink end-to-end test failed on Travis
  • [FLINK-10368] - 'Kerberized YARN on Docker test' unstable
  • [FLINK-12319] - StackOverFlowError in cep.nfa.sharedbuffer.SharedBuffer
  • [FLINK-12736] - ResourceManager may release TM with allocated slots
  • [FLINK-12889] - Job keeps in FAILING state
  • [FLINK-13059] - Cassandra Connector leaks Semaphore on Exception; hangs on close
  • [FLINK-13159] - java.lang.ClassNotFoundException when restore job
  • [FLINK-13367] - Make ClosureCleaner detect writeReplace serialization override
  • [FLINK-13369] - Recursive closure cleaner ends up with stackOverflow in case of circular dependency
  • [FLINK-13394] - Use fallback unsafe secure MapR in nightly.sh
  • [FLINK-13484] - ConnectedComponents end-to-end test instable with NoResourceAvailableException
  • [FLINK-13499] - Remove dependency on MapR artifact repository
  • [FLINK-13508] - CommonTestUtils#waitUntilCondition() may attempt to sleep with negative time
  • [FLINK-13586] - Method ClosureCleaner.clean broke backward compatibility between 1.8.0 and 1.8.1

[flink-web] 01/03: Rebuild website

2019-12-05 Thread chesnay
This is an automated email from the ASF dual-hosted git repository.

chesnay pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit 80ae871143ff8ad817d564ad8dc2f5433771bd3d
Author: Dian Fu 
AuthorDate: Thu Dec 5 10:10:25 2019 +0800

Rebuild website
---
 content/blog/feed.xml | 419 ++
 1 file changed, 419 insertions(+)

diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 1410119..be37a00 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -7,6 +7,425 @@
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" />
 
 
How to query Pulsar Streams using Apache Flink

In a previous story on the Flink blog, we explained the different ways that Apache Flink and Apache Pulsar can integrate to provide elastic data processing at large scale. This blog post discusses the new developments and integrations between the two frameworks [...]

A short intro to Apache Pulsar

Apache Pulsar is a flexible pub/sub messaging system, backed by durable log storage. Some of the framework’s highlights include multi-tenancy, a unified message model, structured event streams and a cloud-native architecture that make it a perfect fit for a wide set of use cases, ranging from billing, payments and trading services all the way to the unification of the different messaging architectures in an organization. If you are interested in finding out more about Pulsar, you [...]

Existing Pulsar & Flink integration (Apache Flink 1.6+)

The existing integration between Pulsar and Flink exploits Pulsar as a message queue in a Flink application. Flink developers can utilize Pulsar as a streaming source and streaming sink for their Flink applications by selecting a specific Pulsar source and connecting to their desired Pulsar cluster and topic:

// create and configure Pulsar consumer
PulsarSourceBuilder<String> builder = PulsarSourceBuilder
  .builder(new SimpleStringSchema())
  .serviceUrl(serviceUrl)
  .topic(inputTopic)
  .subscriptionName(subscription);
SourceFunction<String> src = builder.build();

// ingest DataStream with Pulsar consumer
// ('env' is the surrounding StreamExecutionEnvironment)
DataStream<String> words = env.addSource(src);

Pulsar streams can then be connected to the Flink processing logic…

// perform computation on DataStream (here a simple WordCount)
DataStream<WordWithCount> wc = words
  .flatMap((FlatMapFunction<String, WordWithCount>) (word, collector) ->
      collector.collect(new WordWithCount(word, 1)))
  .returns(WordWithCount.class);
[flink-web] 01/03: Rebuild website.

This is an automated email from the ASF dual-hosted git repository.

nkruber pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit d7dbb09d30afc2605dfc087207af7d46d8e0b8b8
Author: Nico Kruber 
AuthorDate: Tue Jun 4 09:25:00 2019 +0200

Rebuild website.
---
 content/blog/feed.xml | 131 ++
 1 file changed, 131 insertions(+)

diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 59b7950..31828ce 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -7,6 +7,137 @@
<atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" />
 
 
State TTL in Flink 1.8.0: How to Automatically Cleanup Application State in Apache Flink

A common requirement for many stateful streaming applications is to automatically clean up application state for effective management of your state size, or to control how long the application state can be accessed (e.g. due to legal regulations like the GDPR). The state time-to-live (TTL) feature was initiated in Flink 1.6.0 and enabled application state cleanup and efficient state size management in Apache Flink.
In this post, we motivate the State TTL feature and discuss its use cases. Moreover, we show how to use and configure it. We explain how Flink internally manages state with TTL and present some exciting additions to the feature in Flink 1.8.0. The blog post concludes with an outlook on future improvements and extensions.

The Transient Nature of State

There are two major reasons why state should be maintained only for a limited time. For example, let’s imagine a Flink application that ingests a stream of user login events and stores for each user the time of the last login to improve the experience of frequent visitors.

  • Controlling the size of state. Being able to efficiently manage an ever-growing state size is a primary use case for state TTL. Oftentimes, data needs to be persisted temporarily while there is some user activity around it, e.g. web sessions. When the activity ends there is no longer interest in that data while it still occupies storage. Flink 1.8.0 introduces background cleanup of old state based on TTL that makes the eviction of no-longer-necessary data frictionless. Previously, the application developer had to take [...]

  • Complying with data protection and sensitive data requirements. Recent developments around data privacy regulations, such as the General Data Protection Regulation (GDPR) introduced by the European Union, make compliance with such data requirements or treating sensitive data a top priority for many use cases and applications. An example of such use cases includes applications that require keeping data for a specific timeframe and preventing access to it thereafter. This is a common challenge for companies providing short-term services to their custom [...]

Both requirements can be addressed by a feature that periodically, yet continuously, removes the state for a key once it becomes unnecessary or unimportant and there is no requirement to keep it in storage any more.

State TTL for continuous cleanup of application state

The 1.6.0 release of Apache Flink introduced the State TTL feature. It enabled developers of stream processing applications to configure the state of operators to expire and be cleaned up after a defined timeout (time-to-live). In Flink 1.8.0 the feature was extended, including continuous cleanup of old entries for both the RocksDB and the heap state backends (FSStateBackend and MemoryStateBackend), enabling a continuous cleanup process of old entries (according to the TTL settings [...]

In Flink’s DataStream API, application state is defined by a state descriptor. State TTL is configured by passing a StateTtlConfig object to a state descriptor. The following Java example shows how to create a state TTL configuration and provide it to the state descriptor that holds the last login time [...]

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.api.common.state.ValueStateDescriptor;

StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.days(7))  // retain the last login for 7 days (illustrative TTL)
    .build();

ValueStateDescriptor<Long> lastUserLogin =
    new ValueStateDescriptor<>("lastUserLogin", Long.class);
lastUserLogin.enableTimeToLive(ttlConfig);
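
The continuous cleanup additions mentioned above can be enabled on the same builder. A minimal sketch, assuming the Flink 1.8 StateTtlConfig builder API (incremental cleanup for the heap backends and compaction-filter-based cleanup for RocksDB; the latter additionally required the state.backend.rocksdb.ttl.compaction.filter.enabled flag in flink-conf.yaml at the time):

StateTtlConfig ttlConfigWithCleanup = StateTtlConfig
    .newBuilder(Time.days(7))
    // heap backends: lazily clean up to 10 expired entries per state access
    .cleanupIncrementally(10, false)
    // RocksDB backend: drop expired entries while SST files are compacted
    .cleanupInRocksdbCompactFilter()
    .build();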