Repository: flink
Updated Branches:
  refs/heads/release-1.2 33c5df6dd -> f4869a66d


[hotfix] [docs] Move 'dev/state_backends' to 'ops/state_backends'


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/daad28ab
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/daad28ab
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/daad28ab

Branch: refs/heads/release-1.2
Commit: daad28ab5431ba1ab280a2024b5d28b70b0713ee
Parents: 33c5df6
Author: Stephan Ewen <se...@apache.org>
Authored: Mon Jan 9 19:39:44 2017 +0100
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Jan 16 11:52:41 2017 +0100

----------------------------------------------------------------------
 docs/dev/state.md                |   4 +-
 docs/dev/state_backends.md       | 148 ----------------------------------
 docs/ops/README.md               |  21 +++++
 docs/ops/state_backends.md       | 148 ++++++++++++++++++++++++++++++++++
 docs/redirects/state_backends.md |   2 +-
 docs/setup/aws.md                |   4 +-
 docs/setup/savepoints.md         |   2 +-
 7 files changed, 175 insertions(+), 154 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/dev/state.md
----------------------------------------------------------------------
diff --git a/docs/dev/state.md b/docs/dev/state.md
index 73a4ceb..4478bfc 100644
--- a/docs/dev/state.md
+++ b/docs/dev/state.md
@@ -40,7 +40,7 @@ Flink's state interface.
 By default state checkpoints will be stored in-memory at the JobManager. For proper persistence of large
 state, Flink supports storing the checkpoints on file systems (HDFS, S3, or any mounted POSIX file system),
 which can be configured in the `flink-conf.yaml` or via `StreamExecutionEnvironment.setStateBackend(…)`.
-See [state backends]({{ site.baseurl }}/dev/state_backends.html) for information
+See [state backends]({{ site.baseurl }}/ops/state_backends.html) for information
 about the available state backends and how to configure them.
 
 * ToC
@@ -52,7 +52,7 @@ Enabling Checkpointing
 Flink has a checkpointing mechanism that recovers streaming jobs after failures. The checkpointing mechanism requires a *persistent* (or *durable*) source that
 can be asked for prior records again (Apache Kafka is a good example of such a source).
 
-The checkpointing mechanism stores the progress in the data sources and data sinks, the state of windows, as well as the user-defined state (see [Working with State]({{ site.baseurl }}/dev/state.html)) consistently to provide *exactly once* processing semantics. Where the checkpoints are stored (e.g., JobManager memory, file system, database) depends on the configured [state backend]({{ site.baseurl }}/dev/state_backends.html).
+The checkpointing mechanism stores the progress in the data sources and data sinks, the state of windows, as well as the user-defined state (see [Working with State]({{ site.baseurl }}/dev/state.html)) consistently to provide *exactly once* processing semantics. Where the checkpoints are stored (e.g., JobManager memory, file system, database) depends on the configured [state backend]({{ site.baseurl }}/ops/state_backends.html).
 
 The [docs on streaming fault tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html) describe in detail the technique behind Flink's streaming fault tolerance mechanism.
 

http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/dev/state_backends.md
----------------------------------------------------------------------
diff --git a/docs/dev/state_backends.md b/docs/dev/state_backends.md
deleted file mode 100644
index af9934d..0000000
--- a/docs/dev/state_backends.md
+++ /dev/null
@@ -1,148 +0,0 @@
----
-title: "State Backends"
-nav-parent_id: setup
-nav-pos: 11
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-Programs written in the [Data Stream API]({{ site.baseurl }}/dev/datastream_api.html) often hold state in various forms:
-
-- Windows gather elements or aggregates until they are triggered
-- Transformation functions may use the key/value state interface to store values
-- Transformation functions may implement the `Checkpointed` interface to make their local variables fault tolerant
-
-See also [Working with State]({{ site.baseurl }}/dev/state.html) in the streaming API guide.
-
-When checkpointing is activated, such state is persisted upon checkpoints to guard against data loss and recover consistently.
-How the state is represented internally, and how and where it is persisted upon checkpoints depends on the
-chosen **State Backend**.
-
-* ToC
-{:toc}
-
-## Available State Backends
-
-Out of the box, Flink bundles these state backends:
-
- - *MemoryStateBackend*
- - *FsStateBackend*
- - *RocksDBStateBackend*
-
-If nothing else is configured, the system will use the MemoryStateBackend.
-
-
-### The MemoryStateBackend
-
-The *MemoryStateBackend* holds data internally as objects on the Java heap. Key/value state and window operators hold hash tables
-that store the values, triggers, etc.
-
-Upon checkpoints, this state backend will snapshot the state and send it as part of the checkpoint acknowledgement messages to the
-JobManager (master), which stores it on its heap as well.
-
-Limitations of the MemoryStateBackend:
-
-  - The size of each individual state is by default limited to 5 MB. This value can be increased in the constructor of the MemoryStateBackend.
-  - Irrespective of the configured maximum state size, the state cannot be larger than the Akka frame size (see [Configuration]({{ site.baseurl }}/setup/config.html)).
-  - The aggregate state must fit into the JobManager memory.
-
-The MemoryStateBackend is encouraged for:
-
-  - Local development and debugging
-  - Jobs that hold little state, such as jobs that consist only of record-at-a-time functions (Map, FlatMap, Filter, ...). The Kafka Consumer requires very little state.
-
-
-### The FsStateBackend
-
-The *FsStateBackend* is configured with a file system URL (type, address, path), such as "hdfs://namenode:40010/flink/checkpoints" or "file:///data/flink/checkpoints".
-
-The FsStateBackend holds in-flight data in the TaskManager's memory. Upon checkpointing, it writes state snapshots into files in the configured file system and directory. Minimal metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
-
-The FsStateBackend is encouraged for:
-
-  - Jobs with large state, long windows, large key/value states.
-  - All high-availability setups.
-
-### The RocksDBStateBackend
-
-The *RocksDBStateBackend* is configured with a file system URL (type, address, path), such as "hdfs://namenode:40010/flink/checkpoints" or "file:///data/flink/checkpoints".
-
-The RocksDBStateBackend holds in-flight data in a [RocksDB](http://rocksdb.org) database
-that is (by default) stored in the TaskManager data directories. Upon checkpointing, the whole
-RocksDB database will be checkpointed into the configured file system and directory. Minimal
-metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
-
-The RocksDBStateBackend is encouraged for:
-
-  - Jobs with very large state, long windows, large key/value states.
-  - All high-availability setups.
-
-Note that the amount of state that you can keep is only limited by the amount of disk space available.
-This allows keeping very large state, compared to the FsStateBackend that keeps state in memory.
-This also means, however, that the maximum throughput that can be achieved will be lower with
-this state backend.
-
-## Configuring a State Backend
-
-State backends can be configured per job. In addition, you can define a default state backend to be used when the
-job does not explicitly define a state backend.
-
-
-### Setting the Per-job State Backend
-
-The per-job state backend is set on the `StreamExecutionEnvironment` of the job, as shown in the example below:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
-env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"));
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = StreamExecutionEnvironment.getExecutionEnvironment()
-env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"))
-{% endhighlight %}
-</div>
-</div>
-
-
-### Setting Default State Backend
-
-A default state backend can be configured in the `flink-conf.yaml`, using the configuration key `state.backend`.
-
-Possible values for the config entry are *jobmanager* (MemoryStateBackend), *filesystem* (FsStateBackend), or the fully qualified class
-name of the class that implements the state backend factory [FsStateBackendFactory](https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsStateBackendFactory.java),
-such as `org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory` for RocksDBStateBackend.
-
-In the case where the default state backend is set to *filesystem*, the entry `state.backend.fs.checkpointdir` defines the directory where the checkpoint data will be stored.
-
-A sample section in the configuration file could look as follows:
-
-~~~
-# The backend that will be used to store operator state checkpoints
-
-state.backend: filesystem
-
-
-# Directory for storing checkpoints
-
-state.backend.fs.checkpointdir: hdfs://namenode:40010/flink/checkpoints
-~~~

http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/ops/README.md
----------------------------------------------------------------------
diff --git a/docs/ops/README.md b/docs/ops/README.md
new file mode 100644
index 0000000..5fe6568
--- /dev/null
+++ b/docs/ops/README.md
@@ -0,0 +1,21 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This folder contains the documentation in the category
+**Deployment & Operations**.

http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/ops/state_backends.md
----------------------------------------------------------------------
diff --git a/docs/ops/state_backends.md b/docs/ops/state_backends.md
new file mode 100644
index 0000000..af9934d
--- /dev/null
+++ b/docs/ops/state_backends.md
@@ -0,0 +1,148 @@
+---
+title: "State Backends"
+nav-parent_id: setup
+nav-pos: 11
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Programs written in the [Data Stream API]({{ site.baseurl }}/dev/datastream_api.html) often hold state in various forms:
+
+- Windows gather elements or aggregates until they are triggered
+- Transformation functions may use the key/value state interface to store values
+- Transformation functions may implement the `Checkpointed` interface to make their local variables fault tolerant
+
+See also [Working with State]({{ site.baseurl }}/dev/state.html) in the streaming API guide.
+
+When checkpointing is activated, such state is persisted upon checkpoints to guard against data loss and recover consistently.
+How the state is represented internally, and how and where it is persisted upon checkpoints depends on the
+chosen **State Backend**.
+
+* ToC
+{:toc}
+
+## Available State Backends
+
+Out of the box, Flink bundles these state backends:
+
+ - *MemoryStateBackend*
+ - *FsStateBackend*
+ - *RocksDBStateBackend*
+
+If nothing else is configured, the system will use the MemoryStateBackend.
+
+
+### The MemoryStateBackend
+
+The *MemoryStateBackend* holds data internally as objects on the Java heap. Key/value state and window operators hold hash tables
+that store the values, triggers, etc.
+
+Upon checkpoints, this state backend will snapshot the state and send it as part of the checkpoint acknowledgement messages to the
+JobManager (master), which stores it on its heap as well.
+
+Limitations of the MemoryStateBackend:
+
+  - The size of each individual state is by default limited to 5 MB. This value can be increased in the constructor of the MemoryStateBackend.
+  - Irrespective of the configured maximum state size, the state cannot be larger than the Akka frame size (see [Configuration]({{ site.baseurl }}/setup/config.html)).
+  - The aggregate state must fit into the JobManager memory.
+
+The MemoryStateBackend is encouraged for:
+
+  - Local development and debugging
+  - Jobs that hold little state, such as jobs that consist only of record-at-a-time functions (Map, FlatMap, Filter, ...). The Kafka Consumer requires very little state.
+
+
+### The FsStateBackend
+
+The *FsStateBackend* is configured with a file system URL (type, address, path), such as "hdfs://namenode:40010/flink/checkpoints" or "file:///data/flink/checkpoints".
+
+The FsStateBackend holds in-flight data in the TaskManager's memory. Upon checkpointing, it writes state snapshots into files in the configured file system and directory. Minimal metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
+
+The FsStateBackend is encouraged for:
+
+  - Jobs with large state, long windows, large key/value states.
+  - All high-availability setups.
+
+### The RocksDBStateBackend
+
+The *RocksDBStateBackend* is configured with a file system URL (type, address, path), such as "hdfs://namenode:40010/flink/checkpoints" or "file:///data/flink/checkpoints".
+
+The RocksDBStateBackend holds in-flight data in a [RocksDB](http://rocksdb.org) database
+that is (by default) stored in the TaskManager data directories. Upon checkpointing, the whole
+RocksDB database will be checkpointed into the configured file system and directory. Minimal
+metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
+
+The RocksDBStateBackend is encouraged for:
+
+  - Jobs with very large state, long windows, large key/value states.
+  - All high-availability setups.
+
+Note that the amount of state that you can keep is only limited by the amount of disk space available.
+This allows keeping very large state, compared to the FsStateBackend that keeps state in memory.
+This also means, however, that the maximum throughput that can be achieved will be lower with
+this state backend.
+
+## Configuring a State Backend
+
+State backends can be configured per job. In addition, you can define a default state backend to be used when the
+job does not explicitly define a state backend.
+
+
+### Setting the Per-job State Backend
+
+The per-job state backend is set on the `StreamExecutionEnvironment` of the job, as shown in the example below:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"));
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val env = StreamExecutionEnvironment.getExecutionEnvironment()
+env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"))
+{% endhighlight %}
+</div>
+</div>
+
+
+### Setting Default State Backend
+
+A default state backend can be configured in the `flink-conf.yaml`, using the configuration key `state.backend`.
+
+Possible values for the config entry are *jobmanager* (MemoryStateBackend), *filesystem* (FsStateBackend), or the fully qualified class
+name of the class that implements the state backend factory [FsStateBackendFactory](https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsStateBackendFactory.java),
+such as `org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory` for RocksDBStateBackend.
+
+In the case where the default state backend is set to *filesystem*, the entry `state.backend.fs.checkpointdir` defines the directory where the checkpoint data will be stored.
+
+A sample section in the configuration file could look as follows:
+
+~~~
+# The backend that will be used to store operator state checkpoints
+
+state.backend: filesystem
+
+
+# Directory for storing checkpoints
+
+state.backend.fs.checkpointdir: hdfs://namenode:40010/flink/checkpoints
+~~~
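
The sample above configures *filesystem* as the default backend; the page also notes that `state.backend` accepts the fully qualified name of a state backend factory class. A hedged sketch of making RocksDB the default instead (this assumes the RocksDB state backend jar is on the classpath, and that the factory reads the same checkpoint directory key as the filesystem backend):

~~~
# Use the RocksDB state backend by default, via the factory class named in the text above
state.backend: org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory

# Directory for storing checkpoints
state.backend.fs.checkpointdir: hdfs://namenode:40010/flink/checkpoints
~~~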

http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/redirects/state_backends.md
----------------------------------------------------------------------
diff --git a/docs/redirects/state_backends.md b/docs/redirects/state_backends.md
index f575111..3a21aaa 100644
--- a/docs/redirects/state_backends.md
+++ b/docs/redirects/state_backends.md
@@ -1,7 +1,7 @@
 ---
 title: "State Backends"
 layout: redirect
-redirect: /dev/state_backends.html
+redirect: /ops/state_backends.html
 permalink: /apis/streaming/state_backends.html
 ---
 <!--

http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/setup/aws.md
----------------------------------------------------------------------
diff --git a/docs/setup/aws.md b/docs/setup/aws.md
index 83b5d97..d165955 100644
--- a/docs/setup/aws.md
+++ b/docs/setup/aws.md
@@ -57,7 +57,7 @@ HADOOP_CONF_DIR=/etc/hadoop/conf bin/flink run -m yarn-cluster examples/streamin
 
 ## S3: Simple Storage Service
 
-[Amazon Simple Storage Service](http://aws.amazon.com/s3/) (Amazon S3) provides cloud object storage for a variety of use cases. You can use S3 with Flink for **reading** and **writing data** as well as in conjunction with the [streaming **state backends**]({{ site.baseurl}}/dev/state_backends.html).
+[Amazon Simple Storage Service](http://aws.amazon.com/s3/) (Amazon S3) provides cloud object storage for a variety of use cases. You can use S3 with Flink for **reading** and **writing data** as well as in conjunction with the [streaming **state backends**]({{ site.baseurl}}/ops/state_backends.html).
 
 You can use S3 objects like regular files by specifying paths in the following format:
 
@@ -78,7 +78,7 @@ stream.writeAsText("s3://<bucket>/<endpoint>");
 env.setStateBackend(new FsStateBackend("s3://<your-bucket>/<endpoint>"));
 ```
 
-Note that these examples are *not* exhaustive and you can use S3 in other places as well, including your [high availability setup]({{ site.baseurl }}/setup/jobmanager_high_availability.html) or the [RocksDBStateBackend]({{ site.baseurl }}/dev/state_backends.html#the-rocksdbstatebackend); everywhere that Flink expects a FileSystem URI.
+Note that these examples are *not* exhaustive and you can use S3 in other places as well, including your [high availability setup]({{ site.baseurl }}/setup/jobmanager_high_availability.html) or the [RocksDBStateBackend]({{ site.baseurl }}/ops/state_backends.html#the-rocksdbstatebackend); everywhere that Flink expects a FileSystem URI.
 
 ### Set S3 FileSystem
 

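The aws.md hunk above sets an S3 checkpoint path on the FsStateBackend programmatically; the equivalent default-backend entries in `flink-conf.yaml` could look roughly as follows (a sketch: the bucket/endpoint placeholders mirror the ones above, and this assumes the S3 file system is configured as the "Set S3 FileSystem" section describes):

~~~
state.backend: filesystem
state.backend.fs.checkpointdir: s3://<your-bucket>/<endpoint>
~~~
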
http://git-wip-us.apache.org/repos/asf/flink/blob/daad28ab/docs/setup/savepoints.md
----------------------------------------------------------------------
diff --git a/docs/setup/savepoints.md b/docs/setup/savepoints.md
index 2866635..2a1f631 100644
--- a/docs/setup/savepoints.md
+++ b/docs/setup/savepoints.md
@@ -22,7 +22,7 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Programs written in the [Data Stream API](index.html) can resume execution from a **savepoint**. Savepoints allow you to update both your programs and your Flink cluster without losing any state. This page covers all steps to trigger, restore, and dispose of savepoints. For more details on how Flink handles state and failures, check out the [State in Streaming Programs]({{ site.baseurl }}/dev/state_backends.html) and [Fault Tolerance](fault_tolerance.html) pages.
+Programs written in the [Data Stream API](index.html) can resume execution from a **savepoint**. Savepoints allow you to update both your programs and your Flink cluster without losing any state. This page covers all steps to trigger, restore, and dispose of savepoints. For more details on how Flink handles state and failures, check out the [State in Streaming Programs]({{ site.baseurl }}/ops/state_backends.html) and [Fault Tolerance](fault_tolerance.html) pages.
 
 * toc
 {:toc}
