Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site 2473d849a -> a94ad4021

BEAM-845 Update navigation and runner capability matrix to include Apex.

Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/c54a9dfb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/c54a9dfb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/c54a9dfb

Branch: refs/heads/asf-site
Commit: c54a9dfb6abcf864528f174b73e5c574b130b5f0
Parents: 2473d84
Author: Thomas Weise <t...@apache.org>
Authored: Sun Nov 6 11:48:47 2016 -0800
Committer: Davor Bonaci <da...@google.com>
Committed: Mon Nov 7 10:19:41 2016 -0800

----------------------------------------------------------------------
 src/_data/capability-matrix.yml                | 116 +++++++++++++++++++-
 src/_includes/header.html                      |   1 +
 src/documentation/index.md                     |   2 +-
 src/documentation/runners/apex.md              |   9 ++
 src/documentation/runners/capability-matrix.md |   2 +-
 5 files changed, 122 insertions(+), 8 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/_data/capability-matrix.yml
----------------------------------------------------------------------
diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml
index c61b68b..375fdf4 100644
--- a/src/_data/capability-matrix.yml
+++ b/src/_data/capability-matrix.yml
@@ -7,6 +7,8 @@ columns:
     name: Apache Flink
   - class: spark
     name: Apache Spark
+  - class: apex
+    name: Apache Apex (on feature branch)
 
 categories:
   - description: What is being computed?
@@ -34,6 +36,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ParDo applies per-element transformations as Spark FlatMapFunction.
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: Supported through Apex operator that wraps the function and processes data as single element bundles.
       - name: GroupByKey
         values:
           - class: model
@@ -52,6 +58,10 @@ categories:
             l1: 'Partially'
             l2: fully supported in batch mode
             l3: "Using Spark's <tt>groupByKey</tt>. GroupByKey with multiple trigger firings in streaming mode is a work in progress."
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: "Apex runner uses the Beam code for grouping by window and thereby has support for all windowing and triggering mechanisms. Runner does not implement partitioning yet (BEAM-838)"
       - name: Flatten
         values:
           - class: model
@@ -70,7 +80,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
-
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
       - name: Combine
         values:
           - class: model
@@ -89,7 +102,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: "Using Spark's <tt>combineByKey</tt> and <tt>aggregate</tt> functions."
-
+          - class: apex
+            l1: 'Yes'
+            l2: 'fully supported'
+            l3: "Default Beam translation. Currently no efficient pre-aggregation (BEAM-935)."
       - name: Composite Transforms
         values:
           - class: model
@@ -108,7 +124,10 @@ categories:
             l1: 'Partially'
             l2: supported via inlining
             l3: ''
-
+          - class: apex
+            l1: 'Partially'
+            l2: supported via inlining
+            l3: ''
       - name: Side Inputs
         values:
           - class: model
@@ -127,7 +146,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: "Using Spark's broadcast variables. In streaming mode, side inputs may update but only between micro-batches."
-
+          - class: apex
+            l1: 'Yes'
+            l2: size restrictions
+            l3: No distributed implementation and therefore size restrictions.
       - name: Source API
         values:
           - class: model
@@ -146,6 +168,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3:
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3:
 
       - name: Aggregators
         values:
@@ -165,6 +191,10 @@ categories:
             l1: 'Partially'
             l2: may overcount when tasks are retried in transformations.
             l3: 'supported via <tt>AccumulatorParam</tt> mechanism. If a task retries, and the accumulator is not within a Spark "Action", an overcount is possible.'
+          - class: apex
+            l1: 'No'
+            l2: Not implemented in runner.
+            l3:
 
       - name: Keyed State
         values:
@@ -185,7 +215,10 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: Spark supports keyed state with mapWithState() so support shuold be straight forward.
-
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: Apex supports keyed state, so adding support for this should be easy once the Beam model exposes it.
 
   - description: Where in event time?
     anchor: where
@@ -212,6 +245,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Fixed windows
         values:
@@ -231,6 +268,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Sliding windows
         values:
@@ -250,6 +291,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Session windows
         values:
@@ -269,6 +314,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Custom windows
         values:
@@ -288,6 +337,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Custom merging windows
         values:
@@ -307,6 +360,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Timestamp control
         values:
@@ -326,6 +383,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
 
 
@@ -355,6 +416,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Event-time triggers
         values:
@@ -374,6 +439,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Processing-time triggers
         values:
@@ -393,6 +462,10 @@ categories:
             l1: 'Yes'
             l2: "This is Spark streaming's native model"
             l3: "Spark processes streams in micro-batches. The micro-batch size is actually a pre-set, fixed, time interval. Currently, the runner takes the first window size in the pipeline and sets it's size as the batch interval. Any following window operations will be considered processing time windows and will affect triggering."
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Count triggers
         values:
@@ -412,6 +485,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: '[Meta]data driven triggers'
         values:
@@ -432,6 +509,10 @@ categories:
             l1: 'No'
             l2: pending model support
             l3:
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3:
 
       - name: Composite triggers
         values:
@@ -451,6 +532,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Allowed lateness
         values:
@@ -470,6 +555,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Timers
         values:
@@ -490,7 +579,10 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: ''
-
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: ''
 
   - description: How do refinements relate?
     anchor: how
@@ -518,6 +610,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: 'Spark streaming natively discards elements after firing.'
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Accumulating
         values:
@@ -537,6 +633,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: 'Size restriction, see combine support.'
 
       - name: 'Accumulating & Retracting'
         values:
@@ -557,3 +657,7 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: ''
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: ''

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/_includes/header.html
----------------------------------------------------------------------
diff --git a/src/_includes/header.html b/src/_includes/header.html
index f70bcee..e39e9d1 100644
--- a/src/_includes/header.html
+++ b/src/_includes/header.html
@@ -50,6 +50,7 @@
   <li class="dropdown-header">Runners</li>
   <li><a href="{{ site.baseurl }}/documentation/runners/capability-matrix/">Capability Matrix</a></li>
   <li><a href="{{ site.baseurl }}/documentation/runners/direct/">Direct Runner</a></li>
+  <li><a href="{{ site.baseurl }}/documentation/runners/apex/">Apache Apex Runner</a></li>
   <li><a href="{{ site.baseurl }}/documentation/runners/flink/">Apache Flink Runner</a></li>
   <li><a href="{{ site.baseurl }}/documentation/runners/spark/">Apache Spark Runner</a></li>
   <li><a href="{{ site.baseurl }}/documentation/runners/dataflow/">Cloud Dataflow Runner</a></li>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/documentation/index.md
----------------------------------------------------------------------
diff --git a/src/documentation/index.md b/src/documentation/index.md
index 2ed18f3..c4bbd83 100644
--- a/src/documentation/index.md
+++ b/src/documentation/index.md
@@ -32,10 +32,10 @@ A Beam Runner runs a Beam pipeline on a specific (often distributed) data proces
 ### Available Runners
 
 * [DirectRunner]({{ site.baseurl }}/documentation/runners/direct/): Runs locally on your machine -- great for developing, testing, and debugging.
+* [ApexRunner]({{ site.baseurl }}/documentation/runners/apex/): Runs on [Apache Apex](http://apex.apache.org).
 * [FlinkRunner]({{ site.baseurl }}/documentation/runners/flink/): Runs on [Apache Flink](http://flink.apache.org).
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on [Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed service within [Google Cloud Platform](https://cloud.google.com/).
-* _[Under Development]_ [ApexRunner]({{ site.baseurl }}/contribute/work-in-progress/#feature-branches): Runs on [Apache Apex](http://apex.apache.org).
 * _[Under Development]_ [GearpumpRunner]({{ site.baseurl }}/contribute/work-in-progress/#feature-branches): Runs on [Apache Gearpump (incubating)](http://gearpump.apache.org).
 
 ### Choosing a Runner

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/documentation/runners/apex.md
----------------------------------------------------------------------
diff --git a/src/documentation/runners/apex.md b/src/documentation/runners/apex.md
new file mode 100644
index 0000000..408e6de
--- /dev/null
+++ b/src/documentation/runners/apex.md
@@ -0,0 +1,9 @@
+---
+layout: default
+title: "Apache Apex Runner"
+permalink: /documentation/runners/apex/
+---
+# Using the Apache Apex Runner
+
+This page is under construction ([BEAM-825](https://issues.apache.org/jira/browse/BEAM-825)). The runner is on a feature branch.
+

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/documentation/runners/capability-matrix.md
----------------------------------------------------------------------
diff --git a/src/documentation/runners/capability-matrix.md b/src/documentation/runners/capability-matrix.md
index 22c602c..bfb8cc1 100644
--- a/src/documentation/runners/capability-matrix.md
+++ b/src/documentation/runners/capability-matrix.md
@@ -8,7 +8,7 @@ redirect_from:
 ---
 # Beam Capability Matrix
 
-Apache Beam (incubating) provides a portable API layer for building sophisticated data-parallel processing engines that may be executed across a diversity of exeuction engines, or <i>runners</i>. The core concepts of this layer are based upon the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), and implemented to varying degrees in each Beam runner. To help clarify the capabilities of individual runners, we've created the capability matrix below.
+Apache Beam (incubating) provides a portable API layer for building sophisticated data-parallel processing pipelines that may be executed across a diversity of execution engines, or <i>runners</i>. The core concepts of this layer are based upon the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), and implemented to varying degrees in each Beam runner. To help clarify the capabilities of individual runners, we've created the capability matrix below.
 
 Individual capabilities have been grouped by their corresponding <span class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> / <span class="wwwh-how-dark">How</span> question: