Repository: beam-site
Updated Branches:
  refs/heads/asf-site 3cafa86a0 -> d7f468491
[BEAM-969] Add a gearpump runner web page

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/c717c8bd
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/c717c8bd
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/c717c8bd

Branch: refs/heads/asf-site
Commit: c717c8bdc759ab8f46508ede90ed4e38424b29a7
Parents: 3cafa86
Author: huafengw <fvunic...@gmail.com>
Authored: Mon Apr 24 15:10:37 2017 +0800
Committer: Kenneth Knowles <k...@google.com>
Committed: Wed May 3 13:59:03 2017 -0700

----------------------------------------------------------------------
 src/_data/capability-matrix.yml       | 124 ++++++++++++++++++++++---
 src/contribute/work-in-progress.md    |   2 +-
 src/documentation/runners/gearpump.md | 141 +++++++++++++++++++++++++++++
 src/get-started/beam-overview.md      |   8 +-
 src/images/logos/runners/gearpump.png | Bin 0 -> 2643 bytes
 5 files changed, 261 insertions(+), 14 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/beam-site/blob/c717c8bd/src/_data/capability-matrix.yml
----------------------------------------------------------------------
diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml
index 3c90800..565ffba 100644
--- a/src/_data/capability-matrix.yml
+++ b/src/_data/capability-matrix.yml
@@ -9,6 +9,8 @@ columns:
     name: Apache Spark
   - class: apex
     name: Apache Apex
+  - class: gearpump
+    name: Apache Gearpump

 categories:
   - description: What is being computed?
@@ -40,6 +42,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: Supported through Apex operator that wraps the function and processes data as single element bundles.
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: Gearpump wraps the per-element transformation function into processor execution.
       - name: GroupByKey
         values:
           - class: model
@@ -62,6 +68,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: "Apex runner uses the Beam code for grouping by window and thereby has support for all windowing and triggering mechanisms. Runner does not implement partitioning yet (BEAM-838)"
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: "Use Gearpump's groupBy and window for key grouping and translate Beam's windowing and triggering to Gearpump's internal implementation."
       - name: Flatten
         values:
           - class: model
@@ -84,6 +94,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
       - name: Combine
         values:
           - class: model
@@ -106,6 +120,10 @@ categories:
             l1: 'Yes'
             l2: 'fully supported'
             l3: "Default Beam translation. Currently no efficient pre-aggregation (BEAM-935)."
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
       - name: Composite Transforms
         values:
           - class: model
@@ -128,6 +146,10 @@ categories:
             l1: 'Partially'
             l2: supported via inlining
             l3: ''
+          - class: gearpump
+            l1: 'Partially'
+            l2: supported via inlining
+            l3: ''
       - name: Side Inputs
         values:
           - class: model
@@ -149,7 +171,11 @@ categories:
           - class: apex
             l1: 'Yes'
             l2: size restrictions
-            l3: No distributed implementation and therefore size restrictions.
+            l3: No distributed implementation and therefore size restrictions.
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: Implemented by merging side input as a normal stream in Gearpump
       - name: Source API
         values:
           - class: model
@@ -172,7 +198,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3:
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
       - name: Aggregators
         values:
           - class: model
@@ -195,7 +224,10 @@ categories:
             l1: 'No'
             l2: Not implemented in runner.
             l3:
-
+          - class: gearpump
+            l1: 'No'
+            l2: ''
+            l3: not implemented
       - name: Stateful Processing
         values:
           - class: model
@@ -218,7 +250,10 @@ categories:
             l1: 'No'
             l2: not implemented
             l3: Apex supports per-key state, so adding support for this should be easy.
-
+          - class: gearpump
+            l1: 'No'
+            l2: not implemented
+            l3: ''
   - description: Where in event time?
     anchor: where
     color-b: '37d'
@@ -248,7 +283,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
       - name: Fixed windows
         values:
           - class: model
@@ -271,7 +309,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
       - name: Sliding windows
         values:
           - class: model
@@ -294,7 +335,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
       - name: Session windows
         values:
           - class: model
@@ -317,7 +361,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
       - name: Custom windows
         values:
           - class: model
@@ -340,7 +387,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
       - name: Custom merging windows
         values:
           - class: model
@@ -363,7 +413,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
       - name: Timestamp control
         values:
           - class: model
@@ -386,7 +439,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
-
+          - class: gearpump
+            l1: 'Yes'
+            l2: supported
+            l3: ''
   - description: When in processing time?
@@ -419,6 +475,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'No'
+            l2: ''
+            l3: ''

       - name: Event-time triggers
         values:
@@ -442,6 +502,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''

       - name: Processing-time triggers
         values:
@@ -465,6 +529,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'No'
+            l2: ''
+            l3: ''

       - name: Count triggers
         values:
@@ -488,6 +556,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'No'
+            l2: ''
+            l3: ''

       - name: '[Meta]data driven triggers'
         values:
@@ -511,7 +583,11 @@ categories:
           - class: apex
             l1: 'No'
             l2: pending model support
-            l3:
+            l3:
+          - class: gearpump
+            l1: 'No'
+            l2: pending model support
+            l3:

       - name: Composite triggers
         values:
@@ -535,6 +611,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'No'
+            l2: ''
+            l3: ''

       - name: Allowed lateness
         values:
@@ -558,6 +638,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''

       - name: Timers
         values:
@@ -581,6 +665,10 @@ categories:
             l1: 'No'
             l2: not implemented
             l3: ''
+          - class: gearpump
+            l1: 'No'
+            l2: not implemented
+            l3: ''

   - description: How do refinements relate?
     anchor: how
@@ -612,6 +700,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
+          - class: gearpump
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''

       - name: Accumulating
         values:
@@ -635,6 +727,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: 'Size restriction, see combine support.'
+          - class: gearpump
+            l1: 'No'
+            l2: ''
+            l3: ''

       - name: 'Accumulating & Retracting'
         values:
@@ -659,3 +755,7 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: ''
+          - class: gearpump
+            l1: 'No'
+            l2: pending model support
+            l3: ''
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/beam-site/blob/c717c8bd/src/contribute/work-in-progress.md
----------------------------------------------------------------------
diff --git a/src/contribute/work-in-progress.md b/src/contribute/work-in-progress.md
index c3a4d17..224dddb 100644
--- a/src/contribute/work-in-progress.md
+++ b/src/contribute/work-in-progress.md
@@ -24,7 +24,7 @@ Current branches include:

 | Feature | Branch | JIRA Component | More Info |
 | ---- | ---- | ---- | ---- |
-| Apache Gearpump Runner | [gearpump-runner](https://github.com/apache/beam/tree/gearpump-runner) | [runner-gearpump](https://issues.apache.org/jira/browse/BEAM/component/12330829) | [README](https://github.com/apache/beam/blob/gearpump-runner/runners/gearpump/README.md) |
+| Apache Gearpump Runner | [gearpump-runner](https://github.com/apache/beam/tree/gearpump-runner) | [runner-gearpump](https://issues.apache.org/jira/browse/BEAM/component/12330829) | [runner homepage]({{ site.baseurl }}/documentation/runners/gearpump/) |
 | Apache Spark 2.0 Runner | [runners-spark2](https://github.com/apache/beam/tree/runners-spark2) | - | [thread](https://lists.apache.org/thread.html/e38ac4e4914a6cb1b865b1f32a6ca06c2be28ea4aa0f6b18393de66f@%3Cdev.beam.apache.org%3E) |
 {:.table}


http://git-wip-us.apache.org/repos/asf/beam-site/blob/c717c8bd/src/documentation/runners/gearpump.md
----------------------------------------------------------------------
diff --git a/src/documentation/runners/gearpump.md b/src/documentation/runners/gearpump.md
new file mode 100644
index 0000000..7234647
--- /dev/null
+++ b/src/documentation/runners/gearpump.md
@@ -0,0 +1,141 @@
+---
+layout: default
+title: "Apache Gearpump (incubating) Runner"
+permalink:
/documentation/runners/gearpump/
+---
+# Using the Apache Gearpump Runner
+
+The Apache Gearpump Runner can be used to execute Beam pipelines using [Apache Gearpump (incubating)](https://gearpump.apache.org).
+When you are running your pipeline with Gearpump Runner you just need to create a jar file containing your job and then it can be executed on a regular Gearpump distributed cluster, or a local cluster which is useful for development and debugging of your pipeline.
+
+The Gearpump Runner and Gearpump are suitable for large scale, continuous jobs, and provide:
+
+* High throughput and low latency stream processing
+* Comprehensive Dashboard for application monitoring
+* Fault-tolerance with exactly-once processing guarantees
+* Application hot re-deployment
+
+The [Beam Capability Matrix]({{ site.baseurl }}/documentation/runners/capability-matrix/) documents the currently supported capabilities of the Gearpump Runner.
+
+## Building Gearpump Runner
+Currently Gearpump Runner is on a [feature branch](https://github.com/apache/beam/tree/gearpump-runner) and in order to run your Beam pipeline with Gearpump runner, you should build out the corresponding artifacts first.
+```
+git checkout gearpump-runner
+mvn install
+```
+
+
+## Writing Beam Pipeline with Gearpump Runner
+To use the Gearpump Runner in a distributed mode, you have to setup a Gearpump cluster first by following the Gearpump [setup quickstart](http://gearpump.apache.org/releases/latest/deployment/deployment-standalone/index.html).
+
+Suppose you are writing a Beam pipeline, you can add a dependency on the latest version of the Gearpump runner by adding to your pom.xml to enable Gearpump runner.
+And your Beam application should also pack Beam SDK explicitly and here is a snippet of example pom.xml:
+```java
+<dependencies>
+  <dependency>
+    <groupId>org.apache.beam</groupId>
+    <artifactId>beam-runners-gearpump</artifactId>
+    <version>{{ site.release_latest }}</version>
+  </dependency>
+
+  <dependency>
+    <groupId>org.apache.gearpump</groupId>
+    <artifactId>gearpump-streaming_2.11</artifactId>
+    <version>${gearpump.version}</version>
+    <scope>provided</scope>
+  </dependency>
+
+  <dependency>
+    <groupId>org.apache.gearpump</groupId>
+    <artifactId>gearpump-core_2.11</artifactId>
+    <version>${gearpump.version}</version>
+    <scope>provided</scope>
+  </dependency>
+
+  <dependency>
+    <groupId>org.apache.beam</groupId>
+    <artifactId>beam-sdks-java-core</artifactId>
+    <version>{{ site.release_latest }}</version>
+  </dependency>
+</dependencies>
+
+<build>
+  <plugins>
+    <plugin>
+      <groupId>org.apache.maven.plugins</groupId>
+      <artifactId>maven-shade-plugin</artifactId>
+      <configuration>
+        <createDependencyReducedPom>false</createDependencyReducedPom>
+        <filters>
+          <filter>
+            <artifact>*:*</artifact>
+            <excludes>
+              <exclude>META-INF/*.SF</exclude>
+              <exclude>META-INF/*.DSA</exclude>
+              <exclude>META-INF/*.RSA</exclude>
+            </excludes>
+          </filter>
+        </filters>
+      </configuration>
+      <executions>
+        <execution>
+          <phase>package</phase>
+          <goals>
+            <goal>shade</goal>
+          </goals>
+          <configuration>
+            <shadedArtifactAttached>true</shadedArtifactAttached>
+            <shadedClassifierName>shaded</shadedClassifierName>
+          </configuration>
+        </execution>
+      </executions>
+    </plugin>
+  </plugins>
+</build>
+```
+
+After running <code>mvn package</code>, run <code>ls target</code> and you should see your application jar like:
+```
+{your_application}-{version}-shaded.jar
+```
+
+## Executing the pipeline on a Gearpump cluster
+To run against a Gearpump cluster simply run:
+```
+gear app -jar /path/to/{your_application}-{version}-shaded.jar com.beam.examples.BeamPipeline --runner=GearpumpRunner ...
+```
+
+## Monitoring your application
+You can monitor a running Gearpump application using Gearpump's Dashboard. Please follow the Gearpump [Start UI](http://gearpump.apache.org/releases/latest/deployment/deployment-standalone/index.html#start-ui) to start the dashboard.
+
+## Pipeline options for the Gearpump Runner
+
+When executing your pipeline with the Gearpump Runner, you should consider the following pipeline options.
+
+<table class="table table-bordered">
+<tr>
+  <th>Field</th>
+  <th>Description</th>
+  <th>Default Value</th>
+</tr>
+<tr>
+  <td><code>runner</code></td>
+  <td>The pipeline runner to use. This option allows you to determine the pipeline runner at runtime.</td>
+  <td>Set to <code>GearpumpRunner</code> to run using Gearpump.</td>
+</tr>
+<tr>
+  <td><code>parallelism</code></td>
+  <td>The parallelism for Gearpump processor.</td>
+  <td><code>1</code></td>
+</tr>
+<tr>
+  <td><code>applicationName</code></td>
+  <td>The application name for Gearpump runner.</td>
+  <td><code>beam_gearpump_app</code></td>
+</tr>
+</table>
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/beam-site/blob/c717c8bd/src/get-started/beam-overview.md
----------------------------------------------------------------------
diff --git a/src/get-started/beam-overview.md b/src/get-started/beam-overview.md
index 0ef5152..07b1ad9 100644
--- a/src/get-started/beam-overview.md
+++ b/src/get-started/beam-overview.md
@@ -36,10 +36,16 @@ Beam currently supports Runners that work with the following distributed process
     alt="Apache Flink">
 * Apache Spark <img src="{{ site.baseurl }}/images/logos/runners/spark.png"
     alt="Apache Spark">
+* Apache Gearpump(incubating) <img src="{{ site.baseurl }}/images/logos/runners/gearpump.png"
+    alt="Apache Gearpump">
 * Google Cloud Dataflow <img src="{{ site.baseurl }}/images/logos/runners/dataflow.png"
     alt="Google Cloud Dataflow">
+
+**Note:**

-**Note:** You can always execute your pipeline locally for testing and debugging purposes.
+1.
You can always execute your pipeline locally for testing and debugging purposes.
+
+2. Gearpump Runner is under development and for more details please refer to [Ongoing Projects]({{ site.baseurl }}/contribute/work-in-progress)

 ## Get Started


http://git-wip-us.apache.org/repos/asf/beam-site/blob/c717c8bd/src/images/logos/runners/gearpump.png
----------------------------------------------------------------------
diff --git a/src/images/logos/runners/gearpump.png b/src/images/logos/runners/gearpump.png
new file mode 100644
index 0000000..d35f44a
Binary files /dev/null and b/src/images/logos/runners/gearpump.png differ