[4/4] beam-site git commit: Regenerate website after merge
Regenerate website after merge

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/af94ee21
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/af94ee21
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/af94ee21

Branch: refs/heads/asf-site
Commit: af94ee21689d1e50bda93e5a46c1f5528866b9b7
Parents: d758267
Author: Dan Halperin
Authored: Thu May 25 14:16:04 2017 -0700
Committer: Dan Halperin
Committed: Thu May 25 14:16:04 2017 -0700

--
 .../documentation/runners/capability-matrix/index.html | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/beam-site/blob/af94ee21/content/documentation/runners/capability-matrix/index.html
--
diff --git a/content/documentation/runners/capability-matrix/index.html b/content/documentation/runners/capability-matrix/index.html
index 1a31a2f..6467c66 100644
--- a/content/documentation/runners/capability-matrix/index.html
+++ b/content/documentation/runners/capability-matrix/index.html
@@ -438,7 +438,7 @@
-Aggregators
+Metrics
@@ -1493,26 +1493,26 @@
-Aggregators
+Metrics
-Partially: user-provided metrics. Allow transforms to aggregate simple metrics across bundles in a DoFn. Semantically equivalent to using an additional output, but support partial results as the transform executes. Will likely want to augment Aggregators to be more useful for processing unbounded data by making them windowed.
+Partially: user-provided metrics. Allow transforms to gather simple metrics across bundles in a PTransform. Provide a mechanism to obtain both committed and attempted metrics. Semantically similar to using an additional output, but support partial results as the transform executes, and support both committed and attempted values. Will likely want to augment Metrics to be more useful for processing unbounded data by making them windowed.
-Partially: may miscount in streaming mode. Current model is fully supported in batch mode. In streaming mode, Aggregators may under or overcount when bundles are retried.
+Partially: In batch mode, Dataflow supports committed and attempted Counters and Distributions. Gauge metrics are not supported in batch mode. Metrics are not yet supported at all in streaming mode, but this support is coming soon ([BEAM-2059](https://issues.apache.org/jira/browse/BEAM-2059)).
-Partially: may undercount in streaming. Current model is fully supported in batch. In streaming mode, Aggregators may undercount.
+Partially: All metrics types are supported. Only attempted values are supported. No committed values for metrics.
-Partially: may overcount when tasks are retried in transformations. supported via AccumulatorParam mechanism. If a task retries, and the accumulator is not within a Spark "Action", an overcount is possible.
+Partially: All metric types are supported. Only attempted values are supported. No committed values for metrics.
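
The distinction between attempted and committed metric values in the matrix text above can be illustrated with a toy model. This is a plain-Python sketch, not the Beam Metrics API: the function `run_bundles`, the `failures` argument, and the assumption that a failed attempt processes every element of its bundle before failing are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def run_bundles(bundles, failures):
    """Toy model of attempted vs. committed counters (not Beam code).

    bundles:  dict mapping a bundle id to its list of elements.
    failures: dict mapping a bundle id to how many times it fails
              before finally succeeding (hypothetical input).
    """
    attempted = Counter()  # includes work done by failed attempts
    committed = Counter()  # includes only the successful attempt
    for bundle_id, elements in bundles.items():
        attempts = failures.get(bundle_id, 0) + 1  # failed tries + one success
        for attempt in range(attempts):
            local = Counter()
            for _ in elements:
                local["elements"] += 1
            attempted.update(local)
            if attempt == attempts - 1:
                # Only the last (successful) attempt is committed.
                committed.update(local)
    return attempted, committed


bundles = {"b1": ["a", "b", "c"], "b2": ["d", "e"]}
attempted, committed = run_bundles(bundles, failures={"b2": 1})
print(attempted["elements"], committed["elements"])
```

In this model a runner that reports only attempted values overcounts whenever a bundle is retried, which is why the matrix lists committed and attempted support separately.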
[4/4] beam-site git commit: Regenerate website after merge
Regenerate website after merge

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/34524776
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/34524776
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/34524776

Branch: refs/heads/asf-site
Commit: 34524776a0570cf59ea164132aa2cc64b41d7d73
Parents: ae9e888
Author: Dan Halperin
Authored: Thu May 25 14:02:09 2017 -0700
Committer: Dan Halperin
Committed: Thu May 25 14:02:09 2017 -0700

--
 .../get-started/wordcount-example/index.html | 263 ++-
 1 file changed, 259 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/beam-site/blob/34524776/content/get-started/wordcount-example/index.html
--
diff --git a/content/get-started/wordcount-example/index.html b/content/get-started/wordcount-example/index.html
index e4efeb7..3a4adf9 100644
--- a/content/get-started/wordcount-example/index.html
+++ b/content/get-started/wordcount-example/index.html
@@ -196,7 +196,77 @@
 Minimal WordCount demonstrates a simple pipeline that can read from a text file, apply transforms to tokenize and count the words, and write the data to an output text file. This example hard-codes the locations for its input and output files and doesn't perform any error checking; it is intended to only show you the "bare bones" of creating a Beam pipeline. This lack of parameterization makes this particular pipeline less portable across different runners than standard Beam pipelines. In later examples, we will parameterize the pipeline's input and output sources and show other best practices.
-To run this example, follow the instructions in the Quickstart for Java or Python. To view the full code, see MinimalWordCount (https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java).
+To run this example in Java:
+
+$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount
+
+
+
+$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
+     -Dexec.args="--inputFile=pom.xml --output=counts --runner=ApexRunner" -Papex-runner
+
+
+
+$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
+     -Dexec.args="--runner=FlinkRunner --inputFile=pom.xml --output=counts" -Pflink-runner
+
+
+
+$ mvn package exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
+     -Dexec.args="--runner=FlinkRunner --flinkMaster=--filesToStage=target/word-count-beam-bundled-0.1.jar \
+     --inputFile=/path/to/quickstart/pom.xml --output=/tmp/counts" -Pflink-runner
+
+You can monitor the running job by visiting the Flink dashboard at http:// :8081
+
+
+
+$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
+     -Dexec.args="--runner=SparkRunner --inputFile=pom.xml --output=counts" -Pspark-runner
+
+
+
+$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.MinimalWordCount \
+     -Dexec.args="--runner=DataflowRunner --gcpTempLocation=gs:// /tmp \
+--inputFile=gs://apache-beam-samples/shakespeare/* --output=gs:// /counts" \
+     -Pdataflow-runner
+
+
+
+To view the full code in Java, see MinimalWordCount (https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java).
+
+To run this example in Python:
+
+python -m apache_beam.examples.wordcount_minimal --input README.md --output counts
+
+
+
+This runner is not yet available for the Python SDK.
+
+
+
+This runner is not yet available for the Python SDK.
+
+
+
+This runner is not yet available for the Python SDK.
+
+
+
+This runner is not yet available for the Python SDK.
+
+
+
+# As part of the initial setup, install Google Cloud Platform specific extra components.
+pip install apache-beam[gcp]
+python -m apache_beam.examples.wordcount_minimal --input gs://dataflow-samples/shakespeare/kinglear.txt \
+                                                 --output gs:// /counts \
+                                                 --runner DataflowRunner \
+                                                 --project your-gcp-project \
+                                                 --temp_location gs:// /tmp/
+
+
+
+To view the full code in Python, see wordcount_minimal.py (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_minimal.py).
 Key Concepts:
@@ -368,7 +438,78 @@
 Figure 1: The pipeline data flow.
 This section assumes that you have a good understanding of the basic concepts in building a pipel
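
For readers of this commit, the three steps that the Minimal WordCount text describes (tokenize, count, format for output) can be sketched in plain Python. This is only an illustration of the computation, not the Beam pipeline itself: the helper names `count_words` and `format_counts`, the sample input, and the word-matching regular expression are assumptions made for this sketch.

```python
import re
from collections import Counter


def count_words(lines):
    """Tokenize each line into words and count occurrences."""
    counts = Counter()
    for line in lines:
        # Assumed tokenizer: runs of ASCII letters and apostrophes.
        counts.update(re.findall(r"[A-Za-z']+", line))
    return counts


def format_counts(counts):
    """Render each (word, count) pair as one output line, sorted by word."""
    return [f"{word}: {n}" for word, n in sorted(counts.items())]


lines = ["to be or not to be", "that is the question"]
result = format_counts(count_words(lines))
for row in result:
    print(row)
```

A real Beam pipeline expresses the same steps as transforms over a PCollection so that a runner can distribute them; the sketch above runs everything in one process.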
[3/3] beam-site git commit: Regenerate website after merge
Regenerate website after merge

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/ce15747f
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/ce15747f
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/ce15747f

Branch: refs/heads/asf-site
Commit: ce15747f32d776feb457472fe964311cd921510f
Parents: 56c289f
Author: Dan Halperin
Authored: Thu May 25 09:43:48 2017 -0700
Committer: Dan Halperin
Committed: Thu May 25 09:43:48 2017 -0700

--
 content/documentation/io/built-in/index.html | 4
 1 file changed, 4 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/beam-site/blob/ce15747f/content/documentation/io/built-in/index.html
--
diff --git a/content/documentation/io/built-in/index.html b/content/documentation/io/built-in/index.html
index 6b3de1b..688e24f 100644
--- a/content/documentation/io/built-in/index.html
+++ b/content/documentation/io/built-in/index.html
@@ -259,6 +259,10 @@
 TikaIO Java BEAM-2328 (https://issues.apache.org/jira/browse/BEAM-2328)
+
+Cloud Spanner Java
+BEAM-1542 (https://issues.apache.org/jira/browse/BEAM-1542)
+
beam-site git commit: Regenerate website after merge
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 222ba1259 -> c4a54639e

Regenerate website after merge

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/c4a54639
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/c4a54639
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/c4a54639

Branch: refs/heads/asf-site
Commit: c4a54639e0a25ba74c5fd5eb62eeab371f6e614d
Parents: 222ba12
Author: Dan Halperin
Authored: Wed May 17 02:33:04 2017 -0400
Committer: Dan Halperin
Committed: Wed May 17 02:33:04 2017 -0400

--
 content/feed.xml | 2 +-
 content/get-started/quickstart-java/index.html | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/beam-site/blob/c4a54639/content/feed.xml
--
diff --git a/content/feed.xml b/content/feed.xml
index 02da655..1a32bd5 100644
--- a/content/feed.xml
+++ b/content/feed.xml
@@ -883,7 +883,7 @@ be controlled within a test.
 Writing Deterministic Tests to Emulate Nondeterminism
 The Beam testing infrastructure provides the
-PAssert
+PAssert
 methods, which assert properties about the contents of a PCollection from within a pipeline. We have expanded this infrastructure to include TestStream,

http://git-wip-us.apache.org/repos/asf/beam-site/blob/c4a54639/content/get-started/quickstart-java/index.html
--
diff --git a/content/get-started/quickstart-java/index.html b/content/get-started/quickstart-java/index.html
index abee09e..0d9d0af 100644
--- a/content/get-started/quickstart-java/index.html
+++ b/content/get-started/quickstart-java/index.html
@@ -164,10 +164,9 @@
 The easiest way to get a copy of the WordCount pipeline is to use the following command to generate a simple Maven project that contains Beam's WordCount examples and builds against the most recent Beam release:
 $ mvn archetype:generate \
-      -DarchetypeRepository=https://repository.apache.org/content/groups/snapshots \
       -DarchetypeGroupId=org.apache.beam \
       -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
-      -DarchetypeVersion=LATEST \
+      -DarchetypeVersion=2.0.0 \
       -DgroupId=org.example \
       -DartifactId=word-count-beam \
       -Dversion="0.1" \