This is an automated email from the ASF dual-hosted git repository.

kenn pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git
commit 0bc08e3eec87fc7ba936b63e73adca1c57500b88
Author: xiliu <xi...@linkedin.com>
AuthorDate: Wed Jun 6 16:16:51 2018 -0700

    Update the option docs
---
 src/documentation/runners/samza.md | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/documentation/runners/samza.md b/src/documentation/runners/samza.md
index fe48576..2b63189 100644
--- a/src/documentation/runners/samza.md
+++ b/src/documentation/runners/samza.md
@@ -72,7 +72,7 @@ The Samza Runner is built on Samza version greater than 0.14.1, and uses Scala v

 ## Executing a pipeline with Samza Runner

-If you run your pipeline locally or deploy it to a standalone cluster bundled with all the jars and resource files, no packaging is required. For example, the following command runs the WordCount example:
+If you run your pipeline locally or deploy it to a standalone cluster with all the jars and resource files, no packaging is required. For example, the following command runs the WordCount example:

 ```
 $ mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
@@ -82,7 +82,7 @@ $ mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
 --output=/path/to/counts"
 ```

-To deploy your pipeline to a YARN cluster, you need to package your application jars and resource files into a `.tgz` archive file. In your config, you need to specify the URI of the TGZ file for Samza Runner to download:
+To deploy your pipeline to a YARN cluster, here is the [instructions](https://samza.apache.org/startup/hello-samza/latest/) of deploying a sample Samza job. First you need to package your application jars and resource files into a `.tgz` archive file, and make it available to download for Yarn containers. In your config, you need to specify the URI of this TGZ file location:

 ```
 yarn.package.path=${your_job_tgz_URI}
@@ -93,7 +93,9 @@ job.coordinator.system=${job_coordinator_system}
 job.default.system=${job_default_system}
 ```

-The config file can be passed to Samza Runner by setting the command line arg `--configFilePath=/path/to/config.properties`. For more details on the Samza configuration, see [Samza Configuration Reference](https://samza.apache.org/learn/documentation/latest/jobs/configuration-table.html).
+For more details on the configuration, see [Samza Configuration Reference](https://samza.apache.org/learn/documentation/latest/jobs/configuration-table.html).
+
+The config file will be passed in by setting the command line arg `--configFilePath=/path/to/config.properties`. With that, you can run your main class of Beam pipeline in a Yarn Resource Manager, and the Samza Runner will submit a Yarn job under the hood.

 ## Pipeline options for the Samza Runner

@@ -111,14 +113,14 @@ When executing your pipeline with the Samza Runner, you can use the following pi
 <td>Set to <code>SamzaRunner</code> to run using Samza.</td>
 </tr>
 <tr>
- <td><code>samzaConfig</code></td>
- <td>The config for Samza runner.</td>
- <td>Config for running locally.</td>
+ <td><code>configFilePath</code></td>
+ <td>The config for Samza using a properties file.</td>
+ <td><code>empty</code>, i.e. use local execution.</td>
 </tr>
 <tr>
- <td><code>ConfigFilePath</code></td>
- <td>The config for Samza runner using a properties file.</td>
- <td><code>empty</code>, i.e. use default samzaConfig</td>
+ <td><code>configOverride</code></td>
+ <td>The config override to set programmatically.</td>
+ <td><code>empty</code>, i.e. use config file or local execution.</td>
 </tr>
 <tr>
 <td><code>watermarkInterval</code></td>
@@ -146,4 +148,4 @@ When executing your pipeline with the Samza Runner, you can use the following pi

 You can monitor your pipeline job using metrics emitted from both Beam and Samza, e.g. Beam source metrics such as `elements_read` and `backlog_elements`, and Samza job metrics such as `job-healthy` and `process-envelopes`. A complete list of Samza metrics is in [Samza Metrics Reference](https://samza.apache.org/learn/documentation/latest/container/metrics-table.html). You can view your job's metrics via JMX in development, and send the metrics to graphing system such as [Graphite](http: [...]

-For a running Samza YARN job, you can use YARN web UI to monitor the job status and check logs.
\ No newline at end of file
+For a running Samza YARN job, you can use YARN web UI to monitor the job status and check logs.
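
The defaults in the updated option table suggest a fallback order: with neither `configFilePath` nor `configOverride` set the pipeline runs locally, a properties file supplies the job config, and `configOverride` is set programmatically. The Python sketch below models one possible reading of that order. It is not the Samza Runner's code: `parse_properties` and `resolve_config` are hypothetical helpers, the sample values are made up, and layering override entries on top of file entries is an assumption rather than something the diff states.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def resolve_config(file_text=None, override=None):
    """Hypothetical resolution: start from the properties file (if any),
    then apply programmatic overrides on top (assumed precedence).
    An empty result stands in for local execution."""
    config = parse_properties(file_text) if file_text else {}
    config.update(override or {})
    return config


# Made-up sample values, for illustration only.
sample_file = """
# job config
yarn.package.path=file:///tmp/job.tgz
job.default.system=kafka
"""

resolved = resolve_config(sample_file, {"job.default.system": "test-system"})
local = resolve_config()
```

Here `resolved` keeps `yarn.package.path` from the file while `job.default.system` takes the override value, and `local` is an empty mapping, which mirrors the "empty, i.e. use local execution" default in the table.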