This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git
commit 9da560eef802eac7c7fb29dedc71387328776b3b
Author: Stephen Sisk <s...@google.com>
AuthorDate: Wed Jul 19 16:28:32 2017 -0700

    Updated IO IT docs based on PR feedback
---
 src/documentation/io/testing.md | 49 ++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 25 deletions(-)

diff --git a/src/documentation/io/testing.md b/src/documentation/io/testing.md
index 26ebc55..6281b5f 100644
--- a/src/documentation/io/testing.md
+++ b/src/documentation/io/testing.md
@@ -111,6 +111,7 @@ If your I/O transform allows batching of reads/writes, you must force the batchi
 
 ## I/O Transform Integration Tests {#i-o-transform-integration-tests}
 
+> We do not currently have examples of Python I/O integration tests or integration tests for unbounded or eventually consistent data stores. We would welcome contributions in these areas - please contact the Beam dev@ mailing list for more information.
 
 ### Goals {#it-goals}
 
@@ -126,7 +127,7 @@ In order to test I/O transforms in real world conditions, you must connect to a
 
 The Beam community hosts the data stores used for integration tests in Kubernetes. In order for an integration test to be run in Beam's continuous integration environment, it must have Kubernetes scripts that set up an instance of the data store.
 
-However, when working locally, there is no requirement to use Kubernetes. All of the test infrastructure allows passing in connection info, so developers can use their preferred hosting infrastructure for local development.
+However, when working locally, there is no requirement to use Kubernetes. All of the test infrastructure allows you to pass in connection info, so developers can use their preferred hosting infrastructure for local development.
 
 ### Running integration tests {#running-integration-tests}
 
@@ -136,18 +137,18 @@ The high level steps for running an integration test are:
 1. Run the test, passing it connection info from the just created data store
 1. Clean up the data store
 
-Since setting up data stores and running the tests involves a number of steps, and we wish to time these tests when running performance benchmarks, we use PerfKit Benchmarker (PKB) to manage the process end to end. With a single command, you can go from an empty Kubernetes cluster to a running integration test.
+Since setting up data stores and running the tests involves a number of steps, and we wish to time these tests when running performance benchmarks, we use PerfKit Benchmarker to manage the process end to end. With a single command, you can go from an empty Kubernetes cluster to a running integration test.
 
-However, **PerfKit Benchmarker is not required for running integration tests**. Therefore, we have listed the steps for both using PerfKit, and manually running the tests below.
+However, **PerfKit Benchmarker is not required for running integration tests**. Therefore, we have listed the steps for both using PerfKit Benchmarker, and manually running the tests below.
 
 #### Using PerfKit Benchmarker {#using-perfkit-benchmarker}
 
 Prerequisites:
-1. [Install PerfKit](https://github.com/GoogleCloudPlatform/PerfKitBenchmarker)
+1. [Install PerfKit Benchmarker](https://github.com/GoogleCloudPlatform/PerfKitBenchmarker)
 1. Have a running Kubernetes cluster you can connect to locally using kubectl
 
-You won't need to invoke PerfKit directly. Run mvn verify in the directory of the I/O module you'd like to test, with the parameter io-it-suite.
+You won't need to invoke PerfKit Benchmarker directly. Run mvn verify in the directory of the I/O module you'd like to test, with the parameter io-it-suite.
 
 Example run with the direct runner:
 ```
@@ -179,13 +180,13 @@ Parameter descriptions:
  <tr>
   <td>-Dio-it-suite
   </td>
-  <td>Invokes the call to PerfKit.
+  <td>Invokes the call to PerfKit Benchmarker.
   </td>
  </tr>
  <tr>
   <td>-Dio-it-suite-local
   </td>
-  <td>Modifies the call to PerfKit so that it exposes the postgres service via LoadBalancer, making it available to users not on the immediate network of the kubernetes cluster. This is useful if you are running on a remote kubernetes cluster.
+  <td>Modifies the call to PerfKit Benchmarker so that it exposes the postgres service via LoadBalancer, making it available to users not on the immediate network of the kubernetes cluster. This is useful if you are running on a remote kubernetes cluster.
   </td>
  </tr>
  <tr>
@@ -243,7 +244,7 @@ If you're using Kubernetes, make sure you can connect to your cluster locally us
 There are three components necessary to implement an integration test:
 * **Test code**: the code that does the actual testing: interacting with the I/O transform, reading and writing data, and verifying the data.
 * **Kubernetes scripts**: a Kubernetes script that sets up the data store that will be used by the test code.
-* **Integrate with PerfKit Benchmarker using io-it-suite**: this allows users to easily invoke perfkit, creating the Kubernetes resources and running the test code.
+* **Integrate with PerfKit Benchmarker using io-it-suite**: this allows users to easily invoke PerfKit Benchmarker, creating the Kubernetes resources and running the test code.
 
 These three pieces are discussed in detail below.
 
@@ -266,8 +267,6 @@ These are the conventions used by integration testing code:
 
 An end to end example of these principles can be found in [JdbcIOIT](https://github.com/ssisk/beam/blob/jdbc-it-perf/sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOIT.java).
 
-If you'd like to implement Python I/O integration tests or integration tests for unbounded or eventually consistent data stores, please contact the Beam dev@ mailing list for more information.
-
 
 #### Kubernetes scripts {#kubernetes-scripts}
 
@@ -296,9 +295,9 @@ Guidelines for creating a Beam data store Kubernetes script:
 
 #### Integrate with PerfKit Benchmarker {#integrate-with-perfkit-benchmarker}
 
-To allow developers to easily invoke your I/O integration test, perform the following steps:
-1. Create a PerfKit benchmark configuration file for the data store. Each pipeline option needed by the integration test should have a configuration entry. See [Defining the benchmark configuration file](#defining-the-benchmark-configuration-file) for information about what to include.
-1. Modify the [Per-I/O mvn pom configuration](#per-i-o-mvn-pom-configuration).
+To allow developers to easily invoke your I/O integration test, you must perform these two steps. The following sections describe each step in more detail.
+1. Create a PerfKit Benchmarker benchmark configuration file for the data store. Each pipeline option needed by the integration test should have a configuration entry.
+1. Modify the per-I/O Maven pom configuration so that PerfKit Benchmarker can be invoked from Maven.
 
 The goal is that a checked in config has defaults such that other developers can run the test without changing the configuration.
 
@@ -397,7 +396,7 @@ and may contain the following elements:
  <tr>
   <td>dynamic_pipeline_options
   </td>
-  <td>The set of mvn pipeline options that PerfKit will determine at runtime.
+  <td>The set of mvn pipeline options that PerfKit Benchmarker will determine at runtime.
   </td>
  </tr>
  <tr>
@@ -425,15 +424,15 @@ and may contain the following elements:
 
 #### Per-I/O mvn pom configuration {#per-i-o-mvn-pom-configuration}
 
-Each I/O is responsible for adding a section to its pom with a profile that invokes PerfKit with the proper parameters during the verify phase. Below are the set of PerfKit parameters and how to configure them.
+Each I/O is responsible for adding a section to its pom with a profile that invokes PerfKit Benchmarker with the proper parameters during the verify phase. Below is the set of PerfKit Benchmarker parameters and how to configure them.
 
-The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/pom.xml) has an example of how to put these options together into a profile and invoke Python+PerfKit with them.
+The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/pom.xml) has an example of how to put these options together into a profile and invoke Python+PerfKit Benchmarker with them.
 
 <table class="table">
  <thead>
   <tr>
-   <td><strong>PerfKit Parameter</strong>
+   <td><strong>PerfKit Benchmarker Parameter</strong>
    </td>
    <td><strong>Description</strong>
    </td>
@@ -445,7 +444,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
  <tr>
   <td>benchmarks
   </td>
-  <td>Defines the PerfKit benchmark to run. This is same for all I/O integration tests.
+  <td>Defines the PerfKit Benchmarker benchmark to run. This is the same for all I/O integration tests.
   </td>
   <td>beam_integration_benchmark
   </td>
@@ -453,7 +452,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
  <tr>
   <td>beam_location
   </td>
-  <td>The location where PerfKit can find the Beam repository.
+  <td>The location where PerfKit Benchmarker can find the Beam repository.
   </td>
   <td>${beamRootProjectDir} - this is a variable you'll need to define for each maven pom. See example pom for an example.
   </td>
@@ -469,7 +468,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
  <tr>
   <td>beam_sdk
   </td>
-  <td>Whether PerfKit will run the Beam SDK for Java or Python.
+  <td>Whether PerfKit Benchmarker will run the Beam SDK for Java or Python.
   </td>
   <td>java
   </td>
@@ -493,7 +492,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
  <tr>
   <td>beam_it_module
   </td>
-  <td>The path to the pom that contains the test (needed for invoking the test with PerfKit).
+  <td>The path to the pom that contains the test (needed for invoking the test with PerfKit Benchmarker).
   </td>
   <td>sdks/java/io/jdbc
   </td>
@@ -517,7 +516,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
  <tr>
   <td>kubeconfig
   </td>
-  <td>The standard PerfKit parameter `kubeconfig`, which specifies where the Kubernetes config file lives.
+  <td>The standard PerfKit Benchmarker parameter `kubeconfig`, which specifies where the Kubernetes config file lives.
   </td>
   <td>Always use ${kubeconfig}
   </td>
@@ -525,7 +524,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
  <tr>
   <td>kubectl
   </td>
-  <td>The standard PerfKit parameter `kubectl`, which specifies where the kubectl binary lives.
+  <td>The standard PerfKit Benchmarker parameter `kubectl`, which specifies where the kubectl binary lives.
   </td>
   <td>Always use ${kubectl}
   </td>
@@ -542,7 +541,7 @@ The [JdbcIO pom](https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/po
 </table>
 
-There is also a set of Maven properties which are useful when invoking PerfKit. These properties are configured in the I/O parent pom, and some are only available when the io-it-suite profile is active in Maven.
+There is also a set of Maven properties which are useful when invoking PerfKit Benchmarker. These properties are configured in the I/O parent pom, and some are only available when the io-it-suite profile is active in Maven.
 
 #### Small Scale and Large Scale Integration Tests {#small-scale-and-large-scale-integration-tests}
 
@@ -561,7 +560,7 @@ You can do this by:
 1. Creating two Kubernetes scripts: one for a small instance of the data store, and one for a large instance.
 1. Having your test take a pipeline option that decides whether to generate a small or large amount of test data (where small and large are sizes appropriate to your data store)
 
-An example of this is `HadoopInputFormatIO`'s tests.
+An example of this is [HadoopInputFormatIO](https://github.com/apache/beam/tree/master/sdks/java/io/hadoop/input-format)'s tests.
 
 <!--
 # Next steps
-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <commits@beam.apache.org>.
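Editor's note: the patched docs say each I/O module adds a pom profile that invokes Python+PerfKit Benchmarker during the verify phase with the parameters in the table above. A minimal sketch of such a profile, assuming exec-maven-plugin and a hypothetical `${pkbLocation}` property pointing at PerfKit Benchmarker's entry script (the real wiring lives in the JdbcIO pom the commit links to):

```xml
<!-- Hedged sketch only: plugin choice, ${pkbLocation}, and argument layout are
     assumptions; the parameter names/values come from the table in the diff. -->
<profile>
  <id>io-it-suite</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <executions>
          <execution>
            <phase>verify</phase>
            <goals><goal>exec</goal></goals>
            <configuration>
              <executable>python</executable>
              <arguments>
                <argument>${pkbLocation}</argument>
                <argument>--benchmarks=beam_integration_benchmark</argument>
                <argument>--beam_location=${beamRootProjectDir}</argument>
                <argument>--beam_it_module=sdks/java/io/jdbc</argument>
                <argument>--beam_sdk=java</argument>
                <argument>--kubeconfig=${kubeconfig}</argument>
                <argument>--kubectl=${kubectl}</argument>
              </arguments>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

Binding the execution to the verify phase is what makes `mvn verify -Dio-it-suite` run the benchmark, as described earlier in the patched docs.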
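Editor's note: the manual (non-PerfKit-Benchmarker) path in the patched docs boils down to the three high level steps listed there. A sketch of that session; the `postgres.yml` filename, test class invocation, and pipeline option names are illustrative assumptions, not taken from the commit:

```shell
# 1. Set up the data store from its Kubernetes script (filename is hypothetical).
kubectl create -f postgres.yml

# 2. Run the test, passing it connection info from the just-created data store.
#    The system property and pipeline option names below are illustrative.
mvn verify -Dit.test=JdbcIOIT \
  -DbeamTestPipelineOptions='["--postgresServerName=<service-ip>"]'

# 3. Clean up the data store.
kubectl delete -f postgres.yml
```

Because the test infrastructure accepts connection info as pipeline options, the same commands work against any locally hosted instance of the data store, not just Kubernetes.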