GitHub user fyellin opened a pull request: https://github.com/apache/incubator-beam/pull/841

    [Beam-556] Fix typo in documentation

    This is my first python submission. It was built on top of
    origin/python-sdk rather than origin/master. If I'm doing this wrong,
    please let me know.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fyellin/incubator-beam beam-556

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #841

----
commit 87ca53927706510dd643ae8b2d85894f713b0550
Author: Jesse Anderson <je...@smokinghand.com>
Date:   2016-05-16T20:58:08Z

    Changed Word Counts to use TypeDescriptors.

commit 21a87f96511ec1b2b696402bec220f7b984fe600
Author: Scott Wegner <sweg...@google.com>
Date:   2016-05-16T21:01:43Z

    Add configuration for Dataflow runner System.out/err

commit d32179870c61e57dd5ce4287675ae4421a9249d2
Author: Jesse Anderson <je...@smokinghand.com>
Date:   2016-05-16T21:09:18Z

    Updated complete examples to use TypeDescriptors.

commit c7911fb23a60128b096acb8462b2fe94aea52061
Author: Scott Wegner <sweg...@google.com>
Date:   2016-06-09T18:31:23Z

    Make example AddTimestampFn range deterministic

    The timestamps added in the WindowedWordCount example are currently
    based on when the bundles are executed, which makes the min/max
    bounds non-deterministic. This change makes the range based on the
    construction time.

commit cc8700b016fd1db7db3a3483068f8633d436b64d
Author: Kenneth Knowles <k...@google.com>
Date:   2016-06-13T01:29:46Z

    Add success/failure counters to new PAssert mechanism

commit 5e6b35fd5c2f40bfdf46dc308d9750f85d5d7fed
Author: Scott Wegner <sweg...@google.com>
Date:   2016-06-13T18:05:00Z

    Fix AutoComplete example streaming configuration

commit 99441b418e6e18ca34e0413f94f1d939393e2273
Author: manuzhang <owenzhang1...@gmail.com>
Date:   2016-06-13T03:09:38Z

    [BEAM-336] update examples-java README

commit 1f66cbf08cf18047ce67ea4eaa95dcea6872a532
Author: Thomas Groh <tg...@google.com>
Date:   2016-05-18T23:56:06Z

    Update the Default Pipeline Runner

    Select the InProcessRunner if it is on the classpath, and throw an
    exception otherwise.

commit 3df522f8b964bdcbf559de27221f9e2ce15c12c4
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-10T21:36:42Z

    Update Pipeline Execution Style in WindowedWordCountTest

    This sets the runner a Pipeline creation time rather than sending a
    (potentially rewritten) pipeline to a new runner instance.

commit 9cf1d24d910b2f00ac3dbacc792c9f1d3fc053e0
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-10T21:38:36Z

    Update Direct Module tests to explicitly set Pipeline

commit 140519c8291f9f3a5135b868343aa9b4181889bd
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-10T21:41:06Z

    Use TestPipeline#testingPipelineOptions in IO Tests

commit febf4a14d741f3a9eb1706f8ebdb8d9a9469d3bc
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-10T21:43:10Z

    Move GcsUtil TextIO Tests to TextIOTest

    These tests are not a test of the DataflowRunner, nor any
    DataflowRunner specific behavior, so they should be part of
    TextIOTest

commit cce4dcabe30b109095365830b4d10c300335e17b
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-10T21:45:58Z

    Set Runner in DataflowRunner Tests

    Otherwise the Default Runner is used, which may be unavailable.

commit 59371181a0d5af55364840842617c8aa082945a0
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-10T21:47:53Z

    Increase Visibility of Flink Test PipelineOptions

    This fixes an issue where the package-private nature would cause an
    exception

commit 882e8f8bcfc0adc71e9a2bcb969c61328830b910
Author: Dan Halperin <dhalp...@google.com>
Date:   2016-06-14T00:34:58Z

    CompressedSourceTest: simplify

    We should use random.nextBytes(buff) instead of making the array in
    a loop. The code we now point to is the same as the for loop, so the
    test continues to pass.

commit 8cab792563302cd2863868ce209d63dc82eeebf0
Author: Kenneth Knowles <k...@google.com>
Date:   2016-06-14T15:05:04Z

    Revert GBK-based PAssert

    This changed neglected the use of counters by the Dataflow runner,
    which is used to prevent tests for spuriously passing when a
    PCollection is empty. Obvious fixes for that revealed probable bugs
    in the in-process and Spark runner, as well as tests that happen to
    work with PAssert but are actually unsupported. A proper long-term
    fix is underway to address all of the above. In the meantime, this
    commit rolls back the changes to PAssert.

commit ecf6ab8097e4ce9a69463578dd74841febc7d84d
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-14T20:18:41Z

    Add DoFnTester#peekOutputValuesInWindow

    This permits DoFns that interact with windowing to test the
    windowed, rather than overall output.

commit f1b43b9e18da9fb9ca839c985af93cc491802e31
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-14T20:39:59Z

    Use TimestampedValue in DoFnTester

    This removes the duplicate OutputElementWithTimestamp data
    structure.

commit 11d78a4c1791c1dfd88f0ac348c9c07cd48cafc8
Author: Ian Zhou <ianz...@google.com>
Date:   2016-06-09T21:17:14Z

    Modified range tracker to use first response seen as start key

commit ec6d88a787dfdab064bceb70d48b2ce1c5bfa9bb
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-14T01:34:49Z

    Reuse UnboundedReaders in the InProcessRunner

    Reuse up to a point, and then discard the reader to exercise resume
    from checkpoint.

commit d2ceaf5e5a778fad18472ab0d7c02a14259015d7
Author: Scott Wegner <sweg...@google.com>
Date:   2016-06-14T16:00:49Z

    Update DataflowPipelineRunner worker container version

commit 0065851b96644f2c75b8e51c95ebf0e79c5865f5
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-14T16:27:55Z

    Rename DoFnTester#processBatch to processBundle

    DoFns process elements in bundles, not batches.

commit 90bb20ee6738c57bc25f47e2d80690fb721b562e
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-14T22:49:34Z

    Explicitly set the Runner in TestFlinkPipelineRunner

    This ensures that the created PipelineOptions are valid if the
    DirectRunner is not on the classpath.

commit 45e57e0612ae692418e07d9c4483321f040cb4a7
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-15T00:51:48Z

    Remove DoFnRunner from GroupAlsoByWindowsProperties

    DoFnRunner is a runner implementation detail, and core SDK code
    should instead use DoFnTester.

commit 99654ca4bed6758d7128d0f0ad376e8b479d4eba
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-15T00:52:49Z

    Remove the DirectPipelineRunner from the Core SDK

commit d5e3dfaa864744ec9a011c51707d15f1ab68a734
Author: Scott Wegner <sweg...@google.com>
Date:   2016-06-15T16:51:59Z

    Fix NullPointerException in AfterWatermark display data

    Window transforms register display data for the associated trigger
    function by calling its .toString() method. The AfterWatermark
    trigger .toString() method was not properly handling cases where
    there is no late firings registered.

commit 340fe3ebcfef0b57b163483d7d7243ad5456ae72
Author: Scott Wegner <sweg...@google.com>
Date:   2016-06-15T17:17:01Z

    Package javadoc for org.apache.beam.sdk.transforms.display

commit 6ada1a635382fcddc42a7580e74e755839f7172e
Author: Thomas Groh <tg...@google.com>
Date:   2016-06-15T19:01:56Z

    Run NeedsRunner tests in Runner Core on the DirectRunner

    This ensures that all runner tests in runners/core-java are executed
    in the standard maven build.

commit e90a1b9d74cbc06d7818bae8dfe2af81acd73222
Author: Kenneth Knowles <k...@google.com>
Date:   2016-06-08T22:07:52Z

    Roll-forwards: Base PAssert on GBK instead of side inputs

    Previously PAssert - hence all RunnableOnService/NeedsRunner tests -
    required side input support. This created a very steep on ramp for
    new runners. GroupByKey is a bit more fundamental and most backends
    will be able to group by key in the global window very quickly. So
    switching the primitive used to gather all the contents of a
    PCollection for assertions should make it a bit easier to get early
    feedback during runner development.

commit 0a7246d268969cb1b7f46149e38361802c95e70a
Author: Scott Wegner <sweg...@google.com>
Date:   2016-06-13T18:05:52Z

    Improve BigQueryIO validation for streaming WriteDisposition

----

---
If your project is set up for it, you can reply to this email and have
your reply appear on GitHub as well. If your project does not have this
feature enabled and wishes so, or if the feature is enabled but not
working, please contact infrastructure at infrastruct...@apache.org or
file a JIRA ticket with INFRA.
---