GitHub user manuzhang opened a pull request: https://github.com/apache/beam/pull/3636
[BEAM-79] merge gearpump-runner into master

Follow this checklist to help us incorporate your contribution quickly and easily:

- [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
- [ ] Each commit in the pull request should have a meaningful subject line and body.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
- [ ] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
---

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/beam gearpump-runner

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3636.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3636

----

commit 9478f4117de3a2d0ea40614ed4cb801918610724
Author: manuzhang <owenzhang1...@gmail.com>
Date: 2016-03-15T08:15:16Z

    [BEAM-79] add Gearpump runner

commit 02b2248a5b3c8a2c064547d7380bebc97f849bf1
Author: Kenneth Knowles <k...@google.com>
Date: 2016-07-20T16:07:48Z

    This closes #323

commit 2a0ba61e8507e1539115b583749a78f14d577bd8
Author: Kenneth Knowles <k...@google.com>
Date: 2016-08-25T18:36:45Z

    Merge branch master into gearpump-runner

commit 1672b5483e029292816397248dc6fe63bf51f4af
Author: manuzhang <owenzhang1...@gmail.com>
Date: 2016-07-23T06:10:15Z

    move integration tests to profile

commit 276a2e106aa1a5736666fc2eb2426b640f63cf68
Author: manuzhang <owenzhang1...@gmail.com>
Date: 2016-07-28T08:30:13Z

    add package-info.java

commit 40be715a696bb1218b209f7ad9a979b7e5d088d3
Author: Kenneth Knowles <k...@google.com>
Date: 2016-08-10T17:26:57Z

    Update Gearpump runner version to 0.3.0-incubating

commit bc1b354949416db3b52c4f37c66968bdb86f0813
Author: manuzhang <owenzhang1...@gmail.com>
Date: 2016-08-11T23:22:00Z

    Rename DoFn to OldDoFn in Gearpump runner

commit 091a15a07c7625ae3009cefaecece3a29a34c109
Author: Kenneth Knowles <k...@google.com>
Date: 2016-08-25T18:40:03Z

    This closess #750

commit fb74c936ed92c7a8548c338cc03957794fc60902
Author: Dan Halperin <dhalp...@google.com>
Date: 2016-08-26T23:25:58Z

    gearpump: switch to stable version

    They have apparently deleted the SNAPSHOT jar and now builds are failing.

commit bf0a2edae11416a3cbddeaff2c0a70adc272c5fe
Author: Dan Halperin <dhalp...@google.com>
Date: 2016-08-27T00:46:42Z

    Closes #895

commit 89921c41ca9d4c333af45efa32359a631214c1df
Author: bchambers <bchamb...@google.com>
Date: 2016-07-29T16:41:17Z

    Remove Counter and associated code

    Aggregator is the model level concept. Counter was specific to the Dataflow Runner, and is now not needed as part of Beam.

commit 7fc2c6848f002ac8b2ccbe35e6b5a576777a7af9
Author: Mark Liu <mark...@markliu-macbookpro.roam.corp.google.com>
Date: 2016-08-03T00:25:14Z

    [BEAM-495] Create General Verifier for File Checksum

commit b47549e4893a6d487c00ea0ba02619168a3f19f3
Author: Mark Liu <mark...@markliu-macbookpro.roam.corp.google.com>
Date: 2016-08-03T00:47:46Z

    Add output checksum to WordCountITOptions

commit 58cd781c82fa728f34f5ab0641f8f9b6edcf449c
Author: Ian Zhou <ianz...@google.com>
Date: 2016-08-05T22:31:59Z

    Added unit tests and error handling in removeTemporaryTables

commit 36a9aa232ea56de449930194788becce585212ef
Author: Thomas Groh <tg...@google.com>
Date: 2016-08-09T02:09:58Z

    Improve Write Error Message

    If provided with an Unbounded PCollection, Write will fail due to restriction of calling finalize only once. This error message fails in a deep stack trace based on it not being possible to apply a GroupByKey. Instead, throw immediately on application with a specific error message.

commit d5641553cebb02f08ca7c1fe667948d39cb3962c
Author: Thomas Groh <tg...@google.com>
Date: 2016-08-09T17:47:09Z

    Remove Streaming Write Overrides in DataflowRunner

    These writes should be forbidden based on the boundedness of the input PCollection. As Write explicitly forbids the application of the transform to an Unbounded PCollection, this will be equivalent in most cases; In cases where the input PCollection is Bounded, due to an UnboundedReadFromBoundedSource, the write will function as expected and does not need to be forbidden.

commit 011bea9a83a828e0d8c6518ab83aa5cc4f75e146
Author: David Rieber <drie...@google.com>
Date: 2016-08-09T21:05:25Z

    Do not add DataDisks to windmill service jobs.

commit 0dfb8ff55d6f80264222fde4501ea3050d2e3911
Author: gaurav gupta <gaugu...@cisco.com>
Date: 2016-08-10T23:43:03Z

    Made byteArrayCoder final static

commit b9f826366823003940805e6469b10df8819b0977
Author: Dan Halperin <dhalp...@google.com>
Date: 2016-08-11T00:58:09Z

    CompressedSource: CompressedReader is never splittable

    The only way it's safe to split a compressed file is if the file is not compressed. This can only happen when the source itself is splittable, and that in turn will result in the inner source's reader being returned. A CompressedReader will only be created in the event that the file is NOT splittable. So remove all the logic handling splittable compressed readers, and instead go with the logic when we know/assume the file is compressed.

    * TextIO: test compression with larger files
      It is important for correctness that we test with large files because otherwise the compressed file may be larger than the uncompressed file, which could mask bugs
    * TextIOTest: flesh out more
    * TextIOTest: add large uncompressed file

commit 1d86335314685926d3eb0f9765d615a77cee75e6
Author: Thomas Groh <tg...@google.com>
Date: 2016-08-11T16:16:55Z

    Remove timeout in DirectRunnerTest

    If the test hangs due to bugs, the infrastructure should kill it.

commit 37ce2a3e75cb96a4b3fdcd4938fb7fda95122724
Author: Mark Liu <mark...@markliu-macbookpro.roam.corp.google.com>
Date: 2016-08-11T18:26:28Z

    More unit test and code style fix

commit 046e36eaa41659ae43866bc6fbce4f122889f286
Author: Mark Liu <mark...@markliu-macbookpro.roam.corp.google.com>
Date: 2016-08-11T18:55:17Z

    Using IOChannelUtils to resolve file path

commit d056f4661da2cc399cab44c6604eaa61d1dfd178
Author: Thomas Groh <tg...@google.com>
Date: 2016-07-14T21:51:02Z

    Add DoFn @Setup and @Teardown

    Methods annotated with these annotations are used to perform expensive setup work and clean up a DoFn after another method throws an exception or the DoFn is discarded.

commit b80d96748dcb71f93697126489c924020ebbd4a9
Author: Thomas Groh <tg...@google.com>
Date: 2016-07-15T18:27:00Z

    Add TransformEvaluatorFactory#cleanup

    This cleans up any state stored within the Transform Evaluator Factory.

commit 77c90d00ae715795e73efec8f8e85e3917cf8d80
Author: Thomas Groh <tg...@google.com>
Date: 2016-07-19T18:03:15Z

    Replace CloningThreadLocal with DoFnLifecycleManager

    This is a more focused interface that interacts with a DoFn before it is available for use and after it has completed and the reference is lost. It is required to properly support setup and teardown, as the fields in a ThreadLocal cannot all be cleaned up without additional tracking.

    Part of BEAM-452.

commit 39f763e16182e33019a1805d6210549934998856
Author: Pei He <pe...@google.com>
Date: 2016-08-01T20:41:59Z

    Remove DataflowPipelineJob from examples

commit 6603307062ec99639d3e3e05aebc0d1ea32ad411
Author: Thomas Groh <tg...@google.com>
Date: 2016-08-11T17:45:43Z

    Move ParDo Lifecycle tests to their own file

    These tests are not yet functional in all runners, and this makes them easier to ignore.

commit d99a652f96f12bcc235caa038ffa741906336b1f
Author: Maximilian Michels <m...@apache.org>
Date: 2016-08-12T15:51:02Z

    [flink] add missing maven config to example pom

commit 424c4c492965e5a93b1c020c8d52805e3a9a9088
Author: mariusz89016 <mariusz89...@gmail.com>
Date: 2016-08-13T22:35:19Z

    [BEAM-432] Corrected BigQueryIO javadoc

commit d6cf4f2a3cda9ff4ff2da105c08be8101c58e6f1
Author: Daniel Halperin <dhalp...@users.noreply.github.com>
Date: 2016-08-15T22:21:15Z

    Exclude ParDoTest from Dataflow @RunnableOnService

    Until we implement it for Dataflow runner.

----