[jira] [Commented] (BEAM-134) Investigate use of AutoValue

2016-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250901#comment-15250901
 ] 

ASF GitHub Bot commented on BEAM-134:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/109


> Investigate use of AutoValue
> 
>
> Key: BEAM-134
> URL: https://issues.apache.org/jira/browse/BEAM-134
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
> Attachments: 
> 0001-Mark-classes-which-might-benefit-from-AutoValue.patch
>
>
> The initial PR for [BEAM-118] added a dependency on 
> [AutoValue|https://github.com/google/auto/tree/master/value#how-to-use-autovalue]
>  to auto-implement equality semantics for a new POJO. We decided to remove 
> the dependency because the cost of adding it for this one feature may 
> not be worth the value.
> However, if we could use AutoValue for all of our POJOs, it might be worth it. 
> The proposal here is to follow up with an investigation into whether we would 
> gain significant value by porting our code to use AutoValue instead of 
> hand-written POJOs.
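
For context, the hand-written boilerplate under discussion looks roughly like the sketch below. `WindowBounds` and its fields are hypothetical names chosen for illustration, not classes from the Beam SDK. With AutoValue, an abstract class annotated with `@AutoValue` declares only the accessors, and the `equals`, `hashCode`, and `toString` shown here are generated.

```java
import java.util.Objects;

// Hypothetical hand-written value class; names are illustrative only.
final class WindowBounds {
  private final long startMillis;
  private final long endMillis;

  WindowBounds(long startMillis, long endMillis) {
    this.startMillis = startMillis;
    this.endMillis = endMillis;
  }

  long startMillis() {
    return startMillis;
  }

  long endMillis() {
    return endMillis;
  }

  // Every field has to be repeated in equals, hashCode, and toString.
  // Forgetting one when a field is added is the usual source of bugs.
  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof WindowBounds)) {
      return false;
    }
    WindowBounds that = (WindowBounds) other;
    return startMillis == that.startMillis && endMillis == that.endMillis;
  }

  @Override
  public int hashCode() {
    return Objects.hash(startMillis, endMillis);
  }

  @Override
  public String toString() {
    return "WindowBounds{startMillis=" + startMillis + ", endMillis=" + endMillis + "}";
  }
}
```

Whether the saved boilerplate justifies the extra dependency is the question the investigation is meant to answer.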



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-157) CombineTest.testGlobalCombineWithDefaultsAndTriggers is broken

2016-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250961#comment-15250961
 ] 

ASF GitHub Bot commented on BEAM-157:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/218


> CombineTest.testGlobalCombineWithDefaultsAndTriggers is broken
> --
>
> Key: BEAM-157
> URL: https://issues.apache.org/jira/browse/BEAM-157
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Kenneth Knowles
>Priority: Critical
>
> The test is not run because `p.run()` is not called. When `p.run()` is added, 
> the test fails.
> Kenn, I suspect this is because it's using triggers in batch, which obviously 
> is not guaranteed to work.
> Please investigate!
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/CombineTest.java#L373
> Failed tests: 
>   CombineTest.testGlobalCombineWithDefaultsAndTriggers:391 
> Expected: iterable over ["2: true", "1: false"] in any order
>  but: No item matches: "1: false" in ["2: true"]
> Tests run: 30, Failures: 1, Errors: 0, Skipped: 0





[jira] [Commented] (BEAM-215) Create should be implemented as a BoundedSource

2016-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250770#comment-15250770
 ] 

ASF GitHub Bot commented on BEAM-215:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/214


> Create should be implemented as a BoundedSource
> ---
>
> Key: BEAM-215
> URL: https://issues.apache.org/jira/browse/BEAM-215
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Davor Bonaci
>
> Read.Bounded is the primitive to create a Bounded PCollection, and Create can 
> be implemented as a BoundedSource. The primitive implementation of Create can 
> be replaced with a source, which allows it to also automatically get any 
> benefits provided by the Runner's exercising of the Source API
> See also BEAM-115





[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250768#comment-15250768
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/220

[BEAM-22] Allow InProcess Evaluators to check Side Input completion

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This checks to ensure that the PCollectionView in the SideInputWindow
for the provided window either has elements available or is empty.

Schedule a future to ensure that the SideInputWindows are appropriately
filled with an empty iterable after retrieving the element.

This allows the ParDoEvaluator to avoid processing elements
that cannot currently be completed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_side_input_is_ready

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #220


commit ffc47d7b1367659fba36a17e477a510adb1d54f7
Author: Thomas Groh 
Date:   2016-04-18T23:16:48Z

Allow InProcess Evaluators to check Side Input completion

This checks to ensure that the PCollectionView in the SideInputWindow
for the provided window either has elements available or is empty.

Schedule a future to ensure that the SideInputWindows are appropriately
filled with an empty iterable after retrieving the element.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, for example by 
> adding the ability to run over unbounded PCollections, and to better resemble the 
> execution model of a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426





[jira] [Commented] (BEAM-204) truncateStackTrace fails with empty stack trace

2016-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250831#comment-15250831
 ] 

ASF GitHub Bot commented on BEAM-204:
-

GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/221

[BEAM-204] Protect against empty stack traces

R: @lukecwik 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam beam-204

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/221.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #221


commit 4e5e3b8e7eefc35cd1b46ad96040c8baa94cac01
Author: Mark Shields 
Date:   2016-04-20T22:18:17Z

Protect against empty stack traces




> truncateStackTrace fails with empty stack trace
> ---
>
> Key: BEAM-204
> URL: https://issues.apache.org/jira/browse/BEAM-204
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Malo Denielou
>Assignee: Mark Shields
>Priority: Minor
>
> From a user job: 
> exception:
> "java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.truncateStackTrace(UserCodeException.java:72)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.<init>(UserCodeException.java:52)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.wrap(UserCodeException.java:35)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.wrapIf(UserCodeException.java:40)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.wrapUserCodeException(DoFnRunnerBase.java:369)
>   at 
> com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:51)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:191)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:288)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:284)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext$1.outputWindowedValue(DoFnRunnerBase.java:508)
>   at 
> com.google.cloud.dataflow.sdk.util.AssignWindowsDoFn.processElement(AssignWindowsDoFn.java:65)
>   at 
> com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:191)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.sideOutputWindowedValue(DoFnRunnerBase.java:315)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.sideOutput(DoFnRunnerBase.java:471)
>   at 
> com.google.cloud.dataflow.sdk.transforms.Partition$PartitionDoFn.processElement(Partition.java:165)
> Looking at the code, it seems that if the user code throwable has an empty 
> stacktrace we would fail.
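
The failure mode can be reproduced with a small sketch. `StackTraceTruncation` below is a hypothetical helper written for illustration and is not the actual `UserCodeException` code; it assumes truncation means dropping the frames that the thrown exception shares with the current call stack. A throwable can legitimately carry an empty trace, for example after `setStackTrace(new StackTraceElement[0])`.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the failure mode; this is not the
// actual UserCodeException implementation.
final class StackTraceTruncation {

  // Drops the frames that the thrown trace shares with the current call
  // stack, comparing from the outermost frame inward.
  static StackTraceElement[] truncate(StackTraceElement[] thrown, StackTraceElement[] current) {
    // Return early for empty traces. Code that instead reads
    // thrown[thrown.length - 1] unconditionally fails with
    // ArrayIndexOutOfBoundsException: -1 when the trace is empty.
    if (thrown.length == 0 || current.length == 0) {
      return thrown;
    }
    int t = thrown.length - 1;
    int c = current.length - 1;
    while (t >= 0 && c >= 0 && thrown[t].equals(current[c])) {
      t--;
      c--;
    }
    // Keep only the frames that are not shared with the caller.
    return Arrays.copyOfRange(thrown, 0, t + 1);
  }

  public static void main(String[] args) {
    RuntimeException userError = new RuntimeException("user code failed");
    // Simulate a throwable whose stack trace is empty.
    userError.setStackTrace(new StackTraceElement[0]);
    StackTraceElement[] result =
        truncate(userError.getStackTrace(), Thread.currentThread().getStackTrace());
    System.out.println(result.length);
  }
}
```

The early return keeps the wrapper from masking the user's original exception with an unrelated indexing error.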





[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246583#comment-15246583
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh reopened a pull request:

https://github.com/apache/incubator-beam/pull/202

[BEAM-22] Track Synchronized Processing Time Holds per-element

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This removes the concept of a Bundle in watermark tracking.

Bundles are still used to start & finish work, but are not referenced within
the actual Watermark objects.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
ippr_synchronized_processing_holds_per_element

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #202


commit beb1aab000d9f0edd1d60fe42b2c0ffab0375634
Author: Thomas Groh 
Date:   2016-04-18T20:33:26Z

Track Synchronized Processing Time Holds per-element

This removes the concept of a Bundle in watermark tracking. Bundles are
still used to start/finish work, but watermarks are held on a
per-element basis, which allows partial completion of input.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, for example by 
> adding the ability to run over unbounded PCollections, and to better resemble the 
> execution model of a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426





[jira] [Commented] (BEAM-196) Pipeline options must be available Context in DoFn.startBundle

2016-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247333#comment-15247333
 ] 

ASF GitHub Bot commented on BEAM-196:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/200


> Pipeline options must be available Context in DoFn.startBundle
> --
>
> Key: BEAM-196
> URL: https://issues.apache.org/jira/browse/BEAM-196
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Mark Shields
>Assignee: Maximilian Michels
>
> Our (not yet merged) Java Pubsub implementation has code like this in a DoFn:
> @Override
> public void startBundle(Context c) throws Exception {
>   Preconditions.checkState(pubsubClient == null);
>   pubsubClient = PubsubClient.newClient(transportType,
>   timestampLabel, idLabel, 
> c.getPipelineOptions().as(PubsubOptions.class));
>   super.startBundle(c);
> }
> This fails with NPE since the pipeline options are not conveyed via the 
> context.





[jira] [Commented] (BEAM-207) Flink test flake in ReadSourceStreamingITCase

2016-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247359#comment-15247359
 ] 

ASF GitHub Bot commented on BEAM-207:
-

GitHub user mxm opened a pull request:

https://github.com/apache/incubator-beam/pull/209

[BEAM-207] Flink test flake in ReadSourceStreamingITCase

The `configure(..)` life cycle method is only called on the master but not 
on the worker nodes. This may lead to an incorrect initialization of the 
`Reader` because the `PipelineOptions` haven't been initialized.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/incubator-beam BEAM-207

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/209.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #209


commit a95e67c482d3378cf472e75a47275cedbf70de41
Author: Maximilian Michels 
Date:   2016-04-19T07:20:30Z

[BEAM-207] Flink test flake in ReadSourceStreamingITCase

The configure(..) life cycle method is only called on the master but not
on the worker nodes. This may lead to an incorrect initialization of the
Reader because the PipelineOptions haven't been initialized.
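
One way this class of bug can show up is sketched below, under stated assumptions: the object is shipped to workers via Java serialization, and the state set in `configure()` lives in a transient field. `SourceWrapper` and its members are hypothetical names, not the Flink runner's actual classes, and a plain `String` stands in for `PipelineOptions`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical source wrapper; not the Flink runner's actual class.
final class SourceWrapper implements Serializable {
  private static final long serialVersionUID = 1L;

  // The serialized form of the options travels with the object.
  private final String serializedOptions;
  // Set only by configure(); transient, so it is not serialized.
  private transient String options;

  SourceWrapper(String serializedOptions) {
    this.serializedOptions = serializedOptions;
  }

  // Stand-in for a life cycle hook that runs only on the master.
  void configure() {
    options = serializedOptions;
  }

  // Unsafe on a worker: returns null if configure() never ran there.
  String optionsUnsafe() {
    return options;
  }

  // Safe everywhere: initializes lazily from the serialized field.
  String optionsSafe() {
    if (options == null) {
      options = serializedOptions;
    }
    return options;
  }

  // Simulates shipping the object from the master to a worker.
  static SourceWrapper shipToWorker(SourceWrapper onMaster) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(onMaster);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SourceWrapper) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("could not ship object to worker", e);
    }
  }
}
```

In this sketch the worker-side copy sees `null` from `optionsUnsafe()` even though `configure()` ran on the master, while `optionsSafe()` does not depend on where the hook was called.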




> Flink test flake in ReadSourceStreamingITCase
> -
>
> Key: BEAM-207
> URL: https://issues.apache.org/jira/browse/BEAM-207
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, testing
>Reporter: Daniel Halperin
>Assignee: Maximilian Michels
>
> Log from Travis: 
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/124066205/log.txt
> Snippet:
> {noformat}
> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.792 sec - 
> in org.apache.beam.runners.flink.SideInputITCase
> Running org.apache.beam.runners.flink.ReadSourceStreamingITCase
> Pipeline execution failed
> java.lang.RuntimeException: Pipeline execution failed
>   at 
> org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:119)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:51)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:182)
>   at 
> org.apache.beam.runners.flink.ReadSourceStreamingITCase.runProgram(ReadSourceStreamingITCase.java:70)
>   at 
> org.apache.beam.runners.flink.ReadSourceStreamingITCase.testProgram(ReadSourceStreamingITCase.java:53)
>   at 
> org.apache.flink.streaming.util.StreamingProgramTestBase.testJob(StreamingProgramTestBase.java:85)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:483)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: 

[jira] [Commented] (BEAM-267) Enable Checkstyle check in Spark runner

2016-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275486#comment-15275486
 ] 

ASF GitHub Bot commented on BEAM-267:
-

Github user jbonofre closed the pull request at:

https://github.com/apache/incubator-beam/pull/298


> Enable Checkstyle check in Spark runner
> --
>
> Key: BEAM-267
> URL: https://issues.apache.org/jira/browse/BEAM-267
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>






[jira] [Commented] (BEAM-305) In Spark runner tests - When using Create.of use its #withCoder method instead of the created PCollection's #setCoder

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298857#comment-15298857
 ] 

ASF GitHub Bot commented on BEAM-305:
-

GitHub user ilganeli opened a pull request:

https://github.com/apache/incubator-beam/pull/386

[BEAM-305] Replace usages of PCollection.setCoder with 
Create.of().withCoder in Spark Runner

* Replaced all usages of `PCollection.setCoder` with the recommended 
`withCoder` in the Spark Runner.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilganeli/incubator-beam BEAM-305

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/386.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #386


commit 8113273cc21be3690356711d77a5c161a709997f
Author: Ilya Ganelin 
Date:   2016-05-24T20:37:04Z

Replaced all usages of setCoder with withCoder




> In Spark runner tests - When using Create.of use its #withCoder method 
> instead of the created PCollection's #setCoder
> --
>
> Key: BEAM-305
> URL: https://issues.apache.org/jira/browse/BEAM-305
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Amit Sela
>Priority: Minor
>
> See [~kenn] comment here:
> https://github.com/apache/incubator-beam/pull/179#discussion_r60171526





[jira] [Commented] (BEAM-306) Make java-only PubsubIO work in InProcessRunner

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300312#comment-15300312
 ] 

ASF GitHub Bot commented on BEAM-306:
-

GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/388

[BEAM-306] Serialize/Deserialize checkpoints

R: @dhalperi @tgroh 

The PubsubUnboundedSource implementation has an assertion to confirm the 
checkpoint from which a fresh reader is instantiated has come via 
deserialization from an earlier finalized checkpoint. The in-process runner was 
reusing the checkpoint object directly, so the assertion failed. This adds the 
serialize/deserialize to the in-process runner, which I believe is the best 
solution since other UnboundedSources may be caught by the same issue. It also 
forces the user to exercise their checkpoint coder.
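
The idea can be sketched as a round trip through bytes, so that a reader never receives the very same checkpoint object that was finalized. This is a minimal illustration, not the in-process runner's code: `CheckpointRoundTrip`, `Checkpoint`, and `pendingAckIds` are hypothetical names, and Java serialization stands in for the source's checkpoint coder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical types; Java serialization stands in for a checkpoint coder.
final class CheckpointRoundTrip {

  static final class Checkpoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<String> pendingAckIds;

    Checkpoint(List<String> pendingAckIds) {
      this.pendingAckIds = new ArrayList<>(pendingAckIds);
    }
  }

  // Encodes the value to bytes and decodes a fresh copy, so the caller
  // never reuses the original object directly.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("checkpoint round trip failed", e);
    }
  }
}
```

A runner that always round-trips checkpoints this way also surfaces encoding bugs locally, before a pipeline reaches a distributed runner.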


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam pubsub-inproc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/388.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #388


commit 4f4b526495a887bbc1c9b782850e9785a86bddc9
Author: Mark Shields 
Date:   2016-05-25T16:13:07Z

Serialize/Deserialize checkpoints




> Make java-only PubsubIO work in InProcessRunner 
> 
>
> Key: BEAM-306
> URL: https://issues.apache.org/jira/browse/BEAM-306
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>






[jira] [Commented] (BEAM-486) Cleanup NOTICE file

2016-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392864#comment-15392864
 ] 

ASF GitHub Bot commented on BEAM-486:
-

GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/727

[BEAM-486] Remove mention of Apache v2.0 LICENSE

R: @jbonofre 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam start-release

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/727.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #727


commit f301b164c0dce8d3db766c3cdc939048c06d5235
Author: Dan Halperin 
Date:   2016-07-25T23:42:23Z

[BEAM-486] Remove mention of Apache v2.0 LICENSE




> Cleanup NOTICE file
> ---
>
> Key: BEAM-486
> URL: https://issues.apache.org/jira/browse/BEAM-486
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201606.mbox/%3ca5f50a0f-f1e1-4391-8188-391187b9e...@classsoftware.com%3E
> - The NOTICE file contains unneeded text (i.e. it mentions the Apache v2.0 licence). 
> There is generally no need to
> mention Apache 2.0 licences in NOTICE [2]





[jira] [Commented] (BEAM-480) Use BigQueryServices abstraction in BigQueryIO

2016-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392918#comment-15392918
 ] 

ASF GitHub Bot commented on BEAM-480:
-

GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/729

[BEAM-480] Move createTable() from BigQueryTableInserter to BigQueryServices




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam bq-services

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/729.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #729


commit fffb3d98f16132771c9a5e5de7699f7cacdb84f7
Author: Pei He 
Date:   2016-07-25T18:02:10Z

[BEAM-480] Move createTable() to BigQueryServices




> Use BigQueryServices abstraction in BigQueryIO
> --
>
> Key: BEAM-480
> URL: https://issues.apache.org/jira/browse/BEAM-480
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>
> There is legacy code that sends requests to BigQuery directly.
> It should be moved to use BigQueryServices.





[jira] [Commented] (BEAM-386) Dataflow runner to support Read.Bounded in streaming mode.

2016-07-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15394283#comment-15394283
 ] 

ASF GitHub Bot commented on BEAM-386:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/704


> Dataflow runner to support Read.Bounded in streaming mode.
> --
>
> Key: BEAM-386
> URL: https://issues.apache.org/jira/browse/BEAM-386
> Project: Beam
>  Issue Type: New Feature
>Reporter: Pei He
>Assignee: Pei He
> Fix For: 0.2.0-incubating
>
>
> UnboundedReadFromBoundedSource is done.
> Make Dataflow runner use it.





[jira] [Commented] (BEAM-491) Reuse context and disable UI in the Spark runner tests

2016-07-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15394532#comment-15394532
 ] 

ASF GitHub Bot commented on BEAM-491:
-

GitHub user amitsela opened a pull request:

https://github.com/apache/incubator-beam/pull/736

[BEAM-491]-Reuse context and disable UI in the Spark runner tests

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Rename dataflow to beam.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amitsela/incubator-beam BEAM-491

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/736.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #736


commit e62bfdb16dddfd19fc881e8a78df956bd0ef0ff1
Author: Sela 
Date:   2016-07-26T20:47:06Z

Reuse context and disable UI.

Rename dataflow to beam.




> Reuse context and disable UI in the Spark runner tests 
> ---
>
> Key: BEAM-491
> URL: https://issues.apache.org/jira/browse/BEAM-491
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.1.0-incubating
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.2.0-incubating
>
>
> Currently, only RunnableOnService tests reuse the context for some reason, 
> although surefire is using 1 fork (no reuse). I don't see a reason why not to 
> reuse in all surefire executions.
> UI could be disabled for test executions as well. This could also help with 
> Jenkins ports issues we've been experiencing. 





[jira] [Commented] (BEAM-491) Reuse context and disable UI in the Spark runner tests

2016-07-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15394775#comment-15394775
 ] 

ASF GitHub Bot commented on BEAM-491:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/736


> Reuse context and disable UI in the Spark runner tests 
> ---
>
> Key: BEAM-491
> URL: https://issues.apache.org/jira/browse/BEAM-491
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.1.0-incubating
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.2.0-incubating
>
>
> Currently, only RunnableOnService tests reuse the context for some reason, 
> although surefire is using 1 fork (no reuse). I don't see a reason not to 
> reuse it in all surefire executions.
> The UI could be disabled for test executions as well. This could also help with 
> the Jenkins port issues we've been experiencing.





[jira] [Commented] (BEAM-488) Remove KEYS file

2016-07-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393300#comment-15393300
 ] 

ASF GitHub Bot commented on BEAM-488:
-

GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/732

[BEAM-488] Remove KEYS file

Per discussion, linked in JIRA:

> Bundling PGP keys inside a package is worse than worthless – an
> attacker can just bundle spoofed keys with a bogus distro! Keys need
> to be made available from a highly reliable, separate server: Download
> the main package from a mirror, get PGP keys from apache.org,
> pgp.mit.edu, etc. and verify.
>
> The KEYS file within the Beam source tree should be deleted.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam beam-488

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/732.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #732


commit a94f71b339120b2e10ff174d50bf5c9d8fe23023
Author: Dan Halperin 
Date:   2016-07-26T06:24:10Z

[BEAM-488] Remove KEYS file

Per discussion, linked in JIRA:

> Bundling PGP keys inside a package is worse than worthless – an
> attacker can just bundle spoofed keys with a bogus distro! Keys need
> to be made available from a highly reliable, separate server: Download
> the main package from a mirror, get PGP keys from apache.org,
> pgp.mit.edu, etc. and verify.
>
> The KEYS file within the Beam source tree should be deleted.




> Remove KEYS file
> 
>
> Key: BEAM-488
> URL: https://issues.apache.org/jira/browse/BEAM-488
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201606.mbox/%3CCAAS6=7hVLcw6060Un7sXxk+WLLh08DFOSWktC0Aam4F=dye...@mail.gmail.com%3E
> > Bundling PGP keys inside a package is worse than worthless -- an attacker
> > can just bundle spoofed keys with a bogus distro!  Keys need to be made
> > available from a highly reliable, separate server: Download the main
> > package from a mirror, get PGP keys from apache.org, pgp.mit.edu, etc.
> > and verify.
> >
> > The KEYS file within the Beam source tree should be deleted.
>  





[jira] [Commented] (BEAM-339) Archetype project version shouldn't be coupled to Beam version

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15377149#comment-15377149
 ] 

ASF GitHub Bot commented on BEAM-339:
-

GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/654

[BEAM-339] Archetype project version shouldn't be coupled to Beam version

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Follow-up from [a previous archetype 
PR](https://github.com/apache/incubator-beam/pull/444/files/ba55042275bd9b525ee8716e4e1007b7924a647f#r66819150),
 the maven artifact version for the generated project should not be tied to the 
version of Beam. The generated module is a new user project, so the version 
should represent "initial version", i.e. 0.1.

This PR drops the -SNAPSHOT suffix from the version and fixes the version 
to 0.1 in our tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam archetype-version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/654.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #654


commit 5b317b852b1fac9f6320e1a074238e6e7da77b79
Author: Scott Wegner 
Date:   2016-07-13T22:59:16Z

Archetype generated projects shouldn't have SNAPSHOT version




> Archetype project version shouldn't be coupled to Beam version
> --
>
> Key: BEAM-339
> URL: https://issues.apache.org/jira/browse/BEAM-339
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> tl;dr: The maven-archetype project has a version reference of 0.1-SNAPSHOT. 
> This is for the user project and shouldn't be tied to Beam versions.
>  
> In the maven-archetype projects, we have a test which injects property values 
> and verifies that the generated project matches the expected output. One of the 
> injected properties is "version", which is currently set to "0.1-snapshot" to 
> match the Beam project versions. The version property represents the version 
> of the user project being created and thus shouldn't be tied to the Beam 
> versioning. We should change it so that the intended usage is clearer 
> and test that the version isn't being set from the Beam version.
> See: 
> https://github.com/apache/incubator-beam/pull/444/files/ba55042275bd9b525ee8716e4e1007b7924a647f#r66819150





[jira] [Commented] (BEAM-242) Enable Checkstyle check and Javadoc build for the Flink Runner

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15377203#comment-15377203
 ] 

ASF GitHub Bot commented on BEAM-242:
-

Github user jbonofre closed the pull request at:

https://github.com/apache/incubator-beam/pull/372


> Enable Checkstyle check and Javadoc build for the Flink Runner 
> ---
>
> Key: BEAM-242
> URL: https://issues.apache.org/jira/browse/BEAM-242
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>
> We don't have a Checkstyle check in place for the Flink Runner. I would like 
> to use the SDK's checkstyle rules.
> We could also think about a unified Checkstyle for all Runners.





[jira] [Commented] (BEAM-286) Reorganize flink runner directories

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15377200#comment-15377200
 ] 

ASF GitHub Bot commented on BEAM-286:
-

Github user jbonofre closed the pull request at:

https://github.com/apache/incubator-beam/pull/348


> Reorganize flink runner directories
> ---
>
> Key: BEAM-286
> URL: https://issues.apache.org/jira/browse/BEAM-286
> Project: Beam
>  Issue Type: Task
>  Components: runner-flink
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.2.0-incubating
>
>
> The flink runner Maven module uses two sub-modules: runner and examples. It's 
> the only one that uses such a layout (compared to the spark, dataflow or 
> inprocess/direct runners).
> I will propose a PR to align the flink runner module with the others, keeping the 
> examples in a sub-directory.





[jira] [Commented] (BEAM-339) Archetype project version shouldn't be coupled to Beam version

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378091#comment-15378091
 ] 

ASF GitHub Bot commented on BEAM-339:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/654


> Archetype project version shouldn't be coupled to Beam version
> --
>
> Key: BEAM-339
> URL: https://issues.apache.org/jira/browse/BEAM-339
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> tl;dr: The maven-archetype project has a version reference of 0.1-SNAPSHOT. 
> This is for the user project and shouldn't be tied to Beam versions.
>  
> In the maven-archetype projects, we have a test which injects property values 
> and verifies that the generated project matches the expected output. One of the 
> injected properties is "version", which is currently set to "0.1-snapshot" to 
> match the Beam project versions. The version property represents the version 
> of the user project being created and thus shouldn't be tied to the Beam 
> versioning. We should change it so that the intended usage is clearer 
> and test that the version isn't being set from the Beam version.
> See: 
> https://github.com/apache/incubator-beam/pull/444/files/ba55042275bd9b525ee8716e4e1007b7924a647f#r66819150





[jira] [Commented] (BEAM-155) Support asserting the contents of windows and panes in PAssert

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378477#comment-15378477
 ] 

ASF GitHub Bot commented on BEAM-155:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/532


> Support asserting the contents of windows and panes in PAssert
> --
>
> Key: BEAM-155
> URL: https://issues.apache.org/jira/browse/BEAM-155
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> This consists of reifying the output windows and panes, and running asserts 
> per-window about the contents of panes.
> This includes aggregated matching and final pane matching, e.g.
> PAssert.that(output).byOnTimePane().hasOutputElements(foo, bar);
> // For discarding mode - could have emitted (say) [spam, eggs], [spam], [], [sausage], []
> PAssert.that(output).byFinalPane().hasOutputElements(spam, eggs, sausage, spam);
> // For accumulating mode without late data
> PAssert.that(output).finalPane().containsInAnyOrder(spam, eggs, sausage, spam);
> // For accumulating mode with late data
> PAssert.that(output).finalPane().containsInAnyOrder(foo, bar).mayAlsoContain(baz, rab);
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#





[jira] [Commented] (BEAM-391) Exceptions in gcsio upload thread causes pipeline to stall

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378717#comment-15378717
 ] 

ASF GitHub Bot commented on BEAM-391:
-

Github user aaltay closed the pull request at:

https://github.com/apache/incubator-beam/pull/617


> Exceptions in gcsio upload thread causes pipeline to stall
> --
>
> Key: BEAM-391
> URL: https://issues.apache.org/jira/browse/BEAM-391
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>
> gcsio got stuck with invalid bucket name
> GcsBufferedWriter._start_upload (gcsio.py) raises an exception if the bucket 
> does not exist. This causes the upload thread to fail silently. It logs the 
> exception, but this does not stop the pipeline or close the receiving end of 
> the multiprocessing.Pipe(). Later, a call to write() blocks at 
> self.conn.send_bytes(). Note that send may block if the buffer is full.
> The upload thread should have a finally clause to close the socket connection 
> or, better, propagate the exception to its parent. This is true for other types 
> of exceptions as well.
> Another small issue is in GcsBufferedWriter.close(): it does not set 
> self.close to True.
> Reproduction: python -m apache_beam.examples.wordcount --output 
> gs://no-such-thing/
> This prints the exception but goes on forever. Ctrl + C breaks the main thread 
> and shows where it got stuck.
> Similarly reproducible on the service.





[jira] [Commented] (BEAM-445) Beam-examples-java build failed through local "mvn install"

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378595#comment-15378595
 ] 

ASF GitHub Bot commented on BEAM-445:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/648


> Beam-examples-java build failed through local "mvn install"
> ---
>
> Key: BEAM-445
> URL: https://issues.apache.org/jira/browse/BEAM-445
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
> Environment: linux
>Reporter: Mark Liu
>Assignee: Daniel Halperin
>Priority: Critical
>
> Build project under beam/examples/java with command "mvn clean install 
> -DskipTests" failed with following error:
> [ERROR] Failed to execute goal on project beam-examples-java: Could not 
> resolve dependencies for project 
> org.apache.beam:beam-examples-java:jar:0.2.0-incubating-SNAPSHOT: Could not 
> transfer artifact 
> io.netty:netty-tcnative-boringssl-static:jar:${os.detected.classifier}:1.1.33.Fork13
>  from/to central (http://repo.maven.apache.org/maven2): Illegal character in 
> path at index 138: 
> http://repo.maven.apache.org/maven2/io/netty/netty-tcnative-boringssl-static/1.1.33.Fork13/netty-tcnative-boringssl-static-1.1.33.Fork13-${os.detected.classifier}.jar
> Reason: can't resolve ${os.detected.classifier} in 
> beam/sdks/java/io/google-cloud-platform/pom file.





[jira] [Commented] (BEAM-37) Run DoFnWithContext without conversion to vanilla DoFn

2016-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-37?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378636#comment-15378636
 ] 

ASF GitHub Bot commented on BEAM-37:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/521


> Run DoFnWithContext without conversion to vanilla DoFn
> --
>
> Key: BEAM-37
> URL: https://issues.apache.org/jira/browse/BEAM-37
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Ben Chambers
>
> DoFnWithContext is an enhanced DoFn where annotations and parameter lists are 
> inspected to determine whether it accesses windowing information, etc.
> Today, each feature of DoFnWithContext requires implementation on DoFn, which 
> precludes the easy addition of features that we don't have designs for in 
> DoFn.





[jira] [Commented] (BEAM-79) Gearpump runner

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390102#comment-15390102
 ] 

ASF GitHub Bot commented on BEAM-79:


GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/714

[BEAM-79] Add Gearpump runner to runners parent pom.xml

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

I think this line got lost in the shuffle. It is why we aren't seeing the 
module in the Jenkins web UI, though the [tests are running and 
passing](https://builds.apache.org/view/Beam/job/beam_PostCommit_RunnableOnService_GearpumpLocal/).

R: @dhalperi 
CC: @manuzhang 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam gearpump-module

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/714.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #714


commit f5e1abb61ea36866cd2d15be6e82c2b0785a4d9c
Author: Kenneth Knowles 
Date:   2016-07-22T19:50:43Z

Add Gearpump runner to runners parent pom.xml




> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Tyler Akidau
>Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.





[jira] [Commented] (BEAM-459) Add BigInteger to TypeDescriptors

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390183#comment-15390183
 ] 

ASF GitHub Bot commented on BEAM-459:
-

GitHub user eljefe6a opened a pull request:

https://github.com/apache/incubator-beam/pull/716

[BEAM-459] Add BigInteger to TypeDescriptors

Added BigInteger to TypeDescriptors class.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/eljefe6a/incubator-beam BigIntegerTypeDesc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/716.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #716


commit 75df931a55dbd3f93d9143574cdad7291d3457aa
Author: Jesse Anderson 
Date:   2016-07-22T20:54:26Z

Added BigInteger to TypeDescriptors class.




> Add BigInteger to TypeDescriptors
> -
>
> Key: BEAM-459
> URL: https://issues.apache.org/jira/browse/BEAM-459
> Project: Beam
>  Issue Type: Bug
>Reporter: Jesse Anderson
>Assignee: Jesse Anderson
>
> The TypeDescriptors class is missing a BigInteger TypeDescriptor method.





[jira] [Commented] (BEAM-354) Modify DatastoreIO to use Datastore v1beta3 API

2016-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392448#comment-15392448
 ] 

ASF GitHub Bot commented on BEAM-354:
-

GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/725

[BEAM-354] Move V1Beta3 DatastoreSource into its own file

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

- No logic changes, just move DataSource to new file and rename to 
V1Beta3Source

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam vikasrk/ds_refactor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/725.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #725


commit 0f7ab24daa18788bef1083242c3a2a7d054c767f
Author: Vikas Kedigehalli 
Date:   2016-07-25T18:10:10Z

Move V1Beta3 DatastoreSource into its own file




> Modify DatastoreIO to use Datastore v1beta3 API
> ---
>
> Key: BEAM-354
> URL: https://issues.apache.org/jira/browse/BEAM-354
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Affects Versions: 0.2.0-incubating
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
> Fix For: 0.2.0-incubating
>
>
> The Datastore v1beta2 API is being deprecated in favor of v1beta3. Hence 
> DatastoreIO needs to be migrated to use the new version. In the process 
> of doing so, this is also a good time to add a level of indirection via a 
> PTransform such that future changes in the Datastore API would not result in 
> changing user/pipeline code. 





[jira] [Commented] (BEAM-362) Move shared runner functionality out of SDK and into runners/core-java

2016-07-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392524#comment-15392524
 ] 

ASF GitHub Bot commented on BEAM-362:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/697


> Move shared runner functionality out of SDK and into runners/core-java
> --
>
> Key: BEAM-362
> URL: https://issues.apache.org/jira/browse/BEAM-362
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>






[jira] [Commented] (BEAM-484) Datastore Source should support Dynamic Splitting

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395774#comment-15395774
 ] 

ASF GitHub Bot commented on BEAM-484:
-

GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/739

[BEAM-484] Datastore Read as a composite PTransform

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
- Datastore Read as a composite PTransform:
Create.of(query) | ParDo.of(split query) | Reshard | ParDo.of(read from query)

- Dynamic rebalancing support comes for free as it happens after the 
sharding of queries (GBK)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam vikasrk/ds_as_ptransform

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #739


commit d8fab9fd351a4e1227623c6469137dec0d7f24e6
Author: Vikas Kedigehalli 
Date:   2016-07-26T16:54:43Z

Datastore Read as a composite PTransform




> Datastore Source should support Dynamic Splitting
> -
>
> Key: BEAM-484
> URL: https://issues.apache.org/jira/browse/BEAM-484
> Project: Beam
>  Issue Type: Improvement
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>






[jira] [Commented] (BEAM-79) Gearpump runner

2016-07-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397251#comment-15397251
 ] 

ASF GitHub Bot commented on BEAM-79:


GitHub user manuzhang opened a pull request:

https://github.com/apache/incubator-beam/pull/750

[BEAM-79] Merge branch 'master' into gearpump_runner

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manuzhang/incubator-beam gearpump_runner_sync

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/750.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #750


commit 34467f92d5a31b47f95b734c737fde0a8277311b
Author: Thomas Groh 
Date:   2016-07-15T17:51:24Z

Make TransformEvaluatorFactory reuse Explicit

Transform Evaluator Factories must be reused for the entire execution of
a Pipeline and must not be reused across pipelines.

Remove EvaluatorKey, and key explicitly by the transform application.

commit ad6ee01825740ee47f25ca036fa5f352375bbe6e
Author: Dan Halperin 
Date:   2016-07-19T06:40:13Z

Checkstyle: enforce package-info.java presence

Ignores tests and examples

commit c4ad11832235eef0b73d299cd02d1a224c130ece
Author: Dan Halperin 
Date:   2016-07-20T16:16:28Z

Closes #692

commit 6d7efe3df8cde590722d94441eca3922a3a67734
Author: Kenneth Knowles 
Date:   2016-07-20T17:30:55Z

This closes #666

commit 5f19e4caf207ff02d226fec2933be2f37ca66b4d
Author: Thomas Groh 
Date:   2016-06-30T17:06:52Z

Add withAllowedLateness with Closing Behavior to Window

This makes the static constructors for withAllowedLateness symmetric to
the PTransform builder methods. It also allows references to
Window#withAllowedLateness(Duration, ClosingBehavior).

commit f547f70e1c9535ff663d124f67c72c7ec2c55e9e
Author: Kenneth Knowles 
Date:   2016-07-20T17:40:13Z

This closes #567

commit 00195d2543eb347cc3669a4ac89e98da0bc4dca4
Author: Thomas Groh 
Date:   2016-06-28T22:44:49Z

Use the ParDo Application to Cache DoFns

A DoFn application is the scope of reuse.

Factor CloningThreadLocal as the top-level class instead of
SerializableCloningThreadLocalCacheLoader, and extract the Fn from the
AppliedPTransform when loading an absent element.

commit 436e4a34ebb222545cb03cb6d39ea4ca2d905254
Author: Kenneth Knowles 
Date:   2016-07-20T17:55:53Z

This closes #554

commit b240525affb205a83054577233f3a4a508fe1c72
Author: Dan Halperin 
Date:   2016-07-18T19:05:02Z

BigQueryIO: move to google-cloud-platform module

* Move package from io to io.gcp.bigquery
* Move from SDK core into GCP-IO module
* Fixup references and import orders
* Separate AvroUtils into generic AvroUtils and BigQueryAvroUtils
* Rewrite a unit test in sdk core to not depend on BigQueryIO
* Fixup Javadoc in SDK core that need not depend on BigQueryIO
* Make utility classes package-private

commit 7ec8781a2e18548a23c882329f0b50f7254202ec
Author: Dan Halperin 
Date:   2016-07-20T20:02:43Z

Closes #681

commit bdb65278873a5010a625dc6a569ba25b17374c06
Author: Kenneth Knowles 
Date:   2016-07-20T04:36:10Z

Add os-maven-plugin to Spark runner

commit 84332ee9716233af928e85c14c534714ab828531
Author: Chandni Singh 
Date:   2016-07-20T00:30:16Z

BEAM-372 verfify if a nested coder consumes bytes equal to encoded bytes

commit 6d5e8186a2da532eb1c29097bc1259a19d9f72c9
Author: Luke Cwik 
Date:   2016-07-21T13:34:52Z

[BEAM-372] added a test that verifies if a coder consumes bytes equal to 
encoded bytes

This closes #695

commit ae2144196c351cc2ee544e030d793929d8607696
Author: Dan Halperin 
Date:   2016-07-20T17:26:02Z

BigtableIO: upgrade to 0.9.1

* Use the uber jar
* Remove OS classifier mumbo jumbo
* Move common dependency versioning to 

[jira] [Commented] (BEAM-328) CoderRegistry does not provide SerializableCoder for `T extends Serializable`

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395997#comment-15395997
 ] 

ASF GitHub Bot commented on BEAM-328:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/663


> CoderRegistry does not provide SerializableCoder for `T extends Serializable`
> -
>
> Key: BEAM-328
> URL: https://issues.apache.org/jira/browse/BEAM-328
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Manu Zhang
>Priority: Minor
>  Labels: starter
>
> When the type for which a coder is being inferred is a type variable with an 
> upper bound of {{Serializable}}, it is reasonable for the coder registry to 
> propagate this to the {{SerializableCoder.PROVIDER}}, which should be able to 
> succeed.
> Unfortunately, the particulars of the distinctions made between {{Type}}, 
> {{Class}}, {{TypeVariable}}, {{ParameterizedType}}, etc, go down a code path 
> where this is not the case. Instead, an error is raised that the type 
> variable has been subject to erasure.
> Originally reported at: 
> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/298



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-354) Modify DatastoreIO to use Datastore v1beta3 API

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396163#comment-15396163
 ] 

ASF GitHub Bot commented on BEAM-354:
-

Github user vikkyrk closed the pull request at:

https://github.com/apache/incubator-beam/pull/725


> Modify DatastoreIO to use Datastore v1beta3 API
> ---
>
> Key: BEAM-354
> URL: https://issues.apache.org/jira/browse/BEAM-354
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Affects Versions: 0.2.0-incubating
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
> Fix For: 0.2.0-incubating
>
>
> The Datastore v1beta2 API is being deprecated in favor of v1beta3. Hence 
> DatastoreIO needs to be migrated to use the new version. In the process 
> of doing so, this is also a good time to add a level of indirection via a 
> PTransform such that future changes in the Datastore API would not result in 
> changing user/pipeline code. 





[jira] [Commented] (BEAM-433) Make Beam examples runners agnostic

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396202#comment-15396202
 ] 

ASF GitHub Bot commented on BEAM-433:
-

Github user peihe closed the pull request at:

https://github.com/apache/incubator-beam/pull/301


> Make Beam examples runners agnostic
> ---
>
> Key: BEAM-433
> URL: https://issues.apache.org/jira/browse/BEAM-433
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Pei He
>Assignee: Pei He
>
> The Beam examples were ported from Dataflow, and they reference Dataflow 
> classes heavily.
> The following cleanup tasks remain:
> 1. Remove Dataflow streaming and batch injector setup (Done).
> 2. Remove references to DataflowPipelineOptions.
> 3. Move cancel() from DataflowPipelineJob to PipelineResult.





[jira] [Commented] (BEAM-488) Remove KEYS file

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396212#comment-15396212
 ] 

ASF GitHub Bot commented on BEAM-488:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/732


> Remove KEYS file
> 
>
> Key: BEAM-488
> URL: https://issues.apache.org/jira/browse/BEAM-488
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.2.0-incubating
>
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201606.mbox/%3CCAAS6=7hVLcw6060Un7sXxk+WLLh08DFOSWktC0Aam4F=dye...@mail.gmail.com%3E
> > Bundling PGP keys inside a package is worse than worthless -- an attacker
> > can just bundle spoofed keys with a bogus distro!  Keys need to be made
> > available from a highly reliable, separate server: Download the main
> > package from a mirror, get PGP keys from apache.org, pgp.mit.edu, etc. and
> > verify.
> >
> > The KEYS file within the Beam source tree should be deleted.





[jira] [Commented] (BEAM-486) Cleanup NOTICE file

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396229#comment-15396229
 ] 

ASF GitHub Bot commented on BEAM-486:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/727


> Cleanup NOTICE file
> ---
>
> Key: BEAM-486
> URL: https://issues.apache.org/jira/browse/BEAM-486
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Reporter: Daniel Halperin
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.2.0-incubating
>
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201606.mbox/%3ca5f50a0f-f1e1-4391-8188-391187b9e...@classsoftware.com%3E
> - The NOTICE file contains unneeded text (i.e., it mentions the Apache v2.0 
> licence). There is no need to mention Apache 2.0 licences in NOTICE in 
> general [2]





[jira] [Commented] (BEAM-433) Make Beam examples runners agnostic

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396211#comment-15396211
 ] 

ASF GitHub Bot commented on BEAM-433:
-

GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/742

[BEAM-433] Remove references to DataflowPipelineOptions




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam rm-dataflow-options

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/742.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #742


commit 9aad5c6ec034a10ded8c07e251f69213a0b0bdc8
Author: Pei He 
Date:   2016-07-27T19:29:44Z

[BEAM-433] Remove references to DataflowPipelineOptions




> Make Beam examples runners agnostic
> ---
>
> Key: BEAM-433
> URL: https://issues.apache.org/jira/browse/BEAM-433
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Pei He
>Assignee: Pei He
>
> The Beam examples were ported from Dataflow, and they reference Dataflow 
> classes heavily.
> The following cleanup tasks remain:
> 1. Remove Dataflow streaming and batch injector setup (Done).
> 2. Remove references to DataflowPipelineOptions.
> 3. Move cancel() from DataflowPipelineJob to PipelineResult.





[jira] [Commented] (BEAM-156) Implement Quiescence Signalling in the InProcessPipelineRunner

2016-07-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397653#comment-15397653
 ] 

ASF GitHub Bot commented on BEAM-156:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/746


> Implement Quiescence Signalling in the InProcessPipelineRunner
> --
>
> Key: BEAM-156
> URL: https://issues.apache.org/jira/browse/BEAM-156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> A pipeline is quiescent when the following two properties hold:
>   There are no triggers that can fire, given the current processing time and 
> watermark
>   All pending elements cannot make progress until a side input produces 
> additional output
> This is approximately equivalent to: If no more input is received, the 
> pipeline will not perform any additional processing absent advances in 
> processing time or event time
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#





[jira] [Commented] (BEAM-13) Create JMS IO

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396344#comment-15396344
 ] 

ASF GitHub Bot commented on BEAM-13:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/299


> Create JMS IO
> -
>
> Key: BEAM-13
> URL: https://issues.apache.org/jira/browse/BEAM-13
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Work in progress: https://github.com/jbonofre/DataflowJavaSDK/tree/IO-JMS





[jira] [Commented] (BEAM-433) Make Beam examples runners agnostic

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396415#comment-15396415
 ] 

ASF GitHub Bot commented on BEAM-433:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/742


> Make Beam examples runners agnostic
> ---
>
> Key: BEAM-433
> URL: https://issues.apache.org/jira/browse/BEAM-433
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Pei He
>Assignee: Pei He
>
> The Beam examples were ported from Dataflow, and they reference Dataflow 
> classes heavily.
> The following cleanup tasks remain:
> 1. Remove Dataflow streaming and batch injector setup (Done).
> 2. Remove references to DataflowPipelineOptions.
> 3. Move cancel() from DataflowPipelineJob to PipelineResult.





[jira] [Commented] (BEAM-450) Modules are shaded to the same path

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396434#comment-15396434
 ] 

ASF GitHub Bot commented on BEAM-450:
-

GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/744

[BEAM-450] Shade separately per artifact

Prevents reusing the same path to shaded files across packages.

R: @lukecwik or @kennknowles 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam beam-450

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/744.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #744


commit b10a427f4ac92d3217dce346a93ef063b4101e47
Author: Dan Halperin 
Date:   2016-07-26T06:54:34Z

[BEAM-450] Shade separately per artifact

Prevents conflicts among shaded files




> Modules are shaded to the same path
> ---
>
> Key: BEAM-450
> URL: https://issues.apache.org/jira/browse/BEAM-450
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Manu Zhang
>  Labels: newbie, starter
>
> Right now multiple modules are using the same repackaged path. We should be 
> using per-artifact paths so that they don't conflict.
> One proposal was simply to adopt 
> {{${project.groupId}.${project.artifactId}.repackaged}} as the shading 
> location, if that works.
> This is a good starter issue.





[jira] [Commented] (BEAM-484) Datastore Source should support Dynamic Splitting

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390439#comment-15390439
 ] 

ASF GitHub Bot commented on BEAM-484:
-

GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/722

[BEAM-484] Dynamic Rebalance support for Datastore Source

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

- Add support for Dynamic rebalance
- Move and rename DatastoreSource/Reader to V1Beta3Source/Reader into a new 
file.
- Test Read transform, Source and Reader independently.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam 
vikasrk/datastore_dyn_rebalance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #722


commit 8528a1a733e0d54682388411f8dffa620b1cd6a1
Author: Vikas Kedigehalli 
Date:   2016-07-19T18:51:05Z

Dynamic Rebalance support for Datastore Source




> Datastore Source should support Dynamic Splitting
> -
>
> Key: BEAM-484
> URL: https://issues.apache.org/jira/browse/BEAM-484
> Project: Beam
>  Issue Type: Improvement
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>






[jira] [Commented] (BEAM-483) Generated job names easily collide

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390369#comment-15390369
 ] 

ASF GitHub Bot commented on BEAM-483:
-

GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/719

[BEAM-483] Moves NormalizedUniqueName to ApplicationNameOptions, and uses 
it in runners.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam job-name

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/719.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #719


commit e597582da820016f97709c847329237841a3ddd6
Author: Pei He 
Date:   2016-07-22T23:40:54Z

[BEAM-483] Move NormalizedUniqueName to ApplicationNameOptions




> Generated job names easily collide
> --
>
> Key: BEAM-483
> URL: https://issues.apache.org/jira/browse/BEAM-483
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>
> The current job name generation scheme may easily lead to duplicate job names 
> and cause DataflowJobAlreadyExistsException, especially when a series of jobs 
> are submitted at the same time (e.g., from a single script).
> It would be better to just add a random suffix like "-3275" or "-x1bh".
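
A minimal sketch of the random-suffix idea from the issue above, assuming a lowercase alphanumeric suffix of configurable length. The class JobNames and the method withRandomSuffix are hypothetical names for illustration and are not the NormalizedUniqueName code from the pull request.

```java
import java.util.Locale;
import java.util.Random;

// Hypothetical helper, not Beam's implementation: appends a short random
// suffix such as "-x1bh" so that names generated together rarely collide.
final class JobNames {
  private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

  static String withRandomSuffix(String baseName, int suffixLength, Random random) {
    StringBuilder name =
        new StringBuilder(baseName.toLowerCase(Locale.ROOT)).append('-');
    for (int i = 0; i < suffixLength; i++) {
      // Pick one character uniformly from the 36-character alphabet.
      name.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
    }
    return name.toString();
  }
}
```

With four characters there are 36^4 (about 1.7 million) possible suffixes, so collisions among a handful of jobs submitted by one script become unlikely, though not impossible.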





[jira] [Commented] (BEAM-480) Use BigQueryServices abstraction in BigQueryIO

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390382#comment-15390382
 ] 

ASF GitHub Bot commented on BEAM-480:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/717


> Use BigQueryServices abstraction in BigQueryIO
> --
>
> Key: BEAM-480
> URL: https://issues.apache.org/jira/browse/BEAM-480
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>
> There is legacy code that sends requests to BigQuery directly.
> It should be moved to use BigQueryServices.





[jira] [Commented] (BEAM-459) Add BigInteger to TypeDescriptors

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390379#comment-15390379
 ] 

ASF GitHub Bot commented on BEAM-459:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/716


> Add BigInteger to TypeDescriptors
> -
>
> Key: BEAM-459
> URL: https://issues.apache.org/jira/browse/BEAM-459
> Project: Beam
>  Issue Type: Bug
>Reporter: Jesse Anderson
>Assignee: Jesse Anderson
>
> The TypeDescriptors class is missing a BigInteger TypeDescriptor method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-79) Gearpump runner

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390464#comment-15390464
 ] 

ASF GitHub Bot commented on BEAM-79:


Github user kennknowles closed the pull request at:

https://github.com/apache/incubator-beam/pull/714


> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Tyler Akidau
>Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.





[jira] [Commented] (BEAM-480) Use BigQueryServices abstraction in BigQueryIO

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390312#comment-15390312
 ] 

ASF GitHub Bot commented on BEAM-480:
-

GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/717

[BEAM-480] Move insertAll() from BigQueryTableInserter to BigQueryServices



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam log-request

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/717.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #717


commit 2d998bd8dd99c2c03420a70fbed0194b2df95d23
Author: Pei He 
Date:   2016-07-22T22:03:38Z

[BEAM-480] Move insertAll() to BigQueryServices




> Use BigQueryServices abstraction in BigQueryIO
> --
>
> Key: BEAM-480
> URL: https://issues.apache.org/jira/browse/BEAM-480
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>
> There is legacy code that sends requests to BigQuery directly.
> It should be moved to use BigQueryServices.





[jira] [Commented] (BEAM-383) BigQueryIO: update sink to shard into multiple write jobs

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388653#comment-15388653
 ] 

ASF GitHub Bot commented on BEAM-383:
-

GitHub user ianzhou1 opened a pull request:

https://github.com/apache/incubator-beam/pull/707

[BEAM-383] BigQueryIO update sink to shard into multiple write jobs

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ianzhou1/incubator-beam BigQueryBranch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/707.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #707


commit 5294e161d38c37a71c54a8925288f488e9982cab
Author: Ian Zhou 
Date:   2016-07-20T22:56:21Z

Modified BigQueryIO to write based on number of files and file sizes

commit b96184d221de08fd825c9f914b8dc393987c6de9
Author: Ian Zhou 
Date:   2016-07-22T00:04:25Z

Added unit tests for DoFns used in BigQueryWrite




> BigQueryIO: update sink to shard into multiple write jobs
> -
>
> Key: BEAM-383
> URL: https://issues.apache.org/jira/browse/BEAM-383
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Ian Zhou
>
> BigQuery has global limits on both the number of files that can be written in a 
> single job and the total bytes in those files. We should be able to modify 
> BigQueryIO.Write to chunk into multiple smaller jobs that meet these limits, 
> write to temp tables, and atomically copy into the destination table.
> This functionality will let us safely stay within BQ's load job limits.
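
The chunking step described above might look roughly like the following greedy partitioner. It is a sketch under stated assumptions: the limits are passed in as parameters rather than taken from BigQuery's real quotas, the class LoadJobPartitioner is hypothetical, and the temp-table and atomic-copy steps are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the BigQueryIO implementation: groups file sizes
// into batches that respect a per-job file count limit and byte limit.
final class LoadJobPartitioner {

  static List<List<Long>> partition(
      List<Long> fileSizes, int maxFilesPerJob, long maxBytesPerJob) {
    List<List<Long>> jobs = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentBytes = 0;
    for (long size : fileSizes) {
      boolean wouldOverflow =
          current.size() >= maxFilesPerJob || currentBytes + size > maxBytesPerJob;
      // Close the current batch before adding a file that would exceed a limit.
      if (!current.isEmpty() && wouldOverflow) {
        jobs.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      // A single file larger than maxBytesPerJob still gets its own batch.
      current.add(size);
      currentBytes += size;
    }
    if (!current.isEmpty()) {
      jobs.add(current);
    }
    return jobs;
  }
}
```

Each resulting batch would correspond to one load job; a real implementation would also have to decide what to do with a single file that exceeds the byte limit on its own.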





[jira] [Commented] (BEAM-316) Support file scheme in TextIO

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388808#comment-15388808
 ] 

ASF GitHub Bot commented on BEAM-316:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/684


> Support file scheme in TextIO
> -
>
> Key: BEAM-316
> URL: https://issues.apache.org/jira/browse/BEAM-316
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> When users use {{TextIO}}, most of the time they provide a full file URI: 
> {{file:/tmp/foo}}. Unfortunately, the {{file}} scheme is not supported by 
> {{TextIO}}, and it fails with "No handler found". It is not easy for users to 
> figure out why.
> We should support the {{file}} scheme to give users more flexibility.
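
One way to accept such URIs is to strip the file scheme before handing the path to the local file handler. The sketch below uses java.net.URI and assumes the spec is a syntactically valid URI (a spec containing spaces would make URI.create throw); the class FileSpecs is hypothetical and is not how TextIO resolves paths.

```java
import java.net.URI;

// Hypothetical helper, not TextIO's code: maps "file:/tmp/foo" to "/tmp/foo"
// and leaves every other spec untouched.
final class FileSpecs {

  static String toLocalPath(String spec) {
    URI uri = URI.create(spec);
    if ("file".equalsIgnoreCase(uri.getScheme())) {
      // For file URIs the path component is the local file system path.
      return uri.getPath();
    }
    // No scheme, or a scheme handled elsewhere (for example gs).
    return spec;
  }
}
```

Specs with other schemes pass through unchanged, so existing handlers would keep working under this assumption.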





[jira] [Commented] (BEAM-380) Remove Spark runner dependency on beam-examples-java

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390492#comment-15390492
 ] 

ASF GitHub Bot commented on BEAM-380:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/539


> Remove Spark runner dependency on beam-examples-java
> 
>
> Key: BEAM-380
> URL: https://issues.apache.org/jira/browse/BEAM-380
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
>
> Remove the runner dependency to allow beam-examples to have a runtime 
> dependency on runners-spark. 





[jira] [Commented] (BEAM-479) Move RunnableOnService test executions to postcommit

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389106#comment-15389106
 ] 

ASF GitHub Bot commented on BEAM-479:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/709


> Move RunnableOnService test executions to postcommit
> 
>
> Key: BEAM-479
> URL: https://issues.apache.org/jira/browse/BEAM-479
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, runner-flink, runner-gearpump, 
> runner-spark
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Spark and Flink RunnableOnService tests only use a local endpoint, so 
> they _can_ run as unit tests, but our test duration is getting out of hand. 
> For Gearpump, the tests time out. So this ticket tracks getting everyone to a 
> symmetric configuration.
> Later, we can re-enable a select few local endpoint tests for the various 
> runners, to smoke test, and hopefully get actual cluster-based integration 
> tests running on Jenkins.





[jira] [Commented] (BEAM-316) Support file scheme in TextIO

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388820#comment-15388820
 ] 

ASF GitHub Bot commented on BEAM-316:
-

Github user jbonofre closed the pull request at:

https://github.com/apache/incubator-beam/pull/402


> Support file scheme in TextIO
> -
>
> Key: BEAM-316
> URL: https://issues.apache.org/jira/browse/BEAM-316
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.2.0-incubating
>
>
> When users use {{TextIO}}, most of the time they provide a full file URI: 
> {{file:/tmp/foo}}. Unfortunately, the {{file}} scheme is not supported by 
> {{TextIO}}, and it fails with "No handler found". It is not easy for users to 
> figure out why.
> We should support the {{file}} scheme to give users more flexibility.





[jira] [Commented] (BEAM-479) Move RunnableOnService test executions to postcommit

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1530#comment-1530
 ] 

ASF GitHub Bot commented on BEAM-479:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/709

[BEAM-479] Execute RunnableOnService tests only when runner options provided

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Previously, the situation was this:

 - All runners inherit a RunnableOnService integration-test
   execution referencing runnableOnServicePipelineOptions
   whether or not the variable was set. Basically an unbound
   variable reference.
 - The Dataflow runner had a profile disabling it if
   runnableOnServicePipelineOptions was not set.
 - Before they got configured, Flink and Spark had to
   do extra work to explicitly prevent the invalid
   configuration from being used.

After this change:

 - All runners inherit the same integration-test execution
   but only if the variable it requires is present.
 - Dataflow doesn't have any special profile.
 - Flink and Spark are unchanged, since they do set
   up the variable themselves. When they move to running
   only as postcommit, like Dataflow does, the hardcoding
   is expected to either move to a profile or move to
   the Jenkins invocation.

This addresses the particular aspect of 
[BEAM-479](https://issues.apache.org/jira/browse/BEAM-479) about getting a 
symmetrical config. The way that the configuration is set up is an annoying 
barrier to new runners (they have to suppress the thing) and also less readable 
than a straightforward profile.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam integration-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/709.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #709


commit a34a9a819d12ad4d9c14a7292fc241a89f3f836b
Author: Kenneth Knowles 
Date:   2016-07-22T04:01:36Z

Execute RunnableOnService tests only when runner options provided

Previously, the situation was this:

 - All runners inherit a RunnableOnService integration-test
   execution referencing runnableOnServicePipelineOptions
   whether or not the variable was set. Basically an unbound
   variable reference.
 - The Dataflow runner had a profile disabling it if
   runnableOnServicePipelineOptions was not set.
 - Before they got configured, Flink and Spark had to
   do extra work to explicitly prevent the invalid
   configuration from being used.

After this change:

 - All runners inherit the same integration-test execution
   but only if the variable it requires is present.
 - Dataflow doesn't have any special profile.
 - Flink and Spark are unchanged, since they do set
   up the variable themselves. When they move to running
   only as postcommit, like Dataflow does, the hardcoding
   is expected to either move to a profile or move to
   the Jenkins invocation.




> Move RunnableOnService test executions to postcommit
> 
>
> Key: BEAM-479
> URL: https://issues.apache.org/jira/browse/BEAM-479
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, runner-flink, runner-gearpump, 
> runner-spark
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Spark and Flink RunnableOnService tests only use a local endpoint, so 
> they _can_ run as unit tests, but our test duration is getting out of hand. 
> For Gearpump, the tests time out. So this ticket tracks getting everyone to a 
> symmetric configuration.
> Later, we can re-enable a select few local endpoint tests for the various 
> runners, to smoke test, and hopefully get actual cluster-based integration 
> tests running on Jenkins.





[jira] [Commented] (BEAM-479) Move RunnableOnService test executions to postcommit

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388892#comment-15388892
 ] 

ASF GitHub Bot commented on BEAM-479:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/711

[BEAM-479] Name local Spark RunnableOnService profile more precisely

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Settling on the name `local-runnable-on-service-tests` for all profiles 
with a local endpoint. That way, this profile plus the desired module will 
suffice to run against a local endpoint if possible.

This is probably blocked on #711. It requires seeing what Jenkins does.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam remove-spark-local

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/711.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #711


commit 18e064823e03bbd113f22e2e1e53a89a25ebb16e
Author: Kenneth Knowles 
Date:   2016-07-22T04:27:41Z

Name local Spark RunnableOnService profile more precisely

Settling on the name "local-runnable-on-service-tests" for
all profiles with a local endpoint. That way, this profile plus
the desired module will suffice to run against a local endpoint if
possible.




> Move RunnableOnService test executions to postcommit
> 
>
> Key: BEAM-479
> URL: https://issues.apache.org/jira/browse/BEAM-479
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, runner-flink, runner-gearpump, 
> runner-spark
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Spark and Flink RunnableOnService tests only use a local endpoint, so 
> they _can_ run as unit tests, but our test duration is getting out of hand. 
> For Gearpump, the tests time out. So this ticket tracks getting everyone to a 
> symmetric configuration.
> Later, we can re-enable a select few local endpoint tests for the various 
> runners, to smoke test, and hopefully get actual cluster-based integration 
> tests running on Jenkins.





[jira] [Commented] (BEAM-443) PipelineResult needs waitToFinish() and cancel()

2016-07-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398635#comment-15398635
 ] 

ASF GitHub Bot commented on BEAM-443:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/642


> PipelineResult needs waitToFinish() and cancel()
> 
>
> Key: BEAM-443
> URL: https://issues.apache.org/jira/browse/BEAM-443
> Project: Beam
>  Issue Type: New Feature
>Reporter: Pei He
>Assignee: Pei He
>
> waitToFinish() and cancel() are the two most common operations for users to 
> interact with a started pipeline.
> Right now, they are only available in DataflowPipelineJob. It would be better 
> to move them to the common interface, so that people can start implementing 
> them in other runners, and runner-agnostic code can interact with 
> PipelineResult better.
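
The proposal above can be illustrated with a minimal sketch, written in Python for brevity even though the issue concerns the Java SDK. Everything here is hypothetical: the names `PipelineResult`, `wait_to_finish`, `cancel`, `InMemoryResult`, `run_with_deadline`, and the state strings are stand-ins chosen for illustration, not Beam's actual API. The point is only that once both operations live on a common interface, helper code no longer needs to know which runner produced the result.

```python
import abc
import threading


class PipelineResult(abc.ABC):
    """Hypothetical runner-agnostic handle for a started pipeline."""

    @abc.abstractmethod
    def wait_to_finish(self, timeout=None):
        """Block until a terminal state or the timeout; return the current state."""

    @abc.abstractmethod
    def cancel(self):
        """Request cancellation and return the resulting state."""


class InMemoryResult(PipelineResult):
    """Toy implementation standing in for a real runner's job handle."""

    def __init__(self):
        self._done = threading.Event()
        self._lock = threading.Lock()
        self._state = "RUNNING"

    def _finish(self, state):
        # Only the first terminal transition takes effect.
        with self._lock:
            if self._state == "RUNNING":
                self._state = state
                self._done.set()
            return self._state

    def mark_done(self):
        return self._finish("DONE")

    def cancel(self):
        return self._finish("CANCELLED")

    def wait_to_finish(self, timeout=None):
        self._done.wait(timeout)
        return self._state


def run_with_deadline(result, seconds):
    """Runner-agnostic helper: wait, then cancel if the deadline passes."""
    state = result.wait_to_finish(timeout=seconds)
    if state == "RUNNING":
        state = result.cancel()
    return state
```

`run_with_deadline` depends only on the abstract interface, which is the kind of runner-agnostic code the issue has in mind.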





[jira] [Commented] (BEAM-479) Move RunnableOnService test executions to postcommit

2016-07-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398678#comment-15398678
 ] 

ASF GitHub Bot commented on BEAM-479:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/710


> Move RunnableOnService test executions to postcommit
> 
>
> Key: BEAM-479
> URL: https://issues.apache.org/jira/browse/BEAM-479
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, runner-flink, runner-gearpump, 
> runner-spark
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Spark and Flink RunnableOnService tests only use a local endpoint, so 
> they _can_ run as unit tests, but our test duration is getting out of hand. 
> For Gearpump, the tests time out. So this ticket tracks getting everyone to a 
> symmetric configuration.
> Later, we can re-enable a select few local endpoint tests for the various 
> runners, to smoke test, and hopefully get actual cluster-based integration 
> tests running on Jenkins.





[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-07-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398675#comment-15398675
 ] 

ASF GitHub Bot commented on BEAM-498:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/758

[BEAM-498] Rename DoFn to OldDoFn, DoFnWithContext to DoFn, and port some 
examples

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @bjchambers. Since there was general approval on the mailing list and 
these changes are mostly trivial, I'm mostly just asking Ben for a double-check 
since he has the most technical depth on this.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam DoFnWithContext

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/758.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #758


commit a29447cd338825ac056dfe04d2eeadd8fb4a3171
Author: Kenneth Knowles 
Date:   2016-07-22T20:00:10Z

Rename DoFn to OldDoFn

commit 20d6217cf2db355febc637ebe14e113fbef0ce05
Author: Kenneth Knowles 
Date:   2016-07-26T04:27:02Z

Rename NoOpDoFn to NoOpOldDoFn

commit 9026589010efcdb29543d551a6e8ed9117ea20a7
Author: Kenneth Knowles 
Date:   2016-07-22T21:10:01Z

Rename DoFnWithContext to DoFn

commit 93fab04cc9dc6c93e3080ba5ebbc4aba18851758
Author: Kenneth Knowles 
Date:   2016-07-22T21:28:28Z

Port WordCount example from OldDoFn to DoFn

commit a0d40c6eece48231df2247faec8008bb3abc5bb7
Author: Kenneth Knowles 
Date:   2016-07-22T21:28:42Z

Port MinimalWordCount example from OldDoFn to DoFn

commit 0d4a470af28177af9e8956ddc74f05f52b0fb05b
Author: Kenneth Knowles 
Date:   2016-07-22T21:29:01Z

Port WindowedWordCount example from OldDoFn to DoFn

commit cabea9d4241cfd44996f1d58e8032c60f6413dae
Author: Kenneth Knowles 
Date:   2016-07-22T21:29:18Z

Port DebuggingWordCount example from OldDoFn to DoFn

commit 8f0ce4f1edbdd25fe4c408cb1215190806d874ba
Author: Kenneth Knowles 
Date:   2016-07-22T21:29:37Z

Port AutoComplete example from OldDoFn to DoFn

commit 343d3bc75d04cac554d858c238bb21f0df982c9a
Author: Kenneth Knowles 
Date:   2016-07-22T21:29:51Z

Port microbenchmarks to new vocabulary




> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>






[jira] [Commented] (BEAM-156) Implement Quiescence Signalling in the InProcessPipelineRunner

2016-07-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398500#comment-15398500
 ] 

ASF GitHub Bot commented on BEAM-156:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/754

[BEAM-156] Use Quiescence to Drive the DirectRunner

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
The Executor should add work whenever it becomes Quiescent.

Track the amount of outstanding work in the executor, and
modify the state appropriately whenever work is scheduled or completes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam quiescence_driver

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/754.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #754


commit b4afedad05291585205c81e813c7bbb7f0e5c167
Author: Thomas Groh 
Date:   2016-07-22T20:47:19Z

Apply ExecutorUpdates in two Phases

This removes the need for an explicit break by ensuring that work added
by the monitor will not complete and add more work for the monitor to
complete.

commit 3c4a0d39b6655228cfb0d6870e69f229a85537ac
Author: Thomas Groh 
Date:   2016-07-23T01:01:41Z

Add handleEmpty to CompletionCallback

This is invoked when a Transform Executor has no work to do. Usually
this is due to reinvocation of a Source.

commit 292276cfd4e56bfb9a0278e2717d5eb49e304e3e
Author: Thomas Groh 
Date:   2016-07-26T16:53:22Z

Add ProducedOutput method to TransformResult

This can communicate that a PTransform that produced no outputs still
should cause pending work to be evaluated. PCollectionViews modify the
state of the evaluator and can cause formerly blocked PTransforms to be
able to progress.

commit 0397f9f67ee1af8866b72019f0b5cf97c4e6b62a
Author: Thomas Groh 
Date:   2016-07-22T20:47:43Z

Use the State of the Executor to drive progress

Add the concept of Quiescence to ExecutorServiceParallelExecutor.

If the executor is Quiescent, it should interrogate root nodes for
additional work. If not, runs of the monitor should update the state as
appropriate.




> Implement Quiescence Signalling in the InProcessPipelineRunner
> --
>
> Key: BEAM-156
> URL: https://issues.apache.org/jira/browse/BEAM-156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> A pipeline is quiescent when the following two properties hold:
>   There are no triggers that can fire, given the current processing time and 
> watermark
>   All pending elements cannot make progress until a side input produces 
> additional output
> This is approximately equivalent to: If no more input is received, the 
> pipeline will not perform any additional processing absent advances in 
> processing time or event time
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#
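
The two properties in the definition above can be expressed as a small predicate. This is a minimal sketch under stated assumptions, not the runner's implementation: the types `Timer`, `PendingElement`, and `PipelineState` and their fields are hypothetical, and a trigger is modeled simply as a timer in either the processing-time or the event-time domain.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Timer:
    fire_at: float  # time at which the trigger becomes eligible to fire
    domain: str     # "processing" (wall clock) or "event" (watermark)


@dataclass
class PendingElement:
    blocked_on_side_input: bool  # True if it cannot progress until a side input emits


@dataclass
class PipelineState:
    processing_time: float
    watermark: float
    timers: List[Timer] = field(default_factory=list)
    pending: List[PendingElement] = field(default_factory=list)


def can_fire(timer, state):
    """A timer can fire once its domain's clock has reached fire_at."""
    now = state.processing_time if timer.domain == "processing" else state.watermark
    return timer.fire_at <= now


def is_quiescent(state):
    """Both properties from the issue description must hold."""
    no_trigger_can_fire = not any(can_fire(t, state) for t in state.timers)
    all_pending_blocked = all(p.blocked_on_side_input for p in state.pending)
    return no_trigger_can_fire and all_pending_blocked
```

Under this model, a state with no timers and no pending elements is trivially quiescent, and advancing either clock past a timer's `fire_at` ends quiescence, which matches the "absent advances in processing time or event time" wording.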





[jira] [Commented] (BEAM-444) Promote isBlockOnRun() to PipelineOptions.

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400066#comment-15400066
 ] 

ASF GitHub Bot commented on BEAM-444:
-

Github user peihe closed the pull request at:

https://github.com/apache/incubator-beam/pull/643


> Promote isBlockOnRun() to PipelineOptions.
> --
>
> Key: BEAM-444
> URL: https://issues.apache.org/jira/browse/BEAM-444
> Project: Beam
>  Issue Type: New Feature
>Reporter: Pei He
>Assignee: Pei He
>
> Currently, blockOnRun is implemented in different ways by runners.
> DirectRunner blocks on run based on DirectOptions.isBlockOnRun.
> Dataflow has a separate BlockingDataflowRunner.
> Flink and Spark runners might or might not block, depending on their 
> implementation of run().
> I think DirectRunner's approach is the right way to go, and the isBlockOnRun 
> option needs to be promoted to the general PipelineOptions.





[jira] [Commented] (BEAM-499) Remove unused code in apiclient.py and iobase.py

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400353#comment-15400353
 ] 

ASF GitHub Bot commented on BEAM-499:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/763


> Remove unused code in apiclient.py and iobase.py
> 
>
> Key: BEAM-499
> URL: https://issues.apache.org/jira/browse/BEAM-499
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> There is some code in apiclient.py and iobase.py that is not used by Dataflow 
> SDK. This code has to be removed.
> E.g.:
> class DataflowWorkerClient
> def reader_progress_to_cloud_progress() and other similar methods.
> def splits_to_split_response()
> class ConcatPosition





[jira] [Commented] (BEAM-499) Remove unused code in apiclient.py and iobase.py

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400271#comment-15400271
 ] 

ASF GitHub Bot commented on BEAM-499:
-

GitHub user chamikaramj opened a pull request:

https://github.com/apache/incubator-beam/pull/763

[BEAM-499] Deletes some code that is not used by SDK.

Some code in apiclient.py is not used by Python SDK.

Deleting unused code and corresponding tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/incubator-beam 
delete_unused_apiclient_code

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/763.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #763


commit 7d50d8040585e0cea5bc02de4cb199f29c1472fc
Author: Chamikara Jayalath 
Date:   2016-07-29T22:40:39Z

Deletes some code that is not used by SDK.

Also deletes corresponding tests.




> Remove unused code in apiclient.py and iobase.py
> 
>
> Key: BEAM-499
> URL: https://issues.apache.org/jira/browse/BEAM-499
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> There is some code in apiclient.py and iobase.py that is not used by Dataflow 
> SDK. This code has to be removed.
> E.g.:
> class DataflowWorkerClient
> def reader_progress_to_cloud_progress() and other similar methods.
> def splits_to_split_response()
> class ConcatPosition





[jira] [Commented] (BEAM-450) Modules are shaded to the same path

2016-07-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400182#comment-15400182
 ] 

ASF GitHub Bot commented on BEAM-450:
-

Github user dhalperi closed the pull request at:

https://github.com/apache/incubator-beam/pull/744


> Modules are shaded to the same path
> ---
>
> Key: BEAM-450
> URL: https://issues.apache.org/jira/browse/BEAM-450
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>  Labels: newbie, starter
>
> Right now multiple modules are using the same repackaged path. We should be 
> using per-artifact paths so that they don't conflict.
> One proposal was simply to adopt 
> {{${project.groupId}.${project.artifactId}.repackaged}} as the shading 
> location. If it works.
> This is a good starter issue.





[jira] [Commented] (BEAM-487) Add disclaimer to GitHub README.md

2016-07-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393313#comment-15393313
 ] 

ASF GitHub Bot commented on BEAM-487:
-

GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/733

[BEAM-487] README.md: add DISCLAIMER, incubating, minor fixes

From the linked JIRA:

> For the github audience, reaching https://github.com/apache/incubator-beam,
> there is no disclaimer. I think there should be a disclaimer on `README.md`,
> and at least the first reference to beam should read “Apache Beam (incubating)”.

See screenshot:
https://cloud.githubusercontent.com/assets/526415/17128253/3523de78-52c0-11e6-8381-eb89e6bab396.png


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam beam-487

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/733.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #733


commit 90ed35ba1fce36d2eb47d2a446c0e57bb71683cd
Author: Dan Halperin 
Date:   2016-07-26T06:30:44Z

README.md: add DISCLAIMER, incubating, minor fixes




> Add disclaimer to GitHub README.md
> --
>
> Key: BEAM-487
> URL: https://issues.apache.org/jira/browse/BEAM-487
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201606.mbox/%3c47b44d85-be01-42f9-96c6-43ff23e31...@apache.org%3E
> > 1. The DISCLAIMER file is sufficient for the purposes of a source release. 
> > But for the github audience, reaching https://github.com/apache/incubator-beam,
> > there is no disclaimer. I think there should be a disclaimer on README.md, 
> > and at least the first reference to beam should read “Apache Beam (incubating)”.





[jira] [Commented] (BEAM-489) remove headerLocation from maven-checkstyle-plugin configuration

2016-07-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15393694#comment-15393694
 ] 

ASF GitHub Bot commented on BEAM-489:
-

GitHub user manuzhang opened a pull request:

https://github.com/apache/incubator-beam/pull/734

[BEAM-489] remove headerLocation from maven-checkstyle-plugin

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manuzhang/incubator-beam BEAM-489

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/734.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #734


commit 7a5651763d73ef78f5e11c95631576024ff5efac
Author: manuzhang 
Date:   2016-07-26T12:05:37Z

[BEAM-489] remove headerLocation from maven-checkstyle-plugin




> remove headerLocation from maven-checkstyle-plugin configuration 
> -
>
> Key: BEAM-489
> URL: https://issues.apache.org/jira/browse/BEAM-489
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Affects Versions: 0.1.0-incubating
>Reporter: Manu Zhang
>Assignee: Manu Zhang
>Priority: Trivial
>
> As license header has been checked by maven-rat-plugin since BEAM-254, 
> headerLocation in the maven-checkstyle-plugin is redundant.





[jira] [Commented] (BEAM-156) Implement Quiescence Signalling in the InProcessPipelineRunner

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396679#comment-15396679
 ] 

ASF GitHub Bot commented on BEAM-156:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/746

Use AutoValue for StepTransformResult

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
This is a cleanup PR for future changes to StepTransformResult, primarily 
as part of
BEAM-156


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
step_transform_result_autovalue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/746.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #746


commit c1f66b67c82e149ef27d19c88993b0ab15d19b35
Author: Thomas Groh 
Date:   2016-07-26T16:38:13Z

Use AutoValue for StepTransformResult




> Implement Quiescence Signalling in the InProcessPipelineRunner
> --
>
> Key: BEAM-156
> URL: https://issues.apache.org/jira/browse/BEAM-156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> A pipeline is quiescent when the following two properties hold:
>   There are no triggers that can fire, given the current processing time and 
> watermark
>   All pending elements cannot make progress until a side input produces 
> additional output
> This is approximately equivalent to: If no more input is received, the 
> pipeline will not perform any additional processing absent advances in 
> processing time or event time
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#





[jira] [Commented] (BEAM-156) Implement Quiescence Signalling in the InProcessPipelineRunner

2016-07-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396678#comment-15396678
 ] 

ASF GitHub Bot commented on BEAM-156:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/745

[BEAM-156] Apply ExecutorUpdates in two Phases

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This removes the need for an explicit break by ensuring that work added
by the monitor will not complete and add more work for the monitor to
complete.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam two_phase_executor_update

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/745.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #745


commit 4b932d93352b04f60392b6083ce5573c86535a65
Author: Thomas Groh 
Date:   2016-07-22T20:47:19Z

Apply ExecutorUpdates in two Phases

This removes the need for an explicit break by ensuring that work added
by the monitor will not complete and add more work for the monitor to
complete.




> Implement Quiescence Signalling in the InProcessPipelineRunner
> --
>
> Key: BEAM-156
> URL: https://issues.apache.org/jira/browse/BEAM-156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> A pipeline is quiescent when the following two properties hold:
>   There are no triggers that can fire, given the current processing time and 
> watermark
>   All pending elements cannot make progress until a side input produces 
> additional output
> This is approximately equivalent to: If no more input is received, the 
> pipeline will not perform any additional processing absent advances in 
> processing time or event time
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#





[jira] [Commented] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415466#comment-15415466
 ] 

ASF GitHub Bot commented on BEAM-124:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/769


> Testing -- End to End WordCount Batch and Streaming Tests
> -
>
> Key: BEAM-124
> URL: https://issues.apache.org/jira/browse/BEAM-124
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Steve Wheeler
>Assignee: Mark Liu
>
> Set up testing infrastructure so that an end to end test for WordCount (both 
> batch and streaming) will be run periodically. 





[jira] [Commented] (BEAM-533) Autocomplete Example should use Datastore AncestorKey for strong consistency

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415650#comment-15415650
 ] 

ASF GitHub Bot commented on BEAM-533:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/788


> Autocomplete Example should use Datastore AncestorKey for strong consistency
> 
>
> Key: BEAM-533
> URL: https://issues.apache.org/jira/browse/BEAM-533
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>






[jira] [Commented] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15404908#comment-15404908
 ] 

ASF GitHub Bot commented on BEAM-124:
-

GitHub user markflyhigh opened a pull request:

https://github.com/apache/incubator-beam/pull/769

[BEAM-124] Spark Running WordCountIT Example

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

 - Add Spark dependency in order to use SparkRunner to execute WordCountIT.
 - Change the default test file and add it to the project, which avoids the 
problem that SparkRunner can't resolve `gs://` right now.

The following command is used to run WordCountIT with SparkRunner:

`mvn clean verify -pl examples/java -am -rf :beam-examples-java 
-DskipITs=false -DintegrationTestPipelineOptions='[ "--tempRoot=/tmp", 
"--runner=org.apache.beam.runners.spark.SparkRunner" ]'`


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markflyhigh/incubator-beam 
wordcount-e2e-spark-runner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/769.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #769


commit 1e6ec6e2c00e9ecec6b5e5027655bdb755f93ac6
Author: Mark Liu 
Date:   2016-08-02T21:56:28Z

[BEAM-124] Spark Running WordCountIT Example




> Testing -- End to End WordCount Batch and Streaming Tests
> -
>
> Key: BEAM-124
> URL: https://issues.apache.org/jira/browse/BEAM-124
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Steve Wheeler
>Assignee: Mark Liu
>
> Set up testing infrastructure so that an end to end test for WordCount (both 
> batch and streaming) will be run periodically. 





[jira] [Commented] (BEAM-514) Add all mandatory links

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405394#comment-15405394
 ] 

ASF GitHub Bot commented on BEAM-514:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam-site/pull/33


> Add all mandatory links
> ---
>
> Key: BEAM-514
> URL: https://issues.apache.org/jira/browse/BEAM-514
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Frances Perry
>
> Excerpt from: 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201608.mbox/%3C7E0226B1-0386-499C-8473-61A8E51A691B%40classsoftware.com%3E
> > Branding wise I think you are missing a few of the
> > required links [3] including a link back to the Apache homepage.
> http://www.apache.org/foundation/marks/pmcs.html#navigation





[jira] [Commented] (BEAM-514) Add all mandatory links

2016-08-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405323#comment-15405323
 ] 

ASF GitHub Bot commented on BEAM-514:
-

GitHub user francesperry opened a pull request:

https://github.com/apache/incubator-beam-site/pull/33

[BEAM-514][BEAM-515] Fixed Apache logos and links

* Added required Apache links (BEAM-514)
* Added incubator and feather logos (BEAM-515)

R: @dhalperi @evilsoapbox 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/francesperry/incubator-beam-site 
branding-514-515

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam-site/pull/33.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #33


commit 2c97532679a5a2f32ea6495f7cafbad946aa5321
Author: Frances Perry 
Date:   2016-08-03T04:33:35Z

Addressed BEAM-514 (add required asf links)

commit 4983897a4b39e624c0a898c113ebeb2e065e2f34
Author: Frances Perry 
Date:   2016-08-03T04:53:38Z

Addressed Beam-515 (apache logos)




> Add all mandatory links
> ---
>
> Key: BEAM-514
> URL: https://issues.apache.org/jira/browse/BEAM-514
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Frances Perry
>
> Excerpt from: 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201608.mbox/%3C7E0226B1-0386-499C-8473-61A8E51A691B%40classsoftware.com%3E
> > Branding wise I think you are missing a few of the
> > required links [3] including a link back to the Apache homepage.
> http://www.apache.org/foundation/marks/pmcs.html#navigation





[jira] [Commented] (BEAM-542) Spark batch interval should be a configuration instead of an interpretation of the Pipeline's windows

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415676#comment-15415676
 ] 

ASF GitHub Bot commented on BEAM-542:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/808


> Spark batch interval should be a configuration instead of an interpretation 
> of the Pipeline's windows
> -
>
> Key: BEAM-542
> URL: https://issues.apache.org/jira/browse/BEAM-542
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Amit Sela
>
> Currently, the SparkRunner extracts the batch interval from the duration of 
> the first window. 
> This is wrong in several ways:
> # GlobalWindow pipelines
> # It's an engine specific property and should not be expressed as a part of 
> the logic but rather as a configuration for execution of the pipeline.
> # Effectively forces the definition of Fixed/SlidingWindows even when they 
> are not needed (stateless processing), which also makes the pipeline code not 
> portable.





[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415968#comment-15415968
 ] 

ASF GitHub Bot commented on BEAM-498:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/803


> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>






[jira] [Commented] (BEAM-522) Update FileSink.finalize_write() to be idempotent

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415985#comment-15415985
 ] 

ASF GitHub Bot commented on BEAM-522:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/779


> Update FileSink.finalize_write() to be idempotent
> -
>
> Key: BEAM-522
> URL: https://issues.apache.org/jira/browse/BEAM-522
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> Currently FileSink.finalize_write() in fileio.py [1] performs the following 
> operations.
> (1) Obtains a list of temporary files as a side input
> (2) Renames each temporary file to the location where final output should be 
> stored.
> The iobase.Sink.finalize_write() operation should be idempotent, since runner 
> implementations may call it multiple times due to task failures. 
> The current implementation is not idempotent: if we re-run the operation 
> after renaming a subset of the files, it may fail because some files can no 
> longer be found at the source location (for example, [2] for GCS files).
> We can fix this by checking whether the destination file already exists 
> before performing the rename, and skipping the rename for files that are 
> already available at the destination.
> [1] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L503
> [2] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/gcsio.py#L187
>  
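
The check-before-rename idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the FileSink code: `finalize_renames` is a hypothetical helper, and the `exists` and `rename` callables stand in for whatever the channel factory provides (they default to the local-filesystem functions in `os`).

```python
import os


def finalize_renames(pairs, exists=os.path.exists, rename=os.rename):
    """Rename each (src, dst) pair, skipping pairs a previous attempt finished.

    Hypothetical helper for illustration; `exists` and `rename` are injectable
    so the same logic can be exercised against any storage backend.
    """
    renamed = []
    for src, dst in pairs:
        if exists(dst):
            # An earlier (possibly failed) attempt already moved this file,
            # so the source is gone and renaming again would raise.
            continue
        rename(src, dst)
        renamed.append(dst)
    return renamed
```

Calling the helper a second time with the same pairs performs no renames, which is the idempotency property the issue asks for. The sketch assumes that a destination which exists is complete, i.e. that the underlying rename is atomic.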





[jira] [Commented] (BEAM-156) Implement Quiescence Signalling in the InProcessPipelineRunner

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406911#comment-15406911
 ] 

ASF GitHub Bot commented on BEAM-156:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/754


> Implement Quiescence Signalling in the InProcessPipelineRunner
> --
>
> Key: BEAM-156
> URL: https://issues.apache.org/jira/browse/BEAM-156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> A pipeline is quiescent when the following two properties hold:
>   There are no triggers that can fire, given the current processing time and 
> watermark
>   No pending element can make progress until a side input produces 
> additional output
> This is approximately equivalent to: If no more input is received, the 
> pipeline will not perform any additional processing absent advances in 
> processing time or event time
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#
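
As a rough illustration of the two properties above, the quiescence condition can be written as a predicate over a snapshot of runner state. The `PipelineState` and `PendingElement` types below are hypothetical stand-ins invented for this sketch; they are not classes from the InProcessPipelineRunner.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PendingElement:
    # True when the element can only proceed after a side input
    # produces additional output.
    blocked_on_side_input: bool


@dataclass
class PipelineState:
    # Number of triggers that could fire at the current processing
    # time and watermark.
    fireable_triggers: int = 0
    pending: List[PendingElement] = field(default_factory=list)


def is_quiescent(state):
    """Return True when both quiescence properties hold for `state`."""
    no_triggers_can_fire = state.fireable_triggers == 0
    all_pending_blocked = all(e.blocked_on_side_input for e in state.pending)
    return no_triggers_can_fire and all_pending_blocked
```

A state with no pending elements and no fireable triggers counts as quiescent, since `all()` over an empty list is true.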





[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406954#comment-15406954
 ] 

ASF GitHub Bot commented on BEAM-498:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/758


> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>






[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407011#comment-15407011
 ] 

ASF GitHub Bot commented on BEAM-498:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/781

[BEAM-498] Port examples to new DoFn

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @bjchambers pretty nice to be showcasing the `BoundedWindow` parameter 
in a couple of them.

Any other reviewer can feel free to LGTM if they are happy with it. It is 
pretty mechanical.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam examples-new-DoFn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/781.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #781


commit c31986f5ee31c34538a73057638a45efd796caf5
Author: Kenneth Knowles 
Date:   2016-08-04T01:54:22Z

Port example tests to new DoFn

commit 8e6c0eaf66d7e5e6f2319aa13efe627b9fe3002a
Author: Kenneth Knowles 
Date:   2016-08-04T02:01:16Z

Port TfIdf example to new DoFn

commit c3c42aafc1987d06c55e76e669f8e2c0b7b59b2c
Author: Kenneth Knowles 
Date:   2016-08-04T02:03:11Z

Port TopWikipediaSessions example to new DoFn

commit 380f26afc73620abecbd38118095b96f1a91f030
Author: Kenneth Knowles 
Date:   2016-08-04T02:04:50Z

Port GameState Java 8 example to new DoFn

commit 1584666cb2c5ef3d8c9edfefe4d8cd7b366ae133
Author: Kenneth Knowles 
Date:   2016-08-04T02:06:26Z

Port the UserScore example to new DoFn

commit 86e90246d582327330af7f3212b6ed2c6a4f6af7
Author: Kenneth Knowles 
Date:   2016-08-04T02:07:56Z

Port StreamingWordExtract example to new DoFn

commit 30da6afd8d06f63f0a0a7ebafc51e8d30217763c
Author: Kenneth Knowles 
Date:   2016-08-04T02:08:19Z

fixup! UserScore

commit 18ee240879a0edc738355746f13b5c6b967babf7
Author: Kenneth Knowles 
Date:   2016-08-04T02:09:39Z

Port TrafficMaxLaneFlow to new DoFn

commit 589337562cf59c511dbc49030a88110cbfcd5a3a
Author: Kenneth Knowles 
Date:   2016-08-04T02:10:43Z

Port TrafficRoutes example to new DoFn

commit 616411a1263604182f4cab7e899de3df22fc734d
Author: Kenneth Knowles 
Date:   2016-08-04T02:12:08Z

Port DatastoreWordCount example to new DoFn

commit 4878f0b274c656f4d9951f471b7ef346fca58d1f
Author: Kenneth Knowles 
Date:   2016-08-04T02:13:19Z

Port BigQueryTornadoes example to new DoFn

commit 480926d9591bfa0dace4af6d6883650bae61bb99
Author: Kenneth Knowles 
Date:   2016-08-04T02:13:58Z

Port MaxPerKeyExamples to new DoFn

commit 607ed16fdc38ad19fc711844e0c55da6306d0882
Author: Kenneth Knowles 
Date:   2016-08-04T02:14:37Z

Port CombinePerKeyExamples to new DoFn

commit e2262521eb4e84a258bfff03edab1440e91fd9f3
Author: Kenneth Knowles 
Date:   2016-08-04T02:15:56Z

Port TriggerExample to new DoFn

commit 8b376606d9a956a2be9b70508010a19d34584d81
Author: Kenneth Knowles 
Date:   2016-08-04T02:17:26Z

Port JoinExamples to new DoFn

commit 00e19ae9e690e35e73a0f8aff2c1a371d80c
Author: Kenneth Knowles 
Date:   2016-08-04T02:18:07Z

Port FilterExamples to new DoFn

commit 16b9ca531970b6b32c91229df80926fa0d99714c
Author: Kenneth Knowles 
Date:   2016-08-04T02:18:38Z

fixup! TriggerExample

commit ba47f11fb0d1aa99141ab12a4a3665a52d1e016e
Author: Kenneth Knowles 
Date:   2016-08-04T02:19:38Z

Fix mention of DoFn in WordCountTest




> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>






[jira] [Commented] (BEAM-396) Coder.NonDeterministicException doesn't inherit from Exception

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406921#comment-15406921
 ] 

ASF GitHub Bot commented on BEAM-396:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/776


> Coder.NonDeterministicException doesn't inherit from Exception
> --
>
> Key: BEAM-396
> URL: https://issues.apache.org/jira/browse/BEAM-396
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Chandni Singh
>Priority: Minor
>  Labels: findbugs, newbie, starter
>
> [FindBugs 
> NM_CLASS_NOT_EXCEPTION|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L67]:
>  Class is not derived from an Exception, even though it is named as such.
> Applies to 
> [Coder.NonDeterministicException|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java#L263]
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.





[jira] [Commented] (BEAM-522) Update FileSink.finalize_write() to be idempotent

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406959#comment-15406959
 ] 

ASF GitHub Bot commented on BEAM-522:
-

GitHub user chamikaramj opened a pull request:

https://github.com/apache/incubator-beam/pull/779

[BEAM-522] Fixes GcsIO.exists() to properly handle files that do not exist

Currently this invocation fails for nonexistent files instead of returning 
false.

Updates FileSink.finalize_write() so that we capture and log any transient 
errors that get thrown at the channel_factory.exists() call.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/incubator-beam 
sink_finalize_fix_idempotency

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #779


commit 792c3b5c79b6e979bc34bcf457f8a33cebd74daf
Author: Chamikara Jayalath 
Date:   2016-08-04T01:25:41Z

Fixes GcsIO.exists() to properly handle files that do not exist.

Currently this invocation fails for nonexistent files instead of returning 
false.

Updates FileSink.finalize_write() so that we capture and log any transient 
errors that get thrown at the channel_factory.exists() call.




> Update FileSink.finalize_write() to be idempotent
> -
>
> Key: BEAM-522
> URL: https://issues.apache.org/jira/browse/BEAM-522
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> Currently FileSink.finalize_write() in fileio.py [1] performs the following 
> operations.
> (1) Obtains a list of temporary files as a side input
> (2) Renames each temporary file to the location where final output should be 
> stored.
> The iobase.Sink.finalize_write() operation should be idempotent, since runner 
> implementations may call it multiple times due to task failures. 
> The current implementation is not idempotent: if we re-run the operation 
> after renaming a subset of the files, it may fail because some files can no 
> longer be found at the source location (for example, [2] for GCS files).
> We can fix this by checking whether the destination file already exists 
> before performing the rename, and skipping the rename for files that are 
> already available at the destination.
> [1] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L503
> [2] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/gcsio.py#L187
>  





[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-08-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407076#comment-15407076
 ] 

ASF GitHub Bot commented on BEAM-498:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/782

[BEAM-498] Port easy bits of the SDK to new DoFn

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @bjchambers 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam transforms-new-DoFn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/782.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #782


commit 3fd414f9dd2b2aeb41ab7a24b635adb234ec5056
Author: Kenneth Knowles 
Date:   2016-08-04T02:55:21Z

Port join library to new DoFn

commit ab22cd9933bd13571c03e7dd66e4d01a6e38d7f3
Author: Kenneth Knowles 
Date:   2016-08-04T02:56:33Z

Port mentions of OldDoFn in PipelineOptions

commit 866d2c7dda01e04ff040b2ed655e6390c6b56ef4
Author: Kenneth Knowles 
Date:   2016-08-04T03:15:12Z

Port easy Java SDK tests to new DoFn

commit 2504240de6115139addf051c354fae4b3c49b67c
Author: Kenneth Knowles 
Date:   2016-08-04T03:15:58Z

Port PAssert to new DoFn

commit d16cc7f7dc14eaab1564b42177980ada149f7f99
Author: Kenneth Knowles 
Date:   2016-08-04T03:22:26Z

Port easy I/O transforms to new DoFn

commit 3f949838df812012175a086c287e99c25bca894e
Author: Kenneth Knowles 
Date:   2016-08-04T03:27:28Z

Port easy transforms to new DoFn




> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>






[jira] [Commented] (BEAM-550) Datastore should support writes for Unbounded PCollections

2016-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420630#comment-15420630
 ] 

ASF GitHub Bot commented on BEAM-550:
-

GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/825

[BEAM-550] DatastoreIO Write PTransform for Bounded/Unbounded PCollections

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

- Implement Datastore Write as a composite transform of ParDo without using 
the Sink API
- Generalize the transform for handling idempotent Datastore Mutations. 
This allows for a Delete transform for deleting entities. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam vikasrk/mutation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/825.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #825


commit c7729a944bb4cc287a3c537a130503385b8c1e1c
Author: Vikas Kedigehalli 
Date:   2016-08-12T06:05:09Z

DatastoreIO Write PTransform for Bounded/Unbounded PCollections




> Datastore should support writes for Unbounded PCollections 
> ---
>
> Key: BEAM-550
> URL: https://issues.apache.org/jira/browse/BEAM-550
> Project: Beam
>  Issue Type: Bug
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>






[jira] [Commented] (BEAM-206) FileBasedSink does serial renames

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420651#comment-15420651
 ] 

ASF GitHub Bot commented on BEAM-206:
-

GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/826

[BEAM-206] FileBasedSink: improve parallelism in GCS copy/remove



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam file-based-sink-rename

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #826


commit b0671496fd7d7b182a617da0daefee1509e9c9da
Author: Dan Halperin 
Date:   2016-08-15T06:08:21Z

FileBasedSink: improve parallelism in GCS copy/remove




> FileBasedSink does serial renames
> -
>
> Key: BEAM-206
> URL: https://issues.apache.org/jira/browse/BEAM-206
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> This code can be the bottleneck if there are many small files. Should be 
> parallelized when possible.
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java#L633
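
To illustrate why parallelism helps when there are many small files, here is a minimal sketch that issues I/O-bound renames through a thread pool. It is not the change made in the pull request: `rename_all` is a hypothetical helper, the `rename` callable defaults to `os.rename` purely as a stand-in for a remote copy/remove, and the worker count is an arbitrary assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def rename_all(pairs, rename=os.rename, max_workers=8):
    """Apply rename(src, dst) to every pair concurrently.

    Hypothetical helper for illustration. Each rename is dominated by I/O
    latency, so threads can overlap the waits even for many small files.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(rename, src, dst) for src, dst in pairs]
        for future in futures:
            # result() re-raises any exception raised in the worker thread,
            # so a failed rename is not silently dropped.
            future.result()
    return len(futures)
```

Because failures surface through `result()`, a caller can retry the whole batch, which pairs naturally with an idempotent rename step.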





[jira] [Commented] (BEAM-418) BitSetCoder transient field byteArrayCoder not set after deserialization

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416562#comment-15416562
 ] 

ASF GitHub Bot commented on BEAM-418:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/818


> BitSetCoder transient field byteArrayCoder not set after deserialization
> 
>
> Key: BEAM-418
> URL: https://issues.apache.org/jira/browse/BEAM-418
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Priority: Minor
>  Labels: findbugs, newbie, starter
>
> [FindBugs 
> SE_TRANSIENT_FIELD_NOT_RESTORED|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L357]:
>  Transient field that isn't set by deserialization.
> Applies to: 
> [BitSetCoder.byteArrayCoder|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/util/BitSetCoder.java#L35].
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.





[jira] [Commented] (BEAM-418) BitSetCoder transient field byteArrayCoder not set after deserialization

2016-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416228#comment-15416228
 ] 

ASF GitHub Bot commented on BEAM-418:
-

GitHub user gauravgopi123 opened a pull request:

https://github.com/apache/incubator-beam/pull/818

[BEAM-418] Setting transient field that isn't set by deserialization

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gauravgopi123/incubator-beam BEAM-418

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/818.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #818


commit 4f3d01183b28f14980b2b9da5bdb66ea6a09c4c9
Author: gaurav gupta 
Date:   2016-08-10T23:43:03Z

Setting transient field that isn't set by deserialization




> BitSetCoder transient field byteArrayCoder not set after deserialization
> 
>
> Key: BEAM-418
> URL: https://issues.apache.org/jira/browse/BEAM-418
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Priority: Minor
>  Labels: findbugs, newbie, starter
>
> [FindBugs 
> SE_TRANSIENT_FIELD_NOT_RESTORED|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L357]:
>  Transient field that isn't set by deserialization.
> Applies to: 
> [BitSetCoder.byteArrayCoder|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/util/BitSetCoder.java#L35].
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.





[jira] [Commented] (BEAM-285) Make MinLongFn and MaxLongFn mimic SumLongFn and use BinaryCombineLongFn

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375485#comment-15375485
 ] 

ASF GitHub Bot commented on BEAM-285:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/552


> Make MinLongFn and MaxLongFn mimic SumLongFn and use BinaryCombineLongFn
> 
>
> Key: BEAM-285
> URL: https://issues.apache.org/jira/browse/BEAM-285
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Mark Shields
>Assignee: Pei He
>
> Ditto for the other 'optimized accumulator' combiner functions.





[jira] [Commented] (BEAM-446) Improve IOChannelUtils.resolve() to accept multiple paths at once

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375472#comment-15375472
 ] 

ASF GitHub Bot commented on BEAM-446:
-

GitHub user markflyhigh opened a pull request:

https://github.com/apache/incubator-beam/pull/646

[BEAM-446] Improve IOChannelUtils.resolve to accept multiple paths

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markflyhigh/incubator-beam 
improve-iochannelutils-resolve

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/646.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #646


commit db38493615b47c11ca5831f58c9d24140d5ef829
Author: Mark Liu 
Date:   2016-07-13T18:08:18Z

[BEAM-446] Improve IOChannelUtils.resolve to accept multiple paths




> Improve IOChannelUtils.resolve() to accept multiple paths at once
> -
>
> Key: BEAM-446
> URL: https://issues.apache.org/jira/browse/BEAM-446
> Project: Beam
>  Issue Type: Improvement
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Minor
>
> Currently, the IOChannelUtils.resolve() method can only resolve one path 
> against a base path. 
> It would be useful to have another method whose arguments include one base 
> path and multiple others. The returned string will be a directory that starts 
> with the base path and appends the rest, separated by the file separator.
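
The behaviour described above, one base path followed by any number of further components, could look roughly like this. It is a Python sketch of the idea only: the function name, the `sep` parameter, and the handling of stray separators are assumptions for illustration, not the Java method added to IOChannelUtils.

```python
def resolve(base, *others, sep="/"):
    """Join `base` with each of `others`, in order, using `sep`.

    Hypothetical sketch: leading and trailing separators on the extra
    components are stripped, so every component is treated as relative.
    """
    path = base
    for part in others:
        path = path.rstrip(sep) + sep + part.strip(sep)
    return path
```

For example, `resolve("gs://bucket/tmp", "a", "b")` yields `gs://bucket/tmp/a/b`, and calling it with only a base path returns that path unchanged.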





[jira] [Commented] (BEAM-446) Improve IOChannelUtils.resolve() to accept multiple paths at once

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375632#comment-15375632
 ] 

ASF GitHub Bot commented on BEAM-446:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/646


> Improve IOChannelUtils.resolve() to accept multiple paths at once
> -
>
> Key: BEAM-446
> URL: https://issues.apache.org/jira/browse/BEAM-446
> Project: Beam
>  Issue Type: Improvement
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Minor
> Fix For: 0.2.0-incubating
>
>
> Currently, the IOChannelUtils.resolve() method can only resolve one path 
> against a base path. 
> It would be useful to have another method whose arguments include one base 
> path and multiple others. The returned string will be a directory that starts 
> with the base path and appends the rest, separated by the file separator.





[jira] [Commented] (BEAM-434) Limit the number of output files a beam-examples execution writes

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376035#comment-15376035
 ] 

ASF GitHub Bot commented on BEAM-434:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/651

[BEAM-434] Control the number of output shards in the Direct Runner

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Add a Write Override Factory that limits the number of shards when the
number is unspecified. This ensures that we will not write an output file
per-key due to bundling.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam write_sharding

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/651.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #651


commit dfecd170d67c90ae20a4baed3986bd1c7116e79e
Author: Thomas Groh 
Date:   2016-07-13T21:26:10Z

Control the number of output shards in the Direct Runner

Add a Write Override Factory that limits the number of shards when the
number is unspecified. This ensures that we will not write an output file
per-key due to bundling.




> Limit the number of output files a beam-examples execution writes
> -
>
> Key: BEAM-434
> URL: https://issues.apache.org/jira/browse/BEAM-434
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
>
> When using `TextIO.Write.to("/path/to/output")` without any restrictions on 
> the number of shards, it might generate many output files (depending on your 
> input). For WordCount, for example, you'll get as many output files as there 
> are unique words in your input.
> Since I think examples are expected to run in a friendly manner that lets you 
> "see" what they do, rather than being optimized for performance, I suggest using 
> `withoutSharding()` when writing the example output to an output file.
> Examples I could find that behave this way:
> org.apache.beam.examples.WordCount
> org.apache.beam.examples.complete.TfIdf
> org.apache.beam.examples.cookbook.DeDupExample
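
To make the file-count problem concrete, the sketch below assigns records to a fixed number of shards, so the number of output files is bounded by the shard count rather than by the number of distinct keys. Everything here is an assumption for illustration: the helper names, the CRC32-based assignment, and the `prefix-SSSSS-of-NNNNN` naming pattern are not taken from TextIO or from the runner override in the pull request.

```python
import zlib
from collections import defaultdict


def shard_name(prefix, index, total):
    """Format a shard file name such as 'output-00000-of-00003'."""
    return "%s-%05d-of-%05d" % (prefix, index, total)


def assign_to_shards(records, num_shards, prefix="output"):
    """Group string records into at most `num_shards` named shards.

    CRC32 is used as a stable hash so the assignment is reproducible
    across runs (unlike Python's built-in hash() for strings).
    """
    shards = defaultdict(list)
    for record in records:
        index = zlib.crc32(record.encode("utf-8")) % num_shards
        shards[shard_name(prefix, index, num_shards)].append(record)
    return dict(shards)
```

With `num_shards=1` every record lands in a single file, which corresponds to the unsharded output suggested in the issue for the examples.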





[jira] [Commented] (BEAM-372) CoderProperties: Test that the coder doesn't consume more bytes than it produces

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387771#comment-15387771
 ] 

ASF GitHub Bot commented on BEAM-372:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/695


> CoderProperties: Test that the coder doesn't consume more bytes than it 
> produces
> 
>
> Key: BEAM-372
> URL: https://issues.apache.org/jira/browse/BEAM-372
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Chandni Singh
>Priority: Minor
>  Labels: beginner, newbie, starter
>
> Add a test to CoderProperties that does the following:
> 1. Encode a value using the Coder in the nested context
> 2. Decode the value using the Coder in the nested context
> 3. Verify that the input stream wasn't requested to consume any extra bytes
> (This could possibly just be an enhancement to the existing round-trip 
> encode/decode test)
> When a coder consumes extra bytes, it can lead to situations that are very 
> difficult to debug in a coder wrapped around the problematic coder. This 
> would be an easy test that would clearly fail *for the coder which was 
> problematic*.
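
The three steps above can be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: `MiniCoder`, `CountingInputStream`, `LENGTH_PREFIXED` and `OVER_READING` are hypothetical names invented here, not Beam's `Coder` or `CoderProperties` API, and the trailing bytes appended after the encoded value merely stand in for the nested context.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; none of these types are part of Beam. */
final class CoderConsumptionCheck {

  /** Minimal stand-in for a coder that is used in a nested context. */
  interface MiniCoder<T> {
    void encode(T value, OutputStream out) throws IOException;
    T decode(InputStream in) throws IOException;
  }

  /** Counts every byte that a decoder pulls from the underlying stream. */
  static final class CountingInputStream extends FilterInputStream {
    long consumed = 0;

    CountingInputStream(InputStream in) { super(in); }

    @Override public int read() throws IOException {
      int b = super.read();
      if (b >= 0) consumed++;
      return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
      int n = super.read(buf, off, len);
      if (n > 0) consumed += n;
      return n;
    }

    @Override public long skip(long n) throws IOException {
      long skipped = super.skip(n);
      consumed += skipped;
      return skipped;
    }
  }

  /** True when the value round-trips and decoding consumed exactly the encoded bytes. */
  static <T> boolean consumesExactlyWhatItProduces(MiniCoder<T> coder, T value) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      coder.encode(value, out);                        // step 1: encode
      int produced = out.size();
      // Trailing bytes stand in for whatever an enclosing coder would write next.
      out.write(new byte[] {1, 2, 3, 4, 5, 6, 7, 8});

      CountingInputStream in =
          new CountingInputStream(new ByteArrayInputStream(out.toByteArray()));
      T decoded = coder.decode(in);                    // step 2: decode
      return value.equals(decoded) && in.consumed == produced;  // step 3: verify
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Well-behaved: writes a length prefix and reads back exactly that many bytes. */
  static final MiniCoder<String> LENGTH_PREFIXED = new MiniCoder<String>() {
    @Override public void encode(String value, OutputStream out) throws IOException {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      DataOutputStream data = new DataOutputStream(out);
      data.writeInt(bytes.length);
      data.write(bytes);
      data.flush();
    }

    @Override public String decode(InputStream in) throws IOException {
      DataInputStream data = new DataInputStream(in);
      byte[] bytes = new byte[data.readInt()];
      data.readFully(bytes);
      return new String(bytes, StandardCharsets.UTF_8);
    }
  };

  /** Problematic: decodes correctly, but its buffering reads ahead into the trailing bytes. */
  static final MiniCoder<String> OVER_READING = new MiniCoder<String>() {
    @Override public void encode(String value, OutputStream out) throws IOException {
      LENGTH_PREFIXED.encode(value, out);
    }

    @Override public String decode(InputStream in) throws IOException {
      return LENGTH_PREFIXED.decode(new BufferedInputStream(in));
    }
  };
}
```

Note that `OVER_READING` still returns the right value, so a plain round-trip check would pass; only the byte count exposes the problem, which is the point of the proposed test.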





[jira] [Commented] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386637#comment-15386637
 ] 

ASF GitHub Bot commented on BEAM-124:
-

GitHub user markflyhigh opened a pull request:

https://github.com/apache/incubator-beam/pull/703

[BEAM-124] Flink and Spark running Examples WordCountIT

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

+R: @dhalperi @jasonkuster

This PR is an updated version of an old 
[PR](https://github.com/apache/incubator-beam/pull/345).

Removes the dependencies of the Spark and Flink runners on the Beam Java 
examples so that WordCountIT can run successfully with the Spark and Flink 
runners. The following commands are used for the different runners:

- Spark runner:

`mvn clean verify -pl examples/java -am -rf :beam-examples-java 
-DskipITs=false -DintegrationTestPipelineOptions='[ "--tempRoot=/tmp", 
"--inputFile=/tmp/kinglear.txt", 
"--runner=org.apache.beam.runners.spark.SparkRunner" ]'`

- Flink runner:

`mvn clean verify -pl examples/java -am -rf :beam-examples-java 
-DskipITs=false -DintegrationTestPipelineOptions='[ 
"--tempRoot=gs://clouddfe-testing-temp-storage", 
"--runner=org.apache.beam.runners.flink.FlinkRunner" ]'`

- Dataflow test runner:

`mvn clean verify -pl examples/java -am -rf :beam-examples-java 
-DskipITs=false -DintegrationTestPipelineOptions='[ 
"--tempRoot=gs://clouddfe-testing-temp-storage", 
"--runner=org.apache.beam.runners.dataflow.testing.TestDataflowRunner" ]'`



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markflyhigh/incubator-beam 
wordcountit-spark-flink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/703.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #703


commit 37753b0f3c7969ba6315c4326a958cc3c9bbbc50
Author: Mark Liu 
Date:   2016-07-20T00:37:50Z

[Beam-124] Run WordCountIT with Spark and Flink runner

commit d8581c7b1bf8d14910d2f5c335caf25eba56427a
Author: Mark Liu 
Date:   2016-07-20T18:57:11Z

Remove dependency of Spark and Flink runner on Beam java examples




> Testing -- End to End WordCount Batch and Streaming Tests
> -
>
> Key: BEAM-124
> URL: https://issues.apache.org/jira/browse/BEAM-124
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Steve Wheeler
>Assignee: Mark Liu
>
> Set up testing infrastructure so that an end to end test for WordCount (both 
> batch and streaming) will be run periodically. 





[jira] [Commented] (BEAM-445) Beam-examples-java build failed through local "mvn install"

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386570#comment-15386570
 ] 

ASF GitHub Bot commented on BEAM-445:
-

Github user dhalperi closed the pull request at:

https://github.com/apache/incubator-beam/pull/701


> Beam-examples-java build failed through local "mvn install"
> ---
>
> Key: BEAM-445
> URL: https://issues.apache.org/jira/browse/BEAM-445
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
> Environment: linux
>Reporter: Mark Liu
>Assignee: Daniel Halperin
>Priority: Critical
> Fix For: Not applicable
>
>
> Building the project under beam/examples/java with the command "mvn clean 
> install -DskipTests" failed with the following error:
> [ERROR] Failed to execute goal on project beam-examples-java: Could not 
> resolve dependencies for project 
> org.apache.beam:beam-examples-java:jar:0.2.0-incubating-SNAPSHOT: Could not 
> transfer artifact 
> io.netty:netty-tcnative-boringssl-static:jar:${os.detected.classifier}:1.1.33.Fork13
>  from/to central (http://repo.maven.apache.org/maven2): Illegal character in 
> path at index 138: 
> http://repo.maven.apache.org/maven2/io/netty/netty-tcnative-boringssl-static/1.1.33.Fork13/netty-tcnative-boringssl-static-1.1.33.Fork13-${os.detected.classifier}.jar
> Reason: can't resolve ${os.detected.classifier} in 
> beam/sdks/java/io/google-cloud-platform/pom file.





[jira] [Commented] (BEAM-460) Implement Python support for size-estimation aggregators

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386534#comment-15386534
 ] 

ASF GitHub Bot commented on BEAM-460:
-

Github user charlesccychen closed the pull request at:

https://github.com/apache/incubator-beam/pull/678


> Implement Python support for size-estimation aggregators
> 
>
> Key: BEAM-460
> URL: https://issues.apache.org/jira/browse/BEAM-460
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Charles Chen
>Assignee: Charles Chen
>
> Size-estimation aggregators are provided by the execution of each step in a 
> Beam pipeline and help clarify the amount of data processed by each pipeline 
> step.  To ease implementation of this feature for runners, we should first 
> expose size-estimation support for Coder objects.  Runners can then use this 
> functionality to implement full support for size-estimation aggregators.
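
As a rough illustration of the layering described above, where coders expose a size estimate and a runner sums it per step, here is a small sketch. The issue concerns the Python SDK; Java is used here purely for illustration, and `SizedCoder`, `estimateSize`, `CountingOutputStream` and `totalEstimatedBytes` are hypothetical names, not an existing Beam API.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; none of these types are part of Beam. */
final class SizeEstimationSketch {

  /** Output stream that discards the data and only counts the bytes written. */
  static final class CountingOutputStream extends OutputStream {
    long count = 0;

    @Override public void write(int b) { count++; }

    @Override public void write(byte[] buf, int off, int len) { count += len; }
  }

  /** Stand-in for a coder that can also report how large an encoded value would be. */
  interface SizedCoder<T> {
    void encode(T value, OutputStream out) throws IOException;

    /** Default estimate: encode into a counting stream without keeping the bytes. */
    default long estimateSize(T value) {
      CountingOutputStream counter = new CountingOutputStream();
      try {
        encode(value, counter);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      return counter.count;
    }
  }

  /** Variable-width coder that relies on the default estimate. */
  static final SizedCoder<String> UTF8_TEXT = new SizedCoder<String>() {
    @Override public void encode(String value, OutputStream out) throws IOException {
      out.write(value.getBytes(StandardCharsets.UTF_8));
    }
  };

  /** Fixed-width coder that can answer without encoding anything. */
  static final SizedCoder<Long> FIXED_LONG = new SizedCoder<Long>() {
    @Override public void encode(Long value, OutputStream out) throws IOException {
      new DataOutputStream(out).writeLong(value);
    }

    @Override public long estimateSize(Long value) { return 8; }
  };

  /** What a runner-side aggregator might do: sum the estimates for one step's elements. */
  static <T> long totalEstimatedBytes(SizedCoder<T> coder, Iterable<T> elements) {
    long total = 0;
    for (T element : elements) {
      total += coder.estimateSize(element);
    }
    return total;
  }
}
```

The design choice sketched here is that every coder gets a correct but potentially expensive default, while coders with a known width can override it cheaply.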





[jira] [Commented] (BEAM-386) Dataflow runner to support Read.Bounded in streaming mode.

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386770#comment-15386770
 ] 

ASF GitHub Bot commented on BEAM-386:
-

GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/704

[BEAM-386] Remove StreamingCreate in DataflowRunner

Create is based on BoundedSource, which is supported in streaming mode 
through UnboundedReadFromBoundedSource.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam rm-streaming-create

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/704.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #704


commit 2e3fe2f7e13858f382090e2572f406b5f35da964
Author: Pei He 
Date:   2016-07-20T22:47:05Z

Remove StreamingCreate




> Dataflow runner to support Read.Bounded in streaming mode.
> --
>
> Key: BEAM-386
> URL: https://issues.apache.org/jira/browse/BEAM-386
> Project: Beam
>  Issue Type: New Feature
>Reporter: Pei He
>Assignee: Pei He
> Fix For: 0.2.0-incubating
>
>
> UnboundedReadFromBoundedSource is done.
> Make Dataflow runner use it.





[jira] [Commented] (BEAM-79) Gearpump runner

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386813#comment-15386813
 ] 

ASF GitHub Bot commented on BEAM-79:


Github user manuzhang closed the pull request at:

https://github.com/apache/incubator-beam/pull/323


> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Tyler Akidau
>Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.





[jira] [Commented] (BEAM-445) Beam-examples-java build failed through local "mvn install"

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386825#comment-15386825
 ] 

ASF GitHub Bot commented on BEAM-445:
-

GitHub user dhalperi reopened a pull request:

https://github.com/apache/incubator-beam/pull/701

[BEAM-445] BigtableIO: upgrade to 0.9.1

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

* Use the uber jar
* Remove OS classifier mumbo jumbo
* Move common dependency versioning to root pom

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam bigtable-0.9.1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/701.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #701


commit 6d6e0f22c652b1c540a1c3f29091312231f3833a
Author: Dan Halperin 
Date:   2016-07-20T17:26:02Z

BigtableIO: upgrade to 0.9.1

* Use the uber jar
* Remove OS classifier mumbo jumbo
* Move common dependency versioning to root pom

commit 7f6a054d54fc719ae0d6a0718073e07248baa7f8
Author: Dan Halperin 
Date:   2016-07-20T20:31:26Z

fixup! BigtableIO: upgrade to 0.9.1

commit 686d0d33b99036e5d8f872a248ab34fa90b17092
Author: Dan Halperin 
Date:   2016-07-20T20:52:05Z

fixup! BigtableIO: upgrade to 0.9.1

commit ef5b79d0eb94db8a298d04045d12cc220855de5d
Author: Dan Halperin 
Date:   2016-07-20T21:47:47Z

fixup! BigtableIO: upgrade to 0.9.1

commit 5ef3514126f7e12ecde596e86bb6758a41fc4f6e
Author: Dan Halperin 
Date:   2016-07-20T23:05:19Z

fixup! BigtableIO: upgrade to 0.9.1




> Beam-examples-java build failed through local "mvn install"
> ---
>
> Key: BEAM-445
> URL: https://issues.apache.org/jira/browse/BEAM-445
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
> Environment: linux
>Reporter: Mark Liu
>Assignee: Daniel Halperin
>Priority: Critical
> Fix For: Not applicable
>
>
> Building the project under beam/examples/java with the command "mvn clean 
> install -DskipTests" failed with the following error:
> [ERROR] Failed to execute goal on project beam-examples-java: Could not 
> resolve dependencies for project 
> org.apache.beam:beam-examples-java:jar:0.2.0-incubating-SNAPSHOT: Could not 
> transfer artifact 
> io.netty:netty-tcnative-boringssl-static:jar:${os.detected.classifier}:1.1.33.Fork13
>  from/to central (http://repo.maven.apache.org/maven2): Illegal character in 
> path at index 138: 
> http://repo.maven.apache.org/maven2/io/netty/netty-tcnative-boringssl-static/1.1.33.Fork13/netty-tcnative-boringssl-static-1.1.33.Fork13-${os.detected.classifier}.jar
> Reason: can't resolve ${os.detected.classifier} in 
> beam/sdks/java/io/google-cloud-platform/pom file.





[jira] [Commented] (BEAM-445) Beam-examples-java build failed through local "mvn install"

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387977#comment-15387977
 ] 

ASF GitHub Bot commented on BEAM-445:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/701


> Beam-examples-java build failed through local "mvn install"
> ---
>
> Key: BEAM-445
> URL: https://issues.apache.org/jira/browse/BEAM-445
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
> Environment: linux
>Reporter: Mark Liu
>Assignee: Daniel Halperin
>Priority: Critical
> Fix For: Not applicable
>
>
> Building the project under beam/examples/java with the command "mvn clean 
> install -DskipTests" failed with the following error:
> [ERROR] Failed to execute goal on project beam-examples-java: Could not 
> resolve dependencies for project 
> org.apache.beam:beam-examples-java:jar:0.2.0-incubating-SNAPSHOT: Could not 
> transfer artifact 
> io.netty:netty-tcnative-boringssl-static:jar:${os.detected.classifier}:1.1.33.Fork13
>  from/to central (http://repo.maven.apache.org/maven2): Illegal character in 
> path at index 138: 
> http://repo.maven.apache.org/maven2/io/netty/netty-tcnative-boringssl-static/1.1.33.Fork13/netty-tcnative-boringssl-static-1.1.33.Fork13-${os.detected.classifier}.jar
> Reason: can't resolve ${os.detected.classifier} in 
> beam/sdks/java/io/google-cloud-platform/pom file.





[jira] [Commented] (BEAM-462) Replace MonitoringUtil.PrintHandler with a handler that utilizes a Java logger

2016-07-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382496#comment-15382496
 ] 

ASF GitHub Bot commented on BEAM-462:
-

GitHub user lukecwik opened a pull request:

https://github.com/apache/incubator-beam/pull/673

[BEAM-462] Replace MonitoringUtil.PrintHandler with a handler that utilizes 
a Java logger

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam logginghandler

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit f9dcca2f26add731b9412ddf3705d6bcb6726008
Author: Luke Cwik 
Date:   2016-07-18T15:51:32Z

[BEAM-462] Replace MonitoringUtil.PrintHandler with a handler that utilizes 
a Java logger




> Replace MonitoringUtil.PrintHandler with a handler that utilizes a Java logger
> --
>
> Key: BEAM-462
> URL: https://issues.apache.org/jira/browse/BEAM-462
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
>
> The current PrintHandler in MonitoringUtil only supports output to a print 
> stream.
> It would be cleaner to instead have a LoggingHandler that can be used with 
> normal Java logging.  The handler could translate the JobMessage importance 
> to a severity, thus allowing formatting/filtering to be determined by the 
> logger instead of custom printing code.
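
A minimal sketch of the idea, using only `java.util.logging`, might look like the following. The `Importance` enum, the `Message` class and the particular importance-to-level mapping are assumptions made for illustration; the actual JobMessage importance values and the handler added by the pull request may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch; the names below are not taken from MonitoringUtil. */
final class LoggingHandlerSketch {

  /** Assumed importance values, ordered from least to most important. */
  enum Importance { DEBUG, DETAILED, BASIC, WARNING, ERROR }

  /** Stand-in for a job message: an importance plus some text. */
  static final class Message {
    final Importance importance;
    final String text;

    Message(Importance importance, String text) {
      this.importance = importance;
      this.text = text;
    }
  }

  /** Translates an importance into a java.util.logging severity (illustrative mapping). */
  static Level toLevel(Importance importance) {
    switch (importance) {
      case DEBUG: return Level.FINER;
      case DETAILED: return Level.FINE;
      case BASIC: return Level.INFO;
      case WARNING: return Level.WARNING;
      case ERROR: return Level.SEVERE;
      default: throw new IllegalArgumentException("Unknown importance: " + importance);
    }
  }

  /** Forwards each message to a logger and leaves formatting/filtering to that logger. */
  static final class LoggingHandler {
    private final Logger logger;

    LoggingHandler(Logger logger) { this.logger = logger; }

    void process(List<Message> messages) {
      for (Message message : messages) {
        logger.log(toLevel(message.importance), message.text);
      }
    }
  }

  /** Sends three messages through a logger filtered at INFO and returns what got through. */
  static List<String> demo() {
    final List<String> seen = new ArrayList<>();
    Logger logger = Logger.getLogger("LoggingHandlerSketch.demo");
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.INFO);
    Handler capture = new Handler() {
      @Override public void publish(LogRecord record) {
        seen.add(record.getLevel() + ": " + record.getMessage());
      }
      @Override public void flush() {}
      @Override public void close() {}
    };
    logger.addHandler(capture);
    try {
      new LoggingHandler(logger).process(Arrays.asList(
          new Message(Importance.DEBUG, "worker pool resized"),
          new Message(Importance.BASIC, "job started"),
          new Message(Importance.ERROR, "step failed")));
    } finally {
      logger.removeHandler(capture);
    }
    return seen;
  }
}
```

In `demo()` the DEBUG message is dropped by the logger's own level setting, with no filtering code in the handler itself, which is the benefit the issue describes.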





[jira] [Commented] (BEAM-462) Replace MonitoringUtil.PrintHandler with a handler that utilizes a Java logger

2016-07-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382541#comment-15382541
 ] 

ASF GitHub Bot commented on BEAM-462:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/673


> Replace MonitoringUtil.PrintHandler with a handler that utilizes a Java logger
> --
>
> Key: BEAM-462
> URL: https://issues.apache.org/jira/browse/BEAM-462
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
>
> The current PrintHandler in MonitoringUtil only supports output to a print 
> stream.
> It would be cleaner to instead have a LoggingHandler that can be used with 
> normal Java logging.  The handler could translate the JobMessage importance 
> to a severity, thus allowing formatting/filtering to be determined by the 
> logger instead of custom printing code.





[jira] [Commented] (BEAM-428) InProcessRunner - Bundle based local runner

2016-07-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382597#comment-15382597
 ] 

ASF GitHub Bot commented on BEAM-428:
-

Github user aaltay closed the pull request at:

https://github.com/apache/incubator-beam/pull/598


> InProcessRunner - Bundle based local runner
> ---
>
> Key: BEAM-428
> URL: https://issues.apache.org/jira/browse/BEAM-428
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>
> InProcessRunner is a bundle-based drop-in replacement for DirectRunner.
> Similar to its Java equivalent, it improves on DirectRunner by executing 
> transforms in parallel using bundles, similar to service-based 
> implementations. It offers better performance and more validation options.
> Initially it will be a runner for executing batch jobs only. The target of 
> this phase is to develop a drop-in replacement for DirectRunner. Later it 
> will be improved by adding streaming execution.
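
To make "executing transforms in parallel using bundles" concrete, here is a small sketch of the general technique: split the input into bundles, process each bundle on a thread pool, and collect the outputs in bundle order. The issue concerns the Python SDK; Java is used here only for illustration, and `BundleExecutionSketch`, `toBundles` and `runInBundles` are hypothetical names that say nothing about how InProcessRunner is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch of bundle-based parallel execution; not Beam code. */
final class BundleExecutionSketch {

  /** Splits the input into consecutive bundles of at most bundleSize elements. */
  static <T> List<List<T>> toBundles(List<T> elements, int bundleSize) {
    List<List<T>> bundles = new ArrayList<>();
    for (int start = 0; start < elements.size(); start += bundleSize) {
      int end = Math.min(start + bundleSize, elements.size());
      bundles.add(new ArrayList<>(elements.subList(start, end)));
    }
    return bundles;
  }

  /** Applies fn to every element, one task per bundle, and returns outputs in input order. */
  static <T, R> List<R> runInBundles(
      List<T> input, int bundleSize, int threads, Function<T, R> fn) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<List<R>>> pending = new ArrayList<>();
      for (List<T> bundle : toBundles(input, bundleSize)) {
        pending.add(pool.submit(() -> {
          List<R> output = new ArrayList<>();
          for (T element : bundle) {
            output.add(fn.apply(element));
          }
          return output;
        }));
      }
      // Waiting on the futures in submission order keeps the results deterministic.
      List<R> results = new ArrayList<>();
      for (Future<List<R>> future : pending) {
        results.addAll(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

For example, `runInBundles(Arrays.asList(1, 2, 3, 4, 5), 2, 3, x -> x * x)` processes three bundles concurrently and returns the squares in input order.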





[jira] [Commented] (BEAM-439) The start_bundle/finish_bundle should not allow side inputs

2016-07-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382623#comment-15382623
 ] 

ASF GitHub Bot commented on BEAM-439:
-

Github user silviulica closed the pull request at:

https://github.com/apache/incubator-beam/pull/668


> The start_bundle/finish_bundle should not allow side inputs
> ---
>
> Key: BEAM-439
> URL: https://issues.apache.org/jira/browse/BEAM-439
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Silviu Calinoiu
>Assignee: Frances Perry
>Priority: Minor
>
> Allowing them will create headaches when we support streaming.




