[jira] [Updated] (BEAM-2208) Python SDK wordcount on cloud Dataflow runner is slow

2017-05-08 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-2208:
--
Summary: Python SDK wordcount on cloud Dataflow runner is slow  (was: 
Apache Beam Python SDK is atleast 5 times slower)

> Python SDK wordcount on cloud Dataflow runner is slow
> -
>
> Key: BEAM-2208
> URL: https://issues.apache.org/jira/browse/BEAM-2208
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-py
>Affects Versions: 0.6.0
>Reporter: Anant Bhandarkar
>Assignee: Ahmet Altay
>Priority: Critical
>
> I have been trying to run the Beam Word count example with a 2GB file.
> When I run the Java Example for word count of this csv file the job gets 
> completed in 7.15secs Mins.
> Job ID
> 2017-04-18_23_57_02-2832613177376293063
> But word count example with same file using Python SDK takes 28 to 35mins 
> 2017-04-20_04_48_27-8924552896141769408
> SDK version   
> Apache Beam SDK for Python 0.6.0



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2208) Apache Beam Python SDK is atleast 5 times slower

2017-05-08 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002070#comment-16002070
 ] 

Ahmet Altay commented on BEAM-2208:
---

[~anan...@gmail.com], do you have any recent job runs? Only bare minimum 
metadata information is kept beyond a few weeks for debugging, and I cannot 
tell what is is causing the difference from that information. One thing I 
noticed is the biggest difference in time comes from the reading step. Could 
you check whether GCS gzip compression is enabled for this input file?

> Apache Beam Python SDK is atleast 5 times slower
> 
>
> Key: BEAM-2208
> URL: https://issues.apache.org/jira/browse/BEAM-2208
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-py
>Affects Versions: 0.6.0
>Reporter: Anant Bhandarkar
>Assignee: Ahmet Altay
>Priority: Critical
>
> I have been trying to run the Beam Word count example with a 2GB file.
> When I run the Java Example for word count of this csv file the job gets 
> completed in 7.15secs Mins.
> Job ID
> 2017-04-18_23_57_02-2832613177376293063
> But word count example with same file using Python SDK takes 28 to 35mins 
> 2017-04-20_04_48_27-8924552896141769408
> SDK version   
> Apache Beam SDK for Python 0.6.0



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #3717

2017-05-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-05-08 Thread Jingsong Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002039#comment-16002039
 ] 

Jingsong Lee commented on BEAM-2140:


First, {{SplittableParDo}} should not wrap {{StatefulDoFnRunner}}.

Second, {{SplittableParDo}} use {{PROCESSING_TIME}} to continue processing. And 
it also sets watermark holds which will affect the sending of the output 
watermark. (see {{DoFnOperator.processWatermark1()}}).
When {{BoundedSourceWrapper}} is over, it will emit a Long.MAX_VALUE watermark, 
but the {{SplittableParDo}} may be not over yet. (depends on system time) So no 
one can send watermark to the downstream.

Last, {{StreamTask}} will shutdown when there are no inputs and invoke 
{{timerService.quiesceAndAwaitPending}}. (see {{StreamTask.invoke()}} in Flink)
It will shutdown TimeService and invoke all task in TimeService and reject the 
new registration. So it will break the continue processing of 
{{SplittableParDo}}.

[~aljoscha] Is that right? Please correct me if I wrong.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2227) add a post to explain how Beam SQL works

2017-05-08 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2227:


 Summary: add a post to explain how Beam SQL works
 Key: BEAM-2227
 URL: https://issues.apache.org/jira/browse/BEAM-2227
 Project: Beam
  Issue Type: Task
  Components: dsl-sql, website
Reporter: Xu Mingmin
Assignee: Xu Mingmin


As mentioned in maillist, it's important to clarify what Beam SQL does, and how 
it works.

{quote}
There's some confusion where people think we're just doing a pass through
to the framework's SQL engine. We'll have to make sure we're clear on how
Beam's SQL works in the docs.
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2147) Re-enable UsesTimersInParDo tests for DataflowRunner

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002023#comment-16002023
 ] 

ASF GitHub Bot commented on BEAM-2147:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2988

[BEAM-2147] Re-enable UsesTimersInParDo tests on Dataflow, by fixing 
TestDataflowRunner and PAssert

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam DataflowRunner-timers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2988.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2988


commit 1a4910b1b73330094a1e1033ffde20e41b7024e3
Author: Kenneth Knowles 
Date:   2017-05-09T03:07:29Z

Allow any throwable in PAssert to constitute adequate failure

Currently, some PAssert tests require an AssertionError to be thrown. This
succeeds on all runners only because many gratuitously throw AssertionError
when they don't actually know that an assertion has failed.

The spec is just that a pipeline has to fail. We don't have a good enough 
story
around exception propagation to have such a strict - and fake - spec. And it
isn't cross-language anyhow FWIW, looking forward to the possibility of 
running
a PAssert in a pipeline combining multiple SDK harnesses.

commit 01fd6cc568a7af05f260613197165bfc2f8ee86b
Author: Kenneth Knowles 
Date:   2017-04-30T23:08:48Z

TestDataflowRunner: throw AssertionError only when assertion known failed

It is quite confusing to receive an assertion error when in fact the 
pipeline
has crashed because of user error interacting with e.g. timers.

commit 898731b19a630ffc8ec8062acb0fd53bd4995635
Author: Kenneth Knowles 
Date:   2017-02-14T22:54:11Z

Re-enable UsesTimersInParDo tests in Dataflow runner




> Re-enable UsesTimersInParDo tests for DataflowRunner
> 
>
> Key: BEAM-2147
> URL: https://issues.apache.org/jira/browse/BEAM-2147
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> These are disabled currently because of bugs in the {{TestDataflowRunner}}'s 
> diagnosis of whether something is an {{AssertionError}} or some other 
> {{RuntimeException}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2988: [BEAM-2147] Re-enable UsesTimersInParDo tests on Da...

2017-05-08 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2988

[BEAM-2147] Re-enable UsesTimersInParDo tests on Dataflow, by fixing 
TestDataflowRunner and PAssert

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam DataflowRunner-timers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2988.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2988


commit 1a4910b1b73330094a1e1033ffde20e41b7024e3
Author: Kenneth Knowles 
Date:   2017-05-09T03:07:29Z

Allow any throwable in PAssert to constitute adequate failure

Currently, some PAssert tests require an AssertionError to be thrown. This
succeeds on all runners only because many gratuitously throw AssertionError
when they don't actually know that an assertion has failed.

The spec is just that a pipeline has to fail. We don't have a good enough 
story
around exception propagation to have such a strict - and fake - spec. And it
isn't cross-language anyhow FWIW, looking forward to the possibility of 
running
a PAssert in a pipeline combining multiple SDK harnesses.

commit 01fd6cc568a7af05f260613197165bfc2f8ee86b
Author: Kenneth Knowles 
Date:   2017-04-30T23:08:48Z

TestDataflowRunner: throw AssertionError only when assertion known failed

It is quite confusing to receive an assertion error when in fact the 
pipeline
has crashed because of user error interacting with e.g. timers.

commit 898731b19a630ffc8ec8062acb0fd53bd4995635
Author: Kenneth Knowles 
Date:   2017-02-14T22:54:11Z

Re-enable UsesTimersInParDo tests in Dataflow runner




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2987: Revise javadoc for sdk, state, options, annotations...

2017-05-08 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2987

Revise javadoc for sdk, state, options, annotations, window, (and misc)

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @tgroh perhaps, though any/all input is helpful! Feel free to hot potato.

CC: @davorbonaci this should certainly be cherry picked


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam javadoc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2987.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2987


commit 15350573bebd24c9f5f1ca94443486c7d1d65bba
Author: Kenneth Knowles 
Date:   2017-05-09T04:28:37Z

Revise javadoc for sdk, state, options, annotations, window, (and misc)




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2735

2017-05-08 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2933: [BEAM-2166] Use contextless encode/decode by defaul...

2017-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2933


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2166) Remove Coder.Context from the public API

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002007#comment-16002007
 ] 

ASF GitHub Bot commented on BEAM-2166:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2933


> Remove Coder.Context from the public API
> 
>
> Key: BEAM-2166
> URL: https://issues.apache.org/jira/browse/BEAM-2166
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py
>Affects Versions: 2.0.0
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
> Fix For: 2.0.0
>
>
> Justification: 
> * Contexts add confusion and complexity to the public API (e.g. 
> https://issues.apache.org/jira/browse/BEAM-1448)
> * Leaf (user-written) coders are nearly always nested.
> * Coders are being removed from sources, which was the initial need for 
> context.
> * It is unclear how much value, if any, this provides for internal 
> performance.
> * There may be performance (as well as simplification) gains in removing this 
> for the Fn API.
> Fully removing this distinction from the internals can be defered  until the 
> last bullet points are more completely investigated. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2980: PR to run tests for #2933 using updated Dataflow wo...

2017-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2980


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[11/13] beam git commit: Use latest dataflow worker.

2017-05-08 Thread lcwik
Use latest dataflow worker.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/43037a33
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/43037a33
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/43037a33

Branch: refs/heads/master
Commit: 43037a33b1ac95326c02c97866dfce63b38308ea
Parents: d9d02f2
Author: Robert Bradshaw <rober...@gmail.com>
Authored: Mon May 8 14:56:25 2017 -0700
Committer: Luke Cwik <lc...@google.com>
Committed: Mon May 8 20:19:45 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/43037a33/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index adbd4d7..9643b69 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170508
+
beam-master-20170508-pr2933
 
1
 
6
   



[02/13] beam git commit: Remove en/decodeOuter and default encode/decode methods.

2017-05-08 Thread lcwik
Remove en/decodeOuter and default encode/decode methods.

Now only the context-free encode() and decode() methods are abstract.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/44867300
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/44867300
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/44867300

Branch: refs/heads/master
Commit: 44867300cb36dcad1cdfba70dbe093ec50a14388
Parents: 45e09b2
Author: Robert Bradshaw 
Authored: Fri May 5 17:35:35 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:56 2017 -0700

--
 .../java/org/apache/beam/sdk/coders/Coder.java  | 35 ---
 .../org/apache/beam/sdk/coders/CustomCoder.java | 47 
 .../apache/beam/sdk/coders/StructuredCoder.java | 47 
 3 files changed, 8 insertions(+), 121 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/44867300/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
index d140e89..ec8a72d 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
@@ -127,18 +127,6 @@ public abstract class Coder implements Serializable {
 
   /**
* Encodes the given value of type {@code T} onto the given output stream
-   * in the outer context.
-   *
-   * @throws IOException if writing to the {@code OutputStream} fails
-   * for some reason
-   * @throws CoderException if the value could not be encoded for some reason
-   */
-  @Deprecated
-  public abstract void encodeOuter(T value, OutputStream outStream)
-  throws CoderException, IOException;
-
-  /**
-   * Encodes the given value of type {@code T} onto the given output stream
* in the given context.
*
* @throws IOException if writing to the {@code OutputStream} fails
@@ -146,8 +134,10 @@ public abstract class Coder implements Serializable {
* @throws CoderException if the value could not be encoded for some reason
*/
   @Deprecated
-  public abstract void encode(T value, OutputStream outStream, Context context)
-  throws CoderException, IOException;
+  public void encode(T value, OutputStream outStream, Context context)
+  throws CoderException, IOException {
+encode(value, outStream);
+  }
 
   /**
* Decodes a value of type {@code T} from the given input stream in
@@ -161,17 +151,6 @@ public abstract class Coder implements Serializable {
 
   /**
* Decodes a value of type {@code T} from the given input stream in
-   * the outer context.  Returns the decoded value.
-   *
-   * @throws IOException if reading from the {@code InputStream} fails
-   * for some reason
-   * @throws CoderException if the value could not be decoded for some reason
-   */
-  @Deprecated
-  public abstract T decodeOuter(InputStream inStream) throws CoderException, 
IOException;
-
-  /**
-   * Decodes a value of type {@code T} from the given input stream in
* the given context.  Returns the decoded value.
*
* @throws IOException if reading from the {@code InputStream} fails
@@ -179,8 +158,10 @@ public abstract class Coder implements Serializable {
* @throws CoderException if the value could not be decoded for some reason
*/
   @Deprecated
-  public abstract T decode(InputStream inStream, Context context)
-  throws CoderException, IOException;
+  public T decode(InputStream inStream, Context context)
+  throws CoderException, IOException {
+return decode(inStream);
+  }
 
   /**
* If this is a {@code Coder} for a parameterized type, returns the

http://git-wip-us.apache.org/repos/asf/beam/blob/44867300/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CustomCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CustomCoder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CustomCoder.java
index edbaa7f..c581923 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CustomCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CustomCoder.java
@@ -17,9 +17,6 @@
  */
 package org.apache.beam.sdk.coders;
 
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
 import java.io.Serializable;
 import java.util.Collections;
 import java.util.List;
@@ -39,50 +36,6 @@ import java.util.List;
 public abstract class CustomCoder extends Coder
 implements Serializable {
 
-  @Override
-  public void encode(T value, OutputStream 

[06/13] beam git commit: Remove contexts from coders where they'll never be used.

2017-05-08 Thread lcwik
Remove contexts from coders where they'll never be used.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/996dce37
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/996dce37
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/996dce37

Branch: refs/heads/master
Commit: 996dce37b76b103f104328b7caa65f73a1bcb15a
Parents: 27e9a06
Author: Robert Bradshaw 
Authored: Fri May 5 16:36:47 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:56 2017 -0700

--
 .../UnboundedReadFromBoundedSource.java |  4 +--
 .../core/ElementAndRestrictionCoder.java|  4 +--
 .../beam/runners/core/KeyedWorkItemCoder.java   |  4 +--
 .../beam/runners/core/TimerInternals.java   |  4 +--
 .../apache/beam/sdk/coders/DurationCoder.java   |  4 +--
 .../apache/beam/sdk/coders/InstantCoder.java|  4 +--
 .../sdk/transforms/ApproximateQuantiles.java| 20 +---
 .../org/apache/beam/sdk/transforms/Mean.java|  4 +--
 .../org/apache/beam/sdk/transforms/Top.java |  4 +--
 .../beam/sdk/transforms/join/CoGbkResult.java   | 10 ++
 .../transforms/windowing/IntervalWindow.java|  4 +--
 .../beam/sdk/values/TimestampedValue.java   |  4 +--
 .../beam/sdk/io/kinesis/KinesisRecordCoder.java | 34 +---
 13 files changed, 47 insertions(+), 57 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/996dce37/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
--
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
index ae28e3a..b74da80 100644
--- 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
@@ -224,7 +224,7 @@ public class UnboundedReadFromBoundedSource extends 
PTransform(
 elemsCoder.decode(inStream),
-sourceCoder.decode(inStream, context));
+sourceCoder.decode(inStream));
   }
 
   @Override

http://git-wip-us.apache.org/repos/asf/beam/blob/996dce37/runners/core-java/src/main/java/org/apache/beam/runners/core/ElementAndRestrictionCoder.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/ElementAndRestrictionCoder.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/ElementAndRestrictionCoder.java
index 5ddd865..fcb1deb 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/ElementAndRestrictionCoder.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/ElementAndRestrictionCoder.java
@@ -56,14 +56,14 @@ public class ElementAndRestrictionCoder
   throw new CoderException("cannot encode a null ElementAndRestriction");
 }
 elementCoder.encode(value.element(), outStream);
-restrictionCoder.encode(value.restriction(), outStream, context);
+restrictionCoder.encode(value.restriction(), outStream);
   }
 
   @Override
   public ElementAndRestriction decode(InputStream 
inStream, Context context)
   throws IOException {
 ElementT key = elementCoder.decode(inStream);
-RestrictionT value = restrictionCoder.decode(inStream, context);
+RestrictionT value = restrictionCoder.decode(inStream);
 return ElementAndRestriction.of(key, value);
   }
 

http://git-wip-us.apache.org/repos/asf/beam/blob/996dce37/runners/core-java/src/main/java/org/apache/beam/runners/core/KeyedWorkItemCoder.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/KeyedWorkItemCoder.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/KeyedWorkItemCoder.java
index 

[10/13] beam git commit: lint error

2017-05-08 Thread lcwik
lint error


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ebd86e18
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ebd86e18
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ebd86e18

Branch: refs/heads/master
Commit: ebd86e18a6bade77f72fca944d29b6c0308fd862
Parents: 4c00272
Author: Robert Bradshaw 
Authored: Mon May 8 11:27:40 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:57 2017 -0700

--
 .../apache/beam/runners/core/construction/PCollectionsTest.java | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ebd86e18/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
--
diff --git 
a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
 
b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
index a114cf5..c38dbc0 100644
--- 
a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
+++ 
b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
@@ -29,7 +29,10 @@ import java.io.OutputStream;
 import java.util.Collection;
 import java.util.Collections;
 import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.coders.*;
+import org.apache.beam.sdk.coders.AtomicCoder;
+import org.apache.beam.sdk.coders.BigEndianLongCoder;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.CustomCoder;
 import org.apache.beam.sdk.common.runner.v1.RunnerApi;
 import org.apache.beam.sdk.io.GenerateSequence;
 import org.apache.beam.sdk.testing.TestPipeline;



[07/13] beam git commit: fixup! Swap to use encode/decode in anonymous inner class coder and @AutoValue coder

2017-05-08 Thread lcwik
fixup! Swap to use encode/decode in anonymous inner class coder and @AutoValue 
coder


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2d379ddd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2d379ddd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2d379ddd

Branch: refs/heads/master
Commit: 2d379ddd5ef9b0b7a54ddc70203f37cd2763f387
Parents: 4486730
Author: Lukasz Cwik 
Authored: Sun May 7 19:41:07 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:57 2017 -0700

--
 .../core/construction/PCollectionsTest.java| 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2d379ddd/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
--
diff --git 
a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
 
b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
index 2c45cbd..a114cf5 100644
--- 
a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
+++ 
b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
@@ -29,10 +29,7 @@ import java.io.OutputStream;
 import java.util.Collection;
 import java.util.Collections;
 import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.coders.AtomicCoder;
-import org.apache.beam.sdk.coders.BigEndianLongCoder;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.CustomCoder;
+import org.apache.beam.sdk.coders.*;
 import org.apache.beam.sdk.common.runner.v1.RunnerApi;
 import org.apache.beam.sdk.io.GenerateSequence;
 import org.apache.beam.sdk.testing.TestPipeline;
@@ -130,13 +127,13 @@ public class PCollectionsTest {
   @AutoValue
   abstract static class CustomIntCoder extends CustomCoder {
 @Override
-public void encode(Integer value, OutputStream outStream, Context context) 
throws IOException {
-  VarInt.encode(value, outStream);
+public Integer decode(InputStream inStream) throws IOException {
+  return VarInt.decodeInt(inStream);
 }
 
 @Override
-public Integer decode(InputStream inStream, Context context) throws 
IOException {
-  return VarInt.decodeInt(inStream);
+public void encode(Integer value, OutputStream outStream) throws 
IOException {
+  VarInt.encode(value, outStream);
 }
   }
 
@@ -163,13 +160,13 @@ public class PCollectionsTest {
 @Override public void verifyDeterministic() {}
 
 @Override
-public void encode(BoundedWindow value, OutputStream outStream, 
Context context)
+public void encode(BoundedWindow value, OutputStream outStream)
 throws IOException {
   VarInt.encode(value.maxTimestamp().getMillis(), outStream);
 }
 
 @Override
-public BoundedWindow decode(InputStream inStream, Context context) 
throws IOException {
+public BoundedWindow decode(InputStream inStream) throws IOException {
   final Instant ts = new Instant(VarInt.decodeLong(inStream));
   return new BoundedWindow() {
 @Override



[12/13] beam git commit: Explicitly mark Coder context as experimental as well as deprecated.

2017-05-08 Thread lcwik
Explicitly mark Coder context as experimental as well as deprecated.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fda3a43b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fda3a43b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fda3a43b

Branch: refs/heads/master
Commit: fda3a43be3277d0dca888cfa30693599d11cd5af
Parents: 43037a3
Author: Robert Bradshaw 
Authored: Mon May 8 15:04:36 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:19:47 2017 -0700

--
 .../main/java/org/apache/beam/sdk/annotations/Experimental.java   | 3 +++
 .../java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java | 3 +++
 2 files changed, 6 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/fda3a43b/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
index 7255a01..2e3a711 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
@@ -84,6 +84,9 @@ public @interface Experimental {
 /** Metrics-related experimental APIs. */
 METRICS,
 
+/** Experimental feature related to alternative, unnested encodings for 
coders. */
+CODER_CONTEXT,
+
 /** Experimental runner APIs. Should not be used by pipeline authors. */
 CORE_RUNNERS_ONLY,
 

http://git-wip-us.apache.org/repos/asf/beam/blob/fda3a43b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
index ec8a72d..2ee532d 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
@@ -64,6 +64,7 @@ import org.apache.beam.sdk.values.TypeDescriptor;
 public abstract class Coder implements Serializable {
   /** The context in which encoding or decoding is being done. */
   @Deprecated
+  @Experimental(Kind.CODER_CONTEXT)
   public static class Context {
 /**
  * The outer context: the value being encoded or decoded takes
@@ -134,6 +135,7 @@ public abstract class Coder implements Serializable {
* @throws CoderException if the value could not be encoded for some reason
*/
   @Deprecated
+  @Experimental(Kind.CODER_CONTEXT)
   public void encode(T value, OutputStream outStream, Context context)
   throws CoderException, IOException {
 encode(value, outStream);
@@ -158,6 +160,7 @@ public abstract class Coder implements Serializable {
* @throws CoderException if the value could not be decoded for some reason
*/
   @Deprecated
+  @Experimental(Kind.CODER_CONTEXT)
   public T decode(InputStream inStream, Context context)
   throws CoderException, IOException {
 return decode(inStream);



[05/13] beam git commit: automated context removal or redirection

2017-05-08 Thread lcwik
automated context removal or redirection


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b7f3341e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b7f3341e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b7f3341e

Branch: refs/heads/master
Commit: b7f3341ed4fc073a834a55773bcb4a2c7f821c52
Parents: 996dce3
Author: Robert Bradshaw 
Authored: Fri May 5 17:24:02 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:56 2017 -0700

--
 .../apex/translation/utils/ApexStreamTuple.java | 11 
 .../UnboundedReadFromBoundedSource.java |  4 +--
 .../runners/core/construction/CodersTest.java   |  4 +--
 .../core/ElementAndRestrictionCoder.java|  4 +--
 .../beam/runners/core/KeyedWorkItemCoder.java   |  6 ++---
 .../beam/runners/core/TimerInternals.java   |  6 ++---
 .../direct/CloningBundleFactoryTest.java| 20 ++
 .../UnboundedReadEvaluatorFactoryTest.java  |  5 ++--
 .../streaming/SingletonKeyedWorkItemCoder.java  | 11 
 .../runners/dataflow/BatchViewOverrides.java| 11 
 .../runners/dataflow/internal/IsmFormat.java| 28 +---
 .../runners/dataflow/util/RandomAccessData.java | 11 
 .../org/apache/beam/sdk/coders/AvroCoder.java   |  4 +--
 .../apache/beam/sdk/coders/BigDecimalCoder.java | 11 
 .../beam/sdk/coders/BigEndianIntegerCoder.java  |  4 +--
 .../beam/sdk/coders/BigEndianLongCoder.java |  4 +--
 .../apache/beam/sdk/coders/BigIntegerCoder.java | 11 
 .../org/apache/beam/sdk/coders/BitSetCoder.java | 11 
 .../apache/beam/sdk/coders/ByteArrayCoder.java  | 11 
 .../org/apache/beam/sdk/coders/ByteCoder.java   |  4 +--
 .../apache/beam/sdk/coders/DelegateCoder.java   | 11 
 .../org/apache/beam/sdk/coders/DoubleCoder.java |  4 +--
 .../apache/beam/sdk/coders/DurationCoder.java   |  4 +--
 .../apache/beam/sdk/coders/InstantCoder.java|  4 +--
 .../beam/sdk/coders/IterableLikeCoder.java  |  6 ++---
 .../org/apache/beam/sdk/coders/KvCoder.java | 11 
 .../beam/sdk/coders/LengthPrefixCoder.java  |  4 +--
 .../org/apache/beam/sdk/coders/MapCoder.java| 11 
 .../apache/beam/sdk/coders/NullableCoder.java   | 11 
 .../beam/sdk/coders/SerializableCoder.java  |  4 +--
 .../beam/sdk/coders/StringDelegateCoder.java| 11 
 .../apache/beam/sdk/coders/StringUtf8Coder.java | 11 
 .../beam/sdk/coders/TextualIntegerCoder.java| 11 
 .../org/apache/beam/sdk/coders/VarIntCoder.java |  4 +--
 .../apache/beam/sdk/coders/VarLongCoder.java|  4 +--
 .../org/apache/beam/sdk/coders/VoidCoder.java   |  4 +--
 .../org/apache/beam/sdk/io/FileBasedSink.java   | 11 
 .../sdk/transforms/ApproximateQuantiles.java|  4 +--
 .../org/apache/beam/sdk/transforms/Combine.java | 22 +++
 .../apache/beam/sdk/transforms/CombineFns.java  | 13 +++--
 .../org/apache/beam/sdk/transforms/Count.java   |  4 +--
 .../org/apache/beam/sdk/transforms/Mean.java|  4 +--
 .../org/apache/beam/sdk/transforms/Top.java |  4 +--
 .../beam/sdk/transforms/join/CoGbkResult.java   |  6 ++---
 .../beam/sdk/transforms/join/UnionCoder.java| 11 
 .../sdk/transforms/windowing/GlobalWindow.java  |  4 +--
 .../transforms/windowing/IntervalWindow.java|  4 +--
 .../beam/sdk/transforms/windowing/PaneInfo.java |  4 +--
 .../org/apache/beam/sdk/util/BitSetCoder.java   | 11 
 .../org/apache/beam/sdk/util/WindowedValue.java | 24 +++--
 .../beam/sdk/values/TimestampedValue.java   |  5 ++--
 .../beam/sdk/values/ValueInSingleWindow.java| 13 +++--
 .../beam/sdk/values/ValueWithRecordId.java  | 11 
 .../beam/sdk/coders/CoderRegistryTest.java  |  8 +++---
 .../apache/beam/sdk/coders/CustomCoderTest.java |  4 +--
 .../beam/sdk/coders/NullableCoderTest.java  | 11 
 .../beam/sdk/coders/StructuredCoderTest.java| 12 -
 .../apache/beam/sdk/testing/PAssertTest.java|  4 +--
 .../sdk/testing/SerializableMatchersTest.java   |  4 +--
 .../beam/sdk/testing/WindowSupplierTest.java|  4 +--
 .../beam/sdk/transforms/CombineFnsTest.java | 11 
 .../apache/beam/sdk/transforms/CombineTest.java | 22 +++
 .../apache/beam/sdk/transforms/CreateTest.java  |  9 +++
 .../beam/sdk/transforms/GroupByKeyTest.java |  4 +--
 .../apache/beam/sdk/transforms/ParDoTest.java   | 15 +--
 .../apache/beam/sdk/transforms/ViewTest.java| 11 
 .../transforms/reflect/DoFnInvokersTest.java|  8 +++---
 .../apache/beam/sdk/util/CoderUtilsTest.java|  4 +--
 .../beam/sdk/util/SerializableUtilsTest.java|  4 +--
 .../extensions/protobuf/ByteStringCoder.java| 11 
 .../sdk/extensions/protobuf/ProtoCoder.java | 11 
 

[01/13] beam git commit: Remove explicit used of nested contexts.

2017-05-08 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 23731fe7a -> 3a09ed575


Remove explicit used of nested contexts.

find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*context.nested..[)]/\1)/'
find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*nestedContext[)]/\1)/'
find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*Context.NESTED[)]/\1)/'
find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  *[^ 
]*.Context.NESTED[)]/\1)/'

Added back explicit context in CoGbkResult.java due to compile error.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/27e9a060
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/27e9a060
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/27e9a060

Branch: refs/heads/master
Commit: 27e9a060ed593b2b53b88481591f14b1a274c61b
Parents: 23731fe
Author: Robert Bradshaw 
Authored: Fri May 5 16:20:37 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:54 2017 -0700

--
 .../UnboundedReadFromBoundedSource.java |  4 +--
 .../core/ElementAndRestrictionCoder.java|  4 +--
 .../beam/runners/core/KeyedWorkItemCoder.java   |  8 +++---
 .../beam/runners/core/TimerInternals.java   | 12 -
 .../translation/types/CoderTypeSerializer.java  |  4 +--
 .../streaming/SingletonKeyedWorkItemCoder.java  |  4 +--
 .../state/FlinkKeyGroupStateInternals.java  |  8 +++---
 .../runners/dataflow/BatchViewOverrides.java|  4 +--
 .../runners/dataflow/internal/IsmFormat.java| 24 -
 .../spark/aggregators/NamedAggregators.java |  4 +--
 .../apache/beam/sdk/coders/BigDecimalCoder.java |  4 +--
 .../beam/sdk/coders/IterableLikeCoder.java  |  8 +++---
 .../org/apache/beam/sdk/coders/KvCoder.java |  4 +--
 .../org/apache/beam/sdk/coders/MapCoder.java| 12 -
 .../org/apache/beam/sdk/io/FileBasedSink.java   |  4 +--
 .../sdk/transforms/ApproximateQuantiles.java| 20 +++---
 .../apache/beam/sdk/transforms/CombineFns.java  |  4 +--
 .../org/apache/beam/sdk/transforms/Mean.java|  4 +--
 .../beam/sdk/transforms/join/CoGbkResult.java   |  2 +-
 .../transforms/windowing/IntervalWindow.java|  4 +--
 .../org/apache/beam/sdk/util/WindowedValue.java | 10 +++
 .../beam/sdk/values/TimestampedValue.java   |  4 +--
 .../beam/sdk/values/ValueInSingleWindow.java| 12 -
 .../beam/sdk/values/ValueWithRecordId.java  |  4 +--
 .../beam/sdk/coders/SerializableCoderTest.java  | 28 ++--
 .../apache/beam/sdk/transforms/CombineTest.java |  4 +--
 .../apache/beam/sdk/transforms/CreateTest.java  |  4 +--
 .../transforms/windowing/GlobalWindowTest.java  |  2 +-
 ...BufferedElementCountingOutputStreamTest.java |  5 ++--
 .../BeamFnDataBufferingOutboundObserver.java|  2 +-
 .../harness/data/BeamFnDataInboundObserver.java |  2 +-
 ...BeamFnDataBufferingOutboundObserverTest.java |  2 +-
 .../data/BeamFnDataInboundObserverTest.java |  2 +-
 .../sdk/io/gcp/bigquery/ShardedKeyCoder.java| 11 
 .../io/gcp/bigquery/TableDestinationCoder.java  | 10 +++
 .../sdk/io/gcp/bigquery/TableRowInfoCoder.java  |  4 +--
 .../io/gcp/bigquery/WriteBundlesToFiles.java| 12 -
 .../PubsubMessageWithAttributesCoder.java   |  4 +--
 .../sdk/io/gcp/pubsub/PubsubUnboundedSink.java  | 16 +--
 .../io/gcp/pubsub/PubsubUnboundedSource.java|  2 +-
 .../apache/beam/sdk/io/xml/JAXBCoderTest.java   |  8 +++---
 41 files changed, 145 insertions(+), 145 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/27e9a060/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
--
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
index 1424b8b..ae28e3a 100644
--- 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
@@ -223,7 +223,7 @@ public class UnboundedReadFromBoundedSource extends 
PTransform

[13/13] beam git commit: [BEAM-2166] Use contextless encode/decode by default.

2017-05-08 Thread lcwik
[BEAM-2166] Use contextless encode/decode by default.

This closes #2933


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3a09ed57
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3a09ed57
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3a09ed57

Branch: refs/heads/master
Commit: 3a09ed5757c0e7a5f171350731403524c5d339fb
Parents: 23731fe fda3a43
Author: Luke Cwik 
Authored: Mon May 8 21:21:09 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 21:21:09 2017 -0700

--
 .../apex/translation/utils/ApexStreamTuple.java | 11 +
 .../UnboundedReadFromBoundedSource.java | 12 ++---
 .../runners/core/construction/CodersTest.java   |  4 +-
 .../core/construction/PCollectionsTest.java | 12 ++---
 .../core/ElementAndRestrictionCoder.java| 12 ++---
 .../beam/runners/core/KeyedWorkItemCoder.java   | 18 +++
 .../beam/runners/core/TimerInternals.java   | 22 -
 .../direct/CloningBundleFactoryTest.java| 20 +++-
 .../beam/runners/direct/DirectRunnerTest.java   |  4 +-
 .../UnboundedReadEvaluatorFactoryTest.java  |  5 +-
 .../translation/types/CoderTypeSerializer.java  |  4 +-
 .../streaming/SingletonKeyedWorkItemCoder.java  | 16 +-
 .../state/FlinkKeyGroupStateInternals.java  |  9 ++--
 runners/google-cloud-dataflow-java/pom.xml  |  2 +-
 .../runners/dataflow/BatchViewOverrides.java| 16 +++---
 .../runners/dataflow/internal/IsmFormat.java| 51 ++--
 .../runners/dataflow/util/RandomAccessData.java | 11 +
 .../runners/dataflow/util/CloudObjectsTest.java |  8 +--
 .../spark/aggregators/NamedAggregators.java |  4 +-
 .../beam/sdk/annotations/Experimental.java  |  3 ++
 .../org/apache/beam/sdk/coders/AvroCoder.java   |  4 +-
 .../apache/beam/sdk/coders/BigDecimalCoder.java | 15 +-
 .../beam/sdk/coders/BigEndianIntegerCoder.java  |  4 +-
 .../beam/sdk/coders/BigEndianLongCoder.java |  4 +-
 .../apache/beam/sdk/coders/BigIntegerCoder.java | 11 +
 .../org/apache/beam/sdk/coders/BitSetCoder.java | 11 +
 .../apache/beam/sdk/coders/ByteArrayCoder.java  | 11 +
 .../org/apache/beam/sdk/coders/ByteCoder.java   |  4 +-
 .../java/org/apache/beam/sdk/coders/Coder.java  | 38 +--
 .../org/apache/beam/sdk/coders/CustomCoder.java | 47 --
 .../apache/beam/sdk/coders/DelegateCoder.java   | 11 +
 .../org/apache/beam/sdk/coders/DoubleCoder.java |  4 +-
 .../apache/beam/sdk/coders/DurationCoder.java   |  8 +--
 .../apache/beam/sdk/coders/InstantCoder.java|  8 +--
 .../beam/sdk/coders/IterableLikeCoder.java  | 14 +++---
 .../org/apache/beam/sdk/coders/KvCoder.java | 15 +-
 .../beam/sdk/coders/LengthPrefixCoder.java  |  4 +-
 .../org/apache/beam/sdk/coders/MapCoder.java| 23 ++---
 .../apache/beam/sdk/coders/NullableCoder.java   | 11 +
 .../beam/sdk/coders/SerializableCoder.java  |  4 +-
 .../beam/sdk/coders/StringDelegateCoder.java| 11 +
 .../apache/beam/sdk/coders/StringUtf8Coder.java | 11 +
 .../apache/beam/sdk/coders/StructuredCoder.java | 47 --
 .../beam/sdk/coders/TextualIntegerCoder.java| 11 +
 .../org/apache/beam/sdk/coders/VarIntCoder.java |  4 +-
 .../apache/beam/sdk/coders/VarLongCoder.java|  4 +-
 .../org/apache/beam/sdk/coders/VoidCoder.java   |  4 +-
 .../org/apache/beam/sdk/io/FileBasedSink.java   | 15 +++---
 .../sdk/transforms/ApproximateQuantiles.java| 44 -
 .../org/apache/beam/sdk/transforms/Combine.java | 23 +
 .../apache/beam/sdk/transforms/CombineFns.java  | 17 +--
 .../org/apache/beam/sdk/transforms/Count.java   |  4 +-
 .../org/apache/beam/sdk/transforms/Mean.java| 12 ++---
 .../org/apache/beam/sdk/transforms/Top.java |  8 +--
 .../beam/sdk/transforms/join/CoGbkResult.java   | 18 +++
 .../beam/sdk/transforms/join/UnionCoder.java| 11 +
 .../sdk/transforms/windowing/GlobalWindow.java  |  4 +-
 .../transforms/windowing/IntervalWindow.java| 12 ++---
 .../beam/sdk/transforms/windowing/PaneInfo.java |  5 +-
 .../org/apache/beam/sdk/util/BitSetCoder.java   | 11 +
 .../org/apache/beam/sdk/util/WindowedValue.java | 37 ++
 .../beam/sdk/values/TimestampedValue.java   | 13 +++--
 .../beam/sdk/values/ValueInSingleWindow.java| 25 +++---
 .../beam/sdk/values/ValueWithRecordId.java  | 15 +-
 .../beam/sdk/coders/CoderRegistryTest.java  | 16 --
 .../apache/beam/sdk/coders/CustomCoderTest.java |  4 +-
 .../beam/sdk/coders/NullableCoderTest.java  | 11 +
 .../beam/sdk/coders/SerializableCoderTest.java  | 28 +--
 .../beam/sdk/coders/StructuredCoderTest.java| 12 ++---
 .../beam/sdk/testing/CoderPropertiesTest.java   | 36 +++---
 .../apache/beam/sdk/testing/PAssertTest.java|  4 +-
 

[08/13] beam git commit: Reviewer comments + a couple of extra fixes. All compiles.

2017-05-08 Thread lcwik
Reviewer comments + a couple of extra fixes. All compiles.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4c002724
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4c002724
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4c002724

Branch: refs/heads/master
Commit: 4c002724c1adffc6acb1a5b424f864d451e1061c
Parents: 2d379dd
Author: Robert Bradshaw 
Authored: Mon May 8 10:48:40 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:57 2017 -0700

--
 .../beam/runners/direct/DirectRunnerTest.java |  4 ++--
 .../beam/runners/dataflow/BatchViewOverrides.java | 16 ++--
 .../beam/runners/dataflow/internal/IsmFormat.java | 16 ++--
 .../runners/dataflow/util/CloudObjectsTest.java   |  8 
 .../org/apache/beam/sdk/io/FileBasedSink.java | 18 +++---
 .../apache/beam/sdk/coders/CoderRegistryTest.java |  2 ++
 .../sdk/io/gcp/pubsub/PubsubUnboundedSource.java  | 18 +++---
 7 files changed, 18 insertions(+), 64 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/4c002724/runners/direct-java/src/test/java/org/apache/beam/runners/direct/DirectRunnerTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/DirectRunnerTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/DirectRunnerTest.java
index 85e55eb..943d27c 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/DirectRunnerTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/DirectRunnerTest.java
@@ -526,11 +526,11 @@ public class DirectRunnerTest implements Serializable {
   private static class LongNoDecodeCoder extends AtomicCoder {
 @Override
 public void encode(
-Long value, OutputStream outStream, Context context) throws 
IOException {
+Long value, OutputStream outStream) throws IOException {
 }
 
 @Override
-public Long decode(InputStream inStream, Context context) throws 
IOException {
+public Long decode(InputStream inStream) throws IOException {
   throw new CoderException("Cannot decode a long");
 }
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/4c002724/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
index d640f6e..32a04c0 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
@@ -1353,27 +1353,15 @@ class BatchViewOverrides {
 @Override
 public void encode(TransformedMap value, OutputStream outStream)
 throws CoderException, IOException {
-  encode(value, outStream, Coder.Context.NESTED);
-}
-
-@Override
-public void encode(TransformedMap value, OutputStream outStream,
-Coder.Context context) throws CoderException, IOException {
   transformCoder.encode(value.transform, outStream);
-  originalMapCoder.encode(value.originalMap, outStream, context);
+  originalMapCoder.encode(value.originalMap, outStream);
 }
 
 @Override
 public TransformedMap decode(InputStream inStream) throws 
CoderException, IOException {
-  return decode(inStream, Coder.Context.NESTED);
-}
-
-@Override
-public TransformedMap decode(
-InputStream inStream, Coder.Context context) throws CoderException, 
IOException {
   return new TransformedMap<>(
   transformCoder.decode(inStream),
-  originalMapCoder.decode(inStream, context));
+  originalMapCoder.decode(inStream));
 }
 
 @Override

http://git-wip-us.apache.org/repos/asf/beam/blob/4c002724/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
index 8cfae81..0796d08 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
+++ 

[03/13] beam git commit: get it compiling

2017-05-08 Thread lcwik
get it compiling


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/45e09b2d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/45e09b2d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/45e09b2d

Branch: refs/heads/master
Commit: 45e09b2df7a97cfbbe0d8013c16cfdd7a55afbde
Parents: b7f3341
Author: Robert Bradshaw 
Authored: Fri May 5 17:27:13 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 20:17:56 2017 -0700

--
 .../runners/dataflow/BatchViewOverrides.java|  4 +--
 .../org/apache/beam/sdk/transforms/Combine.java |  3 +-
 .../beam/sdk/transforms/join/CoGbkResult.java   |  2 +-
 .../beam/sdk/transforms/windowing/PaneInfo.java |  1 -
 .../org/apache/beam/sdk/util/WindowedValue.java |  3 +-
 .../beam/sdk/coders/CoderRegistryTest.java  |  6 
 .../beam/sdk/testing/CoderPropertiesTest.java   | 36 ++--
 .../sdk/testing/SerializableMatchersTest.java   |  1 -
 .../apache/beam/sdk/transforms/CombineTest.java | 19 +++
 .../apache/beam/sdk/transforms/ParDoTest.java   | 17 ++---
 10 files changed, 37 insertions(+), 55 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/45e09b2d/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
index 34609df..d640f6e 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchViewOverrides.java
@@ -1351,9 +1351,9 @@ class BatchViewOverrides {
 }
 
 @Override
-public void encode(TransformedMap value, OutputStream 
outStream, OutputStream outStream)
+public void encode(TransformedMap value, OutputStream outStream)
 throws CoderException, IOException {
-  encode(outStream, outStream, Coder.Context.NESTED);
+  encode(value, outStream, Coder.Context.NESTED);
 }
 
 @Override

http://git-wip-us.apache.org/repos/asf/beam/blob/45e09b2d/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java
index 7e43564..9e1cc71 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Combine.java
@@ -2001,7 +2001,8 @@ public class Combine {
 }
 
 @Override
-public InputOrAccum decode(InputStream inStream) 
throws CoderException, IOException {
+public InputOrAccum decode(InputStream inStream)
+throws CoderException, IOException {
   return decode(inStream, Coder.Context.NESTED);
 }
 

http://git-wip-us.apache.org/repos/asf/beam/blob/45e09b2d/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/CoGbkResult.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/CoGbkResult.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/CoGbkResult.java
index d42de82..877bb07 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/CoGbkResult.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/join/CoGbkResult.java
@@ -264,7 +264,7 @@ public class CoGbkResult {
   }
   List valueMap = 
Lists.newArrayListWithExpectedSize(schema.size());
   for (int unionTag = 0; unionTag < schema.size(); unionTag++) {
-valueMap.add(tagListCoder(unionTag).decode(inStream, 
Coder.Context.NESTED));
+valueMap.add(tagListCoder(unionTag).decode(inStream));
   }
   return new CoGbkResult(schema, valueMap);
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/45e09b2d/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/PaneInfo.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/PaneInfo.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/PaneInfo.java
index 75df220..1e9a187 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/PaneInfo.java
+++ 

[04/13] beam git commit: automated context removal or redirection

2017-05-08 Thread lcwik
http://git-wip-us.apache.org/repos/asf/beam/blob/b7f3341e/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CustomCoderTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CustomCoderTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CustomCoderTest.java
index dfd4ea2..13a7261 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CustomCoderTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CustomCoderTest.java
@@ -48,13 +48,13 @@ public class CustomCoderTest {
 }
 
 @Override
-public void encode(KV kv, OutputStream out, Context context)
+public void encode(KV kv, OutputStream out)
 throws IOException {
   new DataOutputStream(out).writeLong(kv.getValue());
 }
 
 @Override
-public KV decode(InputStream inStream, Context context)
+public KV decode(InputStream inStream)
 throws IOException {
   return KV.of(key, new DataInputStream(inStream).readLong());
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/b7f3341e/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/NullableCoderTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/NullableCoderTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/NullableCoderTest.java
index d6d7de8..9fb0b82 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/NullableCoderTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/NullableCoderTest.java
@@ -167,6 +167,12 @@ public class NullableCoderTest {
 
   private static class EntireStreamExpectingCoder extends AtomicCoder {
 @Override
+public void encode(String value, OutputStream outStream)
+throws IOException {
+  encode(value, outStream, Context.NESTED);
+}
+
+@Override
 public void encode(
 String value, OutputStream outStream, Context context) throws 
IOException {
   checkArgument(context.isWholeStream, "Expected to get entire stream");
@@ -174,6 +180,11 @@ public class NullableCoderTest {
 }
 
 @Override
+public String decode(InputStream inStream) throws CoderException, 
IOException {
+  return decode(inStream, Context.NESTED);
+}
+
+@Override
 public String decode(InputStream inStream, Context context)
 throws CoderException, IOException {
   checkArgument(context.isWholeStream, "Expected to get entire stream");

http://git-wip-us.apache.org/repos/asf/beam/blob/b7f3341e/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java
index af2c94e..7aa2080 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java
@@ -47,7 +47,7 @@ public class StructuredCoderTest {
 private static final long serialVersionUID = 0L;
 
 @Override
-public void encode(@Nullable Boolean value, OutputStream outStream, 
Context context)
+public void encode(@Nullable Boolean value, OutputStream outStream)
 throws CoderException, IOException {
   if (value == null) {
 outStream.write(2);
@@ -61,7 +61,7 @@ public class StructuredCoderTest {
 @Override
 @Nullable
 public Boolean decode(
-InputStream inStream, org.apache.beam.sdk.coders.Coder.Context context)
+InputStream inStream)
 throws CoderException, IOException {
   int value = inStream.read();
   if (value == 0) {
@@ -110,7 +110,7 @@ public class StructuredCoderTest {
 
 @Override
 public void encode(
-@Nullable ObjectIdentityBoolean value, OutputStream outStream, Context 
context)
+@Nullable ObjectIdentityBoolean value, OutputStream outStream)
 throws CoderException, IOException {
   if (value == null) {
 outStream.write(2);
@@ -124,7 +124,7 @@ public class StructuredCoderTest {
 @Override
 @Nullable
 public ObjectIdentityBoolean decode(
-InputStream inStream, org.apache.beam.sdk.coders.Coder.Context context)
+InputStream inStream)
 throws CoderException, IOException {
   int value = inStream.read();
   if (value == 0) {
@@ -213,13 +213,13 @@ public class StructuredCoderTest {
   private static class Foo extends StructuredCoder {
 
 @Override
-public void encode(T value, OutputStream outStream, Coder.Context context)
+public void encode(T value, OutputStream outStream)
 throws CoderException, IOException {

[jira] [Closed] (BEAM-1723) FlinkRunner should deduplicate when an UnboundedSource requires Deduping

2017-05-08 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed BEAM-1723.
--
Resolution: Fixed

Closing for good, now that it's on the release-2.0.0 branch as well.

> FlinkRunner should deduplicate when an UnboundedSource requires Deduping
> 
>
> Key: BEAM-1723
> URL: https://issues.apache.org/jira/browse/BEAM-1723
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Thomas Groh
>Assignee: Aljoscha Krettek
> Fix For: 2.0.0
>
>
> UnboundedSource implementations can require deduping, and the FlinkRunner 
> currently logs a warning that this is not supported.
> https://github.com/apache/beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/UnboundedSourceWrapper.java#L139



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1723) FlinkRunner should deduplicate when an UnboundedSource requires Deduping

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16002004#comment-16002004
 ] 

ASF GitHub Bot commented on BEAM-1723:
--

Github user aljoscha closed the pull request at:

https://github.com/apache/beam/pull/2959


> FlinkRunner should deduplicate when an UnboundedSource requires Deduping
> 
>
> Key: BEAM-1723
> URL: https://issues.apache.org/jira/browse/BEAM-1723
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Thomas Groh
>Assignee: Aljoscha Krettek
> Fix For: 2.0.0
>
>
> UnboundedSource implementations can require deduping, and the FlinkRunner 
> currently logs a warning that this is not supported.
> https://github.com/apache/beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/UnboundedSourceWrapper.java#L139



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2959: [BEAM-1723] deduplication of UnboundedSource in Fli...

2017-05-08 Thread aljoscha
Github user aljoscha closed the pull request at:

https://github.com/apache/beam/pull/2959


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2221) Make KafkaIO coder specification less awkward

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001997#comment-16001997
 ] 

ASF GitHub Bot commented on BEAM-2221:
--

GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/2986

[BEAM-2221] KafkaIO API clean up.


- Removed withKeyCoder() and withValueCoder() methods. 
  - Their meaning changed
when KafkaIO added support for Deserializers. The coders can be explicitly
specified using withKeyDeserializerAndCoder(). This makes it explicit to
the user Deserializer is still required and JavaDoc explains why both
are required in addition to deserializers.
 
 - Removed 'readWithCoders()' and 'writeWithCoders()' methods. These were 
supposed to be utility methods to show how to use 
`CoderBasedKafkaDeserializer`. But they are too prominently placed and they 
were misused in multiple places. E.g. KafkaIOTest used VarInt coders even 
though the Kafka messages were plain big endian encoded ints and longs. They 
tests still passed!. 
   - These utility methods  are moved `CoderBasedKafkaSerializer` (this 
class implements both `Deserializer` and `Serializer` interface). JavaDoc 
includes usage examples.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/incubator-beam kafkaio_cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2986.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2986


commit 89a840a9291dc8de2462e4ce76293d84a24f06f0
Author: Raghu Angadi 
Date:   2017-05-09T00:14:37Z

KafkaIO API clean up.
Remove withKeyCoder() and withValueCoder() method. Their meaning changed
when KafkaIO added support for Deserializers. The coders can be explicitly
specified using withKeyDeserializerAndCoder(). This makes it explicit to
the user Deserializer is still required and JavaDoc explains why both
are required.




> Make KafkaIO coder specification less awkward
> -
>
> Key: BEAM-2221
> URL: https://issues.apache.org/jira/browse/BEAM-2221
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Eugene Kirpichov
>Assignee: Raghu Angadi
> Fix For: 2.0.0
>
>
> readWithCoders and writeWithCoders functions are awkward because they don't 
> emphasize enough that coders are a poor choice for interpreting wire format.
> The only reason to specify coders in KafkaIO is when coder inference from 
> Deserializer fails. To emphasize that, let's change the API to be 
> withKeyDeserializer(Deserializer) as the default choice and 
> withKeyDeserializerAndCoder(Deserializer,Coder) if inference fails; likewise 
> for value.
> Remove functions using coders to interpret wire format from the API. A common 
> case of that is Avro and Proto - for that, introduce special helper 
> functions, I guess like withAvro/ProtoKey/Value(...), which under the hood 
> may be allowed to reuse Avro/ProtoCoder as a utility, but do not expose this 
> fact.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2986: [BEAM-2221] KafkaIO API clean up.

2017-05-08 Thread rangadi
GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/2986

[BEAM-2221] KafkaIO API clean up.


- Removed withKeyCoder() and withValueCoder() methods. 
  - Their meaning changed
when KafkaIO added support for Deserializers. The coders can be explicitly
specified using withKeyDeserializerAndCoder(). This makes it explicit to
the user Deserializer is still required and JavaDoc explains why both
are required in addition to deserializers.
 
 - Removed 'readWithCoders()' and 'writeWithCoders()' methods. These were 
supposed to be utility methods to show how to use 
`CoderBasedKafkaDeserializer`. But they are too prominently placed and they 
were misused in multiple places. E.g. KafkaIOTest used VarInt coders even 
though the Kafka messages were plain big endian encoded ints and longs. They 
tests still passed!. 
   - These utility methods  are moved `CoderBasedKafkaSerializer` (this 
class implements both `Deserializer` and `Serializer` interface). JavaDoc 
includes usage examples.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/incubator-beam kafkaio_cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2986.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2986


commit 89a840a9291dc8de2462e4ce76293d84a24f06f0
Author: Raghu Angadi 
Date:   2017-05-09T00:14:37Z

KafkaIO API clean up.
Remove withKeyCoder() and withValueCoder() method. Their meaning changed
when KafkaIO added support for Deserializers. The coders can be explicitly
specified using withKeyDeserializerAndCoder(). This makes it explicit to
the user Deserializer is still required and JavaDoc explains why both
are required.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2734

2017-05-08 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2985: [BEAM-659] WindowFn#isCompatible should provide a m...

2017-05-08 Thread huafengw
GitHub user huafengw opened a pull request:

https://github.com/apache/beam/pull/2985

[BEAM-659] WindowFn#isCompatible should provide a meaningful reason

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huafengw/beam BEAM-659

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2985.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2985


commit 12addb1bea6d620050b96fdf4e30602c7a710183
Author: huafengw 
Date:   2017-05-09T03:21:44Z

[BEAM-659] WindowFn#isCompatible should provide a meaningful reason




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2211) DataflowRunner (Java) rejects all but GCS paths for FileBasedSource/Sink

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001968#comment-16001968
 ] 

ASF GitHub Bot commented on BEAM-2211:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2984

[BEAM-2211] Cherrypick #2968 to release-2.0.0

[BEAM-2211] DataflowRunner: remove validation of file read/write paths

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp-2968

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2984.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2984


commit 9e90e54809d0e11774e8e923e9f9f19651f1d28f
Author: Dan Halperin 
Date:   2017-05-08T22:37:29Z

[BEAM-2211] DataflowRunner: remove validation of file read/write paths

Now that users can implement and register custom FileSystems,
we can no longer really effectively validate filesystems they
can read or write files from. They can even register file://
to point to some HDFS path, e.g.,




> DataflowRunner (Java) rejects all but GCS paths for FileBasedSource/Sink
> 
>
> Key: BEAM-2211
> URL: https://issues.apache.org/jira/browse/BEAM-2211
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: 2.0.0
>
>
> {{FileBasedSource}} and {{Sink}} have switched in Beam to the {{FileSystems}} 
> API from the the {{IOChannelUtils}} API, which means they now support HDFS 
> and GCS and others.
> However, the {{DataflowRunner}} still uses {{GcsPathValidator}}, which means 
> it will likely currently disallow HDFS and other new {{FileSystem}} 
> implementations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2984: [BEAM-2211] Cherrypick #2968 to release-2.0.0

2017-05-08 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2984

[BEAM-2211] Cherrypick #2968 to release-2.0.0

[BEAM-2211] DataflowRunner: remove validation of file read/write paths

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp-2968

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2984.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2984


commit 9e90e54809d0e11774e8e923e9f9f19651f1d28f
Author: Dan Halperin 
Date:   2017-05-08T22:37:29Z

[BEAM-2211] DataflowRunner: remove validation of file read/write paths

Now that users can implement and register custom FileSystems,
we can no longer really effectively validate filesystems they
can read or write files from. They can even register file://
to point to some HDFS path, e.g.,




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2223) java8 examples are not running

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001965#comment-16001965
 ] 

ASF GitHub Bot commented on BEAM-2223:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2983

[BEAM-2223] PipelineOptionsFactory: improve debuggability

Include the name of the offending interface in the error message

R: @lukecwik 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam pipeline-options-debugging

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2983


commit 4ae133c0e05427016c4a6ab60d0b259bde6a1ad6
Author: Dan Halperin 
Date:   2017-05-09T00:53:46Z

PipelineOptionsFactory: improve debuggability

Include the name of the offending interface in the error message




> java8 examples are not running
> --
>
> Key: BEAM-2223
> URL: https://issues.apache.org/jira/browse/BEAM-2223
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Ahmet Altay
>Assignee: Daniel Halperin
> Fix For: 2.0.0
>
>
> Could not run java8 examples any more with:
> mvn compile exec:java 
> -Dexec.mainClass=org.apache.beam.examples.complete.game.UserScore 
> -Dexec.args="--project= --dataset= 
> --tempLocation="
> Fails with:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/hamcrest/Matcher
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2703)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2904)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2913)
>   at java.lang.Class.getMethods(Class.java:1617)
>   at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
>   at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:127)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:371)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:606)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:544)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.initializeRegistry(PipelineOptionsFactory.java:570)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.(PipelineOptionsFactory.java:519)
>   at 
> org.apache.beam.examples.complete.game.UserScore.main(UserScore.java:226)
>   ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 35 more
> cc: [~kenn][~tgroh][~vikasrk][~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2983: [BEAM-2223] PipelineOptionsFactory: improve debugga...

2017-05-08 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2983

[BEAM-2223] PipelineOptionsFactory: improve debuggability

Include the name of the offending interface in the error message

R: @lukecwik 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam pipeline-options-debugging

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2983


commit 4ae133c0e05427016c4a6ab60d0b259bde6a1ad6
Author: Dan Halperin 
Date:   2017-05-09T00:53:46Z

PipelineOptionsFactory: improve debuggability

Include the name of the offending interface in the error message




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2211) DataflowRunner (Java) rejects all but GCS paths for FileBasedSource/Sink

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001961#comment-16001961
 ] 

ASF GitHub Bot commented on BEAM-2211:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2968


> DataflowRunner (Java) rejects all but GCS paths for FileBasedSource/Sink
> 
>
> Key: BEAM-2211
> URL: https://issues.apache.org/jira/browse/BEAM-2211
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: 2.0.0
>
>
> {{FileBasedSource}} and {{Sink}} have switched in Beam to the {{FileSystems}} 
> API from the the {{IOChannelUtils}} API, which means they now support HDFS 
> and GCS and others.
> However, the {{DataflowRunner}} still uses {{GcsPathValidator}}, which means 
> it will likely currently disallow HDFS and other new {{FileSystem}} 
> implementations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2968: [BEAM-2211] DataflowRunner: remove validation of fi...

2017-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2968


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001960#comment-16001960
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/2982

[BEAM-1871] Remove google api BackOff usage from sdks/core

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

- cheryy-pick of https://github.com/apache/beam/pull/2937

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam backoff_2.0.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2982.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2982


commit 08f10b87dc99c082bbd6f0972ac0972f2a514f70
Author: Vikas Kedigehalli 
Date:   2017-05-06T02:24:51Z

Remove google api BackOff usage from sdks/core




> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Vikas Kedigehalli
> Fix For: 2.0.0
>
>
> Before first stable release we need to thin out {{sdk-java-core}} module. 
> Some candidates for removal, but not a non-exhaustive list:
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsoNCoder



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: [BEAM-2211] DataflowRunner: remove validation of file read/write paths

2017-05-08 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 88e044fda -> 23731fe7a


[BEAM-2211] DataflowRunner: remove validation of file read/write paths

Now that users can implement and register custom FileSystems,
we can no longer really effectively validate filesystems they
can read or write files from. They can even register file://
to point to some HDFS path, e.g.,


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/16f355f7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/16f355f7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/16f355f7

Branch: refs/heads/master
Commit: 16f355f7e481ebf029c0edb878742f6fea57b6cd
Parents: 88e044f
Author: Dan Halperin 
Authored: Mon May 8 15:37:29 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 8 20:13:35 2017 -0700

--
 .../beam/runners/dataflow/DataflowRunner.java   | 60 
 .../beam/runners/dataflow/ReadTranslator.java   | 12 
 .../DataflowPipelineTranslatorTest.java | 26 -
 .../runners/dataflow/DataflowRunnerTest.java| 56 --
 4 files changed, 154 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/16f355f7/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index 5278a4a..250c064 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -84,13 +84,10 @@ import 
org.apache.beam.sdk.coders.Coder.NonDeterministicException;
 import org.apache.beam.sdk.coders.KvCoder;
 import org.apache.beam.sdk.coders.VoidCoder;
 import org.apache.beam.sdk.io.BoundedSource;
-import org.apache.beam.sdk.io.FileBasedSink;
 import org.apache.beam.sdk.io.FileSystems;
 import org.apache.beam.sdk.io.Read;
 import org.apache.beam.sdk.io.UnboundedSource;
-import org.apache.beam.sdk.io.WriteFiles;
 import org.apache.beam.sdk.io.fs.PathValidator;
-import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
 import org.apache.beam.sdk.io.fs.ResourceId;
 import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
 import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessageWithAttributesCoder;
@@ -98,7 +95,6 @@ import org.apache.beam.sdk.io.gcp.pubsub.PubsubUnboundedSink;
 import org.apache.beam.sdk.io.gcp.pubsub.PubsubUnboundedSource;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.options.PipelineOptionsValidator;
-import org.apache.beam.sdk.options.ValueProvider;
 import org.apache.beam.sdk.options.ValueProvider.NestedValueProvider;
 import org.apache.beam.sdk.runners.AppliedPTransform;
 import org.apache.beam.sdk.runners.PTransformOverride;
@@ -343,10 +339,6 @@ public class DataflowRunner extends 
PipelineRunner {
   PTransformMatchers.stateOrTimerParDoSingle(),
   BatchStatefulParDoOverrides.singleOutputOverrideFactory()))
 
-  // WriteFiles uses views internally
-  .add(
-  PTransformOverride.of(
-  PTransformMatchers.classEqualTo(WriteFiles.class), new 
BatchWriteFactory(this)))
   .add(
   PTransformOverride.of(
   
PTransformMatchers.classEqualTo(Combine.GloballyAsSingletonView.class),
@@ -805,58 +797,6 @@ public class DataflowRunner extends 
PipelineRunner {
 ptransformViewsWithNonDeterministicKeyCoders.add(ptransform);
   }
 
-  private class BatchWriteFactory
-  implements PTransformOverrideFactory {
-private final DataflowRunner runner;
-private BatchWriteFactory(DataflowRunner dataflowRunner) {
-  this.runner = dataflowRunner;
-}
-
-@Override
-public PTransformReplacement 
getReplacementTransform(
-AppliedPTransform transform) {
-  return PTransformReplacement.of(
-  PTransformReplacements.getSingletonMainInput(transform),
-  new BatchWrite<>(runner, transform.getTransform()));
-}
-
-@Override
-public Map mapOutputs(
-Map outputs, PDone newOutput) {
-  return Collections.emptyMap();
-}
-  }
-
-  /**
-   * Specialized implementation which overrides
-   * {@link WriteFiles WriteFiles} to provide Google
-   * Cloud Dataflow specific path validation of {@link FileBasedSink}s.
-   */
-  private 

[2/2] beam git commit: This closes #2968

2017-05-08 Thread dhalperi
This closes #2968


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/23731fe7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/23731fe7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/23731fe7

Branch: refs/heads/master
Commit: 23731fe7ab12ea5d3ebb7b3f16c788b83d0caf0b
Parents: 88e044f 16f355f
Author: Dan Halperin 
Authored: Mon May 8 20:13:38 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 8 20:13:38 2017 -0700

--
 .../beam/runners/dataflow/DataflowRunner.java   | 60 
 .../beam/runners/dataflow/ReadTranslator.java   | 12 
 .../DataflowPipelineTranslatorTest.java | 26 -
 .../runners/dataflow/DataflowRunnerTest.java| 56 --
 4 files changed, 154 deletions(-)
--




[GitHub] beam pull request #2982: [BEAM-1871] Remove google api BackOff usage from sd...

2017-05-08 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/2982

[BEAM-1871] Remove google api BackOff usage from sdks/core

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

- cheryy-pick of https://github.com/apache/beam/pull/2937

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam backoff_2.0.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2982.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2982


commit 08f10b87dc99c082bbd6f0972ac0972f2a514f70
Author: Vikas Kedigehalli 
Date:   2017-05-06T02:24:51Z

Remove google api BackOff usage from sdks/core




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001933#comment-16001933
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2937


> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Vikas Kedigehalli
> Fix For: 2.0.0
>
>
> Before first stable release we need to thin out {{sdk-java-core}} module. 
> Some candidates for removal, but not a non-exhaustive list:
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsoNCoder



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[4/4] beam git commit: [BEAM-1871] Remove Google API client dependency from sdks/core

2017-05-08 Thread lcwik
[BEAM-1871] Remove Google API client dependency from sdks/core

This closes #2937


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/88e044fd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/88e044fd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/88e044fd

Branch: refs/heads/master
Commit: 88e044fda0a7d0ea1372f075a4c7a4bd54af47da
Parents: 6a2586a 367fcb2
Author: Luke Cwik 
Authored: Mon May 8 19:20:01 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 19:20:01 2017 -0700

--
 .../beam/examples/common/ExampleUtils.java  |  6 +-
 .../beam/examples/WindowedWordCountIT.java  |  2 +-
 runners/google-cloud-dataflow-java/pom.xml  |  2 +-
 .../runners/dataflow/DataflowPipelineJob.java   | 18 +++-
 .../beam/runners/dataflow/util/PackageUtil.java |  3 +-
 .../dataflow/DataflowPipelineJobTest.java   | 13 ++-
 .../runners/dataflow/util/PackageUtilTest.java  |  2 +-
 runners/spark/pom.xml   |  5 -
 .../beam/runners/spark/io/MicrobatchSource.java |  2 +-
 sdks/java/core/pom.xml  |  5 -
 .../beam/sdk/coders/StructuralByteArray.java|  4 +-
 .../sdk/io/BoundedReadFromUnboundedSource.java  |  2 +-
 .../beam/sdk/testing/FileChecksumMatcher.java   |  2 +-
 .../beam/sdk/testing/MatcherDeserializer.java   |  4 +-
 .../beam/sdk/testing/MatcherSerializer.java |  4 +-
 .../java/org/apache/beam/sdk/util/BackOff.java  | 81 
 .../org/apache/beam/sdk/util/BackOffUtils.java  | 57 
 .../org/apache/beam/sdk/util/CoderUtils.java| 10 +-
 .../beam/sdk/util/ExplicitShardedFile.java  |  3 -
 .../org/apache/beam/sdk/util/FluentBackoff.java |  1 -
 .../beam/sdk/util/NumberedShardedFile.java  |  3 -
 .../org/apache/beam/sdk/util/ShardedFile.java   |  2 -
 .../java/org/apache/beam/sdk/util/Sleeper.java  | 48 ++
 .../sdk/util/UploadIdResponseInterceptor.java   | 60 
 .../org/apache/beam/SdkCoreApiSurfaceTest.java  |  1 -
 .../beam/sdk/testing/ExpectedLogsTest.java  |  2 +-
 .../sdk/testing/FastNanoClockAndSleeper.java| 47 --
 .../testing/FastNanoClockAndSleeperTest.java| 47 --
 .../sdk/testing/FileChecksumMatcherTest.java|  5 -
 .../beam/sdk/testing/SystemNanoTimeSleeper.java |  2 +-
 .../apache/beam/sdk/util/FluentBackoffTest.java |  1 -
 .../beam/sdk/util/NumberedShardedFileTest.java  | 14 ++-
 .../util/UploadIdResponseInterceptorTest.java   | 98 
 .../sdk/extensions/gcp/options/GcpOptions.java  |  3 +-
 .../apache/beam/sdk/util/BackOffAdapter.java| 43 +
 .../java/org/apache/beam/sdk/util/GcsUtil.java  | 13 ++-
 .../sdk/util/UploadIdResponseInterceptor.java   | 60 
 .../beam/sdk/util/FastNanoClockAndSleeper.java  | 47 ++
 .../sdk/util/FastNanoClockAndSleeperTest.java   | 47 ++
 .../org/apache/beam/sdk/util/GcsUtilTest.java   | 23 +++--
 .../util/UploadIdResponseInterceptorTest.java   | 98 
 .../io/gcp/bigquery/BigQueryServicesImpl.java   | 81 +++-
 .../gcp/bigquery/BigQueryTableRowIterator.java  |  7 +-
 .../beam/sdk/io/gcp/datastore/DatastoreV1.java  |  6 +-
 .../gcp/bigquery/BigQueryServicesImplTest.java  | 24 +++--
 .../sdk/io/gcp/bigquery/FakeJobService.java | 12 ++-
 .../beam/sdk/io/gcp/datastore/V1TestUtil.java   |  6 +-
 .../sdk/io/gcp/testing/BigqueryMatcher.java |  4 +-
 .../sdk/io/gcp/testing/BigqueryMatcherTest.java |  7 +-
 49 files changed, 634 insertions(+), 403 deletions(-)
--




[1/4] beam git commit: Remove google api BackOff usage from sdks/core

2017-05-08 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 6a2586a4e -> 88e044fda


http://git-wip-us.apache.org/repos/asf/beam/blob/367fcb28/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
index d6464dd..16bb1b4 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
@@ -32,9 +32,6 @@ import static 
com.google.datastore.v1.client.DatastoreHelper.makeUpsert;
 import static com.google.datastore.v1.client.DatastoreHelper.makeValue;
 
 import com.google.api.client.http.HttpRequestInitializer;
-import com.google.api.client.util.BackOff;
-import com.google.api.client.util.BackOffUtils;
-import com.google.api.client.util.Sleeper;
 import com.google.auth.Credentials;
 import com.google.auth.http.HttpCredentialsAdapter;
 import com.google.auto.value.AutoValue;
@@ -89,8 +86,11 @@ import org.apache.beam.sdk.transforms.Values;
 import org.apache.beam.sdk.transforms.display.DisplayData;
 import org.apache.beam.sdk.transforms.display.DisplayData.Builder;
 import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.util.BackOff;
+import org.apache.beam.sdk.util.BackOffUtils;
 import org.apache.beam.sdk.util.FluentBackoff;
 import org.apache.beam.sdk.util.RetryHttpRequestInitializer;
+import org.apache.beam.sdk.util.Sleeper;
 import org.apache.beam.sdk.values.KV;
 import org.apache.beam.sdk.values.PBegin;
 import org.apache.beam.sdk.values.PCollection;

http://git-wip-us.apache.org/repos/asf/beam/blob/367fcb28/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImplTest.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImplTest.java
 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImplTest.java
index ef51650..b41490f 100644
--- 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImplTest.java
+++ 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImplTest.java
@@ -67,7 +67,8 @@ import 
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.DatasetServiceIm
 import org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.JobServiceImpl;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
 import org.apache.beam.sdk.testing.ExpectedLogs;
-import org.apache.beam.sdk.testing.FastNanoClockAndSleeper;
+import org.apache.beam.sdk.util.BackOffAdapter;
+import org.apache.beam.sdk.util.FastNanoClockAndSleeper;
 import org.apache.beam.sdk.util.FluentBackoff;
 import org.apache.beam.sdk.util.RetryHttpRequestInitializer;
 import org.apache.beam.sdk.util.Transport;
@@ -133,7 +134,8 @@ public class BigQueryServicesImplTest {
 
 Sleeper sleeper = new FastNanoClockAndSleeper();
 JobServiceImpl.startJob(
-testJob, new ApiErrorExtractor(), bigquery, sleeper, 
FluentBackoff.DEFAULT.backoff());
+testJob, new ApiErrorExtractor(), bigquery, sleeper,
+BackOffAdapter.toGcpBackOff(FluentBackoff.DEFAULT.backoff()));
 
 verify(response, times(1)).getStatusCode();
 verify(response, times(1)).getContent();
@@ -157,7 +159,8 @@ public class BigQueryServicesImplTest {
 
 Sleeper sleeper = new FastNanoClockAndSleeper();
 JobServiceImpl.startJob(
-testJob, new ApiErrorExtractor(), bigquery, sleeper, 
FluentBackoff.DEFAULT.backoff());
+testJob, new ApiErrorExtractor(), bigquery, sleeper,
+BackOffAdapter.toGcpBackOff(FluentBackoff.DEFAULT.backoff()));
 
 verify(response, times(1)).getStatusCode();
 verify(response, times(1)).getContent();
@@ -185,7 +188,8 @@ public class BigQueryServicesImplTest {
 
 Sleeper sleeper = new FastNanoClockAndSleeper();
 JobServiceImpl.startJob(
-testJob, new ApiErrorExtractor(), bigquery, sleeper, 
FluentBackoff.DEFAULT.backoff());
+testJob, new ApiErrorExtractor(), bigquery, sleeper,
+BackOffAdapter.toGcpBackOff(FluentBackoff.DEFAULT.backoff()));
 
 verify(response, times(2)).getStatusCode();
 verify(response, times(2)).getContent();
@@ -500,7 +504,8 @@ public class BigQueryServicesImplTest {
 
 DatasetServiceImpl dataService =
 new DatasetServiceImpl(bigquery, PipelineOptionsFactory.create());
-dataService.insertAll(ref, rows, null, TEST_BACKOFF.backoff(), new 
MockSleeper());
+

[3/4] beam git commit: Use guava Base64 encoding instead of google api client

2017-05-08 Thread lcwik
Use guava Base64 encoding instead of google api client


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/87578a6a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/87578a6a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/87578a6a

Branch: refs/heads/master
Commit: 87578a6a69f15bca502d9b4c15999aa383e539a5
Parents: 6a2586a
Author: Vikas Kedigehalli 
Authored: Fri May 5 19:33:45 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 8 19:19:29 2017 -0700

--
 .../org/apache/beam/sdk/coders/StructuralByteArray.java   |  4 ++--
 .../org/apache/beam/sdk/testing/MatcherDeserializer.java  |  4 ++--
 .../org/apache/beam/sdk/testing/MatcherSerializer.java|  4 ++--
 .../main/java/org/apache/beam/sdk/util/CoderUtils.java| 10 +++---
 4 files changed, 13 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/87578a6a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuralByteArray.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuralByteArray.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuralByteArray.java
index 226f79c..edde68e 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuralByteArray.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuralByteArray.java
@@ -17,8 +17,8 @@
  */
 package org.apache.beam.sdk.coders;
 
-import static com.google.api.client.util.Base64.encodeBase64String;
 
+import com.google.common.io.BaseEncoding;
 import java.util.Arrays;
 
 /**
@@ -53,6 +53,6 @@ public class StructuralByteArray {
 
   @Override
   public String toString() {
-return "base64:" + encodeBase64String(value);
+return "base64:" + BaseEncoding.base64().encode(value);
   }
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/87578a6a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherDeserializer.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherDeserializer.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherDeserializer.java
index 6ca07ba..e7aa5a7 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherDeserializer.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherDeserializer.java
@@ -22,7 +22,7 @@ import com.fasterxml.jackson.core.JsonProcessingException;
 import com.fasterxml.jackson.databind.DeserializationContext;
 import com.fasterxml.jackson.databind.JsonDeserializer;
 import com.fasterxml.jackson.databind.node.ObjectNode;
-import com.google.api.client.util.Base64;
+import com.google.common.io.BaseEncoding;
 import java.io.IOException;
 import org.apache.beam.sdk.util.SerializableUtils;
 
@@ -36,7 +36,7 @@ class MatcherDeserializer extends 
JsonDeserializer {
   throws IOException, JsonProcessingException {
 ObjectNode node = jsonParser.readValueAsTree();
 String matcher = node.get("matcher").asText();
-byte[] in = Base64.decodeBase64(matcher);
+byte[] in = BaseEncoding.base64().decode(matcher);
 return (SerializableMatcher) SerializableUtils
 .deserializeFromByteArray(in, "SerializableMatcher");
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/87578a6a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherSerializer.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherSerializer.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherSerializer.java
index 2b4584c..35375f6 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherSerializer.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/MatcherSerializer.java
@@ -21,7 +21,7 @@ import com.fasterxml.jackson.core.JsonGenerator;
 import com.fasterxml.jackson.core.JsonProcessingException;
 import com.fasterxml.jackson.databind.JsonSerializer;
 import com.fasterxml.jackson.databind.SerializerProvider;
-import com.google.api.client.util.Base64;
+import com.google.common.io.BaseEncoding;
 import java.io.IOException;
 import org.apache.beam.sdk.util.SerializableUtils;
 
@@ -33,7 +33,7 @@ class MatcherSerializer extends 
JsonSerializer {
   public void serialize(SerializableMatcher matcher, JsonGenerator 
jsonGenerator,
   SerializerProvider serializerProvider) throws IOException, 
JsonProcessingException {
 byte[] out = SerializableUtils.serializeToByteArray(matcher);
-String encodedString = Base64.encodeBase64String(out);
+String encodedString = 

[2/4] beam git commit: Remove google api BackOff usage from sdks/core

2017-05-08 Thread lcwik
ils;
 import org.apache.beam.sdk.util.FluentBackoff;
 import org.apache.beam.sdk.util.RetryHttpRequestInitializer;
+import org.apache.beam.sdk.util.Sleeper;
 import org.apache.beam.sdk.util.Transport;
 import org.joda.time.Duration;
 

http://git-wip-us.apache.org/repos/asf/beam/blob/367fcb28/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
--
diff --git 
a/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java 
b/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
index f7e35c0..93c4543 100644
--- 
a/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
+++ 
b/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
@@ -19,7 +19,6 @@ package org.apache.beam.examples;
 
 import static org.hamcrest.Matchers.equalTo;
 
-import com.google.api.client.util.Sleeper;
 import com.google.common.base.MoreObjects;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Lists;
@@ -47,6 +46,7 @@ import org.apache.beam.sdk.util.ExplicitShardedFile;
 import org.apache.beam.sdk.util.FluentBackoff;
 import org.apache.beam.sdk.util.NumberedShardedFile;
 import org.apache.beam.sdk.util.ShardedFile;
+import org.apache.beam.sdk.util.Sleeper;
 import org.hamcrest.Description;
 import org.hamcrest.TypeSafeMatcher;
 import org.joda.time.Duration;

http://git-wip-us.apache.org/repos/asf/beam/blob/367fcb28/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index b579041..adbd4d7 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170506
+
beam-master-20170508
 
1
 
6
   

http://git-wip-us.apache.org/repos/asf/beam/blob/367fcb28/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
index 23084ed..2d23983 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
@@ -46,6 +46,7 @@ import org.apache.beam.runners.dataflow.util.MonitoringUtil;
 import org.apache.beam.sdk.PipelineResult;
 import org.apache.beam.sdk.metrics.MetricResults;
 import org.apache.beam.sdk.runners.AppliedPTransform;
+import org.apache.beam.sdk.util.BackOffAdapter;
 import org.apache.beam.sdk.util.FluentBackoff;
 import org.joda.time.Duration;
 import org.slf4j.Logger;
@@ -285,9 +286,10 @@ public class DataflowPipelineJob implements PipelineResult 
{
 
 BackOff backoff;
 if (!duration.isLongerThan(Duration.ZERO)) {
-  backoff = MESSAGES_BACKOFF_FACTORY.backoff();
+  backoff = 
BackOffAdapter.toGcpBackOff(MESSAGES_BACKOFF_FACTORY.backoff());
 } else {
-  backoff = 
MESSAGES_BACKOFF_FACTORY.withMaxCumulativeBackoff(duration).backoff();
+  backoff = BackOffAdapter.toGcpBackOff(
+  
MESSAGES_BACKOFF_FACTORY.withMaxCumulativeBackoff(duration).backoff());
 }
 
 // This function tracks the cumulative time from the *first request* to 
enforce the wall-clock
@@ -299,7 +301,10 @@ public class DataflowPipelineJob implements PipelineResult 
{
 do {
   // Get the state of the job before listing messages. This ensures we 
always fetch job
   // messages after the job finishes to ensure we have all them.
-  state = 
getStateWithRetries(STATUS_BACKOFF_FACTORY.withMaxRetries(0).backoff(), 
sleeper);
+  state = getStateWithRetries(
+  BackOffAdapter.toGcpBackOff(
+  STATUS_BACKOFF_FACTORY.withMaxRetries(0).backoff()),
+  sleeper);
   boolean hasError = state == State.UNKNOWN;
 
   if (messageHandler != null && !hasError) {
@@ -354,7 +359,8 @@ public class DataflowPipelineJob implements PipelineResult {
   Duration consumed = Duration.millis((nanosConsumed + 99) / 
100);
   Duration remaining = duration.minus(consumed);
   if (remaining.isLongerThan(Duration.ZERO)) {
-backoff = 
MESSAGES_BACKOFF_FACTORY.withMaxCumulativeBackoff(remaining).backoff();
+backoff = BackOffAdapter.toGcpBackOff(
+
MESSAGES_BACKOFF_FACTORY.withMaxCumulativeBackoff(remaining).backoff());
   } else {
 // If there is no time remaining, don't bother backing off.
 backoff = BackOff.STOP_BACKO

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #1994

2017-05-08 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2732

2017-05-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2154) Writing to large numbers of BigQuery tables causes out-of-memory

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001895#comment-16001895
 ] 

ASF GitHub Bot commented on BEAM-2154:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2981

Cherrypick #2883 to release-2.0.0

Cherrypick #2883 [BEAM-2154] Make BigQuery's dynamic-destination support 
scale to large numbers of destinations

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2883

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2981.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2981


commit 34dc0938b2d53ff389b24ceed51da89bffa7c1ae
Author: Eugene Kirpichov 
Date:   2017-05-09T01:02:14Z

This closes #2883




> Writing to large numbers of BigQuery tables causes out-of-memory 
> -
>
> Key: BEAM-2154
> URL: https://issues.apache.org/jira/browse/BEAM-2154
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
> Fix For: 2.0.0
>
>
> Since all TableRowWriters are created in a single DoFn, the write buffers all 
> exist simultaneously and use up large amounts of memory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2981: Cherrypick #2883 to release-2.0.0

2017-05-08 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2981

Cherrypick #2883 to release-2.0.0

Cherrypick #2883 [BEAM-2154] Make BigQuery's dynamic-destination support 
scale to large numbers of destinations

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2883

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2981.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2981


commit 34dc0938b2d53ff389b24ceed51da89bffa7c1ae
Author: Eugene Kirpichov 
Date:   2017-05-09T01:02:14Z

This closes #2883




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2154) Writing to large numbers of BigQuery tables causes out-of-memory

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001893#comment-16001893
 ] 

ASF GitHub Bot commented on BEAM-2154:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2883


> Writing to large numbers of BigQuery tables causes out-of-memory 
> -
>
> Key: BEAM-2154
> URL: https://issues.apache.org/jira/browse/BEAM-2154
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
> Fix For: 2.0.0
>
>
> Since all TableRowWriters are created in a single DoFn, the write buffers all 
> exist simultaneously and use up large amounts of memory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (BEAM-2154) Writing to large numbers of BigQuery tables causes out-of-memory

2017-05-08 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-2154.
--
Resolution: Fixed

> Writing to large numbers of BigQuery tables causes out-of-memory 
> -
>
> Key: BEAM-2154
> URL: https://issues.apache.org/jira/browse/BEAM-2154
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
> Fix For: 2.0.0
>
>
> Since all TableRowWriters are created in a single DoFn, the write buffers all 
> exist simultaneously and use up large amounts of memory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #2883

2017-05-08 Thread jkff
This closes #2883


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6a2586a4
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6a2586a4
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6a2586a4

Branch: refs/heads/master
Commit: 6a2586a4ee279f66fd4eae816e93fa8e397acf6b
Parents: f315cc9 fd1dda0
Author: Eugene Kirpichov 
Authored: Mon May 8 18:02:14 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 8 18:02:14 2017 -0700

--
 .../beam/sdk/io/gcp/bigquery/BatchLoads.java| 122 ---
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java|  35 +++---
 .../sdk/io/gcp/bigquery/ShardedKeyCoder.java|   3 +-
 .../sdk/io/gcp/bigquery/TableRowWriter.java |  59 +
 .../io/gcp/bigquery/WriteBundlesToFiles.java| 111 ++---
 .../bigquery/WriteGroupedRecordsToFiles.java|  68 +++
 .../sdk/io/gcp/bigquery/WritePartition.java |  38 +++---
 .../beam/sdk/io/gcp/bigquery/WriteRename.java   |   9 +-
 .../beam/sdk/io/gcp/bigquery/WriteResult.java   |   2 +-
 .../beam/sdk/io/gcp/bigquery/WriteTables.java   |   7 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java |  81 
 .../sdk/io/gcp/bigquery/FakeJobService.java |   6 +-
 12 files changed, 396 insertions(+), 145 deletions(-)
--




[GitHub] beam pull request #2883: [BEAM-2154] Make BigQuery's dynamic-destination sup...

2017-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2883


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-2154] More scalable dynamic BigQueryIO.Write

2017-05-08 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master f315cc9ff -> 6a2586a4e


[BEAM-2154] More scalable dynamic BigQueryIO.Write

If too many tables are generated in a bundle, spill and group the results
before writing files. Generating hundreds or thousands of file write buffers
in a single bundle was causing workers to crash with out of memory.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fd1dda09
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fd1dda09
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fd1dda09

Branch: refs/heads/master
Commit: fd1dda09fc4db48cfbc21afd6310320533fffe78
Parents: f315cc9
Author: Reuven Lax 
Authored: Sat Apr 29 07:33:54 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 8 18:01:44 2017 -0700

--
 .../beam/sdk/io/gcp/bigquery/BatchLoads.java| 122 ---
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java|  35 +++---
 .../sdk/io/gcp/bigquery/ShardedKeyCoder.java|   3 +-
 .../sdk/io/gcp/bigquery/TableRowWriter.java |  59 +
 .../io/gcp/bigquery/WriteBundlesToFiles.java| 111 ++---
 .../bigquery/WriteGroupedRecordsToFiles.java|  68 +++
 .../sdk/io/gcp/bigquery/WritePartition.java |  38 +++---
 .../beam/sdk/io/gcp/bigquery/WriteRename.java   |   9 +-
 .../beam/sdk/io/gcp/bigquery/WriteResult.java   |   2 +-
 .../beam/sdk/io/gcp/bigquery/WriteTables.java   |   7 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java |  81 
 .../sdk/io/gcp/bigquery/FakeJobService.java |   6 +-
 12 files changed, 396 insertions(+), 145 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/fd1dda09/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java
index ba64ab1..0abd469 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java
@@ -22,6 +22,7 @@ import static 
com.google.common.base.Preconditions.checkArgument;
 import static 
org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.resolveTempLocation;
 
 import com.google.api.services.bigquery.model.TableRow;
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Strings;
 import com.google.common.collect.Lists;
 import java.util.List;
@@ -32,11 +33,15 @@ import org.apache.beam.sdk.coders.KvCoder;
 import org.apache.beam.sdk.coders.ListCoder;
 import org.apache.beam.sdk.coders.NullableCoder;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.coders.VoidCoder;
 import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
 import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
+import org.apache.beam.sdk.io.gcp.bigquery.WriteBundlesToFiles.Result;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.Flatten;
+import org.apache.beam.sdk.transforms.GroupByKey;
 import org.apache.beam.sdk.transforms.MapElements;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.ParDo;
@@ -49,6 +54,7 @@ import org.apache.beam.sdk.transforms.windowing.Window;
 import org.apache.beam.sdk.util.gcsfs.GcsPath;
 import org.apache.beam.sdk.values.KV;
 import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
 import org.apache.beam.sdk.values.PCollectionTuple;
 import org.apache.beam.sdk.values.PCollectionView;
 import org.apache.beam.sdk.values.TupleTag;
@@ -57,6 +63,36 @@ import org.apache.beam.sdk.values.TupleTagList;
 /** PTransform that uses BigQuery batch-load jobs to write a PCollection to 
BigQuery. */
 class BatchLoads
 extends PTransform>, WriteResult> {
+  // The maximum number of file writers to keep open in a single bundle at a 
time, since file
+  // writers default to 64mb buffers. This comes into play when writing 
dynamic table destinations.
+  // The first 20 tables from a single BatchLoads transform will write files 
inline in the
+  // transform. Anything beyond that might be shuffled.  Users using this 
transform directly who
+  // know that they are running on workers with sufficient memory can increase 
this by calling
+  // 

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #1993

2017-05-08 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2980: PR to run tests for #2933 using updated Dataflow wo...

2017-05-08 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2980

PR to run tests for #2933 using updated Dataflow worker image

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam finish-pr-2933

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2980.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2980


commit 71ea42f5d0bdc1335bf5b9e0cb0b7fcf7db439c2
Author: Robert Bradshaw 
Date:   2017-05-05T23:20:37Z

Remove explicit used of nested contexts.

find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*context.nested..[)]/\1)/'
find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*nestedContext[)]/\1)/'
find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*Context.NESTED[)]/\1)/'
find . -type f -name '*.java' | xargs sed -i '' 's/\([.]..code[(].*\),  
*[^ ]*.Context.NESTED[)]/\1)/'

Added back explicit context in CoGbkResult.java due to compile error.

commit 1f930b7888a923e3d8b7fbb42d26484f6a711123
Author: Robert Bradshaw 
Date:   2017-05-05T23:36:47Z

Remove contexts from coders where they'll never be used.

commit 98ec267b5a0e95eeb2c27b4969ffec03e0e360a8
Author: Robert Bradshaw 
Date:   2017-05-06T00:24:02Z

automated context removal or redirection

commit fb50c3ff26db4ecac8abaebaffc407651b294ac9
Author: Robert Bradshaw 
Date:   2017-05-06T00:27:13Z

get it compiling

commit 44ee9b87b603a1905e136cf7356453e3d26ed5ca
Author: Robert Bradshaw 
Date:   2017-05-06T00:35:35Z

Remove en/decodeOuter and default encode/decode methods.

Now only the context-free encode() and decode() methods are abstract.

commit 7b3f2914a4195052d7e9d68e4e6c82e4ce0d8dc2
Author: Lukasz Cwik 
Date:   2017-05-08T02:41:07Z

fixup! Swap to use encode/decode in anonymous inner class coder and 
@AutoValue coder

commit 0c502352cf5d2f52f454de30dbeac2f483dbdd2d
Author: Robert Bradshaw 
Date:   2017-05-08T17:48:40Z

Reviewer comments + a couple of extra fixes. All compiles.

commit 0acdef7dd0417e613284f62a824c2e11b51ce999
Author: Robert Bradshaw 
Date:   2017-05-08T18:27:40Z

lint error

commit 52d594ec7650899f3aa3064a5c698a3378052a65
Author: Robert Bradshaw 
Date:   2017-05-08T18:54:48Z

checkstyle

commit 6b81523e392af31829ae12f28a1ae0b1aba8d550
Author: Robert Bradshaw 
Date:   2017-05-08T21:56:25Z

Use latest dataflow worker.

commit ac626f04cd8ce5aa0862723cd208b5e7d017aaa8
Author: Robert Bradshaw 
Date:   2017-05-08T22:04:36Z

Explicitly mark Coder context as experimental as well as deprecated.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2731

2017-05-08 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #3078

2017-05-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2225) Add test cases for large keys to GroupByKeyTest

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001864#comment-16001864
 ] 

ASF GitHub Bot commented on BEAM-2225:
--

GitHub user dpmills opened a pull request:

https://github.com/apache/beam/pull/2979

[BEAM-2225] Adds large key tests to GroupByKeyTest

R: @kennknowles 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dpmills/incubator-beam largekeys

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2979.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2979


commit 42c708cef3744d61a05bff72f6f7b31c2e3d7167
Author: Daniel Mills 
Date:   2017-05-08T23:45:44Z

Adds large key tests to GroupByKeyTest




> Add test cases for large keys to GroupByKeyTest
> ---
>
> Key: BEAM-2225
> URL: https://issues.apache.org/jira/browse/BEAM-2225
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Mills
>Assignee: Daniel Mills
>Priority: Trivial
>
> We should have @ValidatesRunner test cases for large keys (perhaps in a range 
> from 10KB to 10MB)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2979: [BEAM-2225] Adds large key tests to GroupByKeyTest

2017-05-08 Thread dpmills
GitHub user dpmills opened a pull request:

https://github.com/apache/beam/pull/2979

[BEAM-2225] Adds large key tests to GroupByKeyTest

R: @kennknowles 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dpmills/incubator-beam largekeys

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2979.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2979


commit 42c708cef3744d61a05bff72f6f7b31c2e3d7167
Author: Daniel Mills 
Date:   2017-05-08T23:45:44Z

Adds large key tests to GroupByKeyTest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2223) java8 examples are not running

2017-05-08 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001817#comment-16001817
 ] 

Ahmet Altay commented on BEAM-2223:
---

My branch was based on: 
https://github.com/apache/beam/commit/9fffa7efac634eebecf802626c6d7c88e3d60be5

> java8 examples are not running
> --
>
> Key: BEAM-2223
> URL: https://issues.apache.org/jira/browse/BEAM-2223
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
> Fix For: 2.0.0
>
>
> Could not run java8 examples any more with:
> mvn compile exec:java 
> -Dexec.mainClass=org.apache.beam.examples.complete.game.UserScore 
> -Dexec.args="--project= --dataset= 
> --tempLocation="
> Fails with:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/hamcrest/Matcher
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2703)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2904)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2913)
>   at java.lang.Class.getMethods(Class.java:1617)
>   at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
>   at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:127)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:371)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:606)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:544)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.initializeRegistry(PipelineOptionsFactory.java:570)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.(PipelineOptionsFactory.java:519)
>   at 
> org.apache.beam.examples.complete.game.UserScore.main(UserScore.java:226)
>   ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 35 more
> cc: [~kenn][~tgroh][~vikasrk][~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-2223) java8 examples are not running

2017-05-08 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-2223:
-

Assignee: Ahmet Altay  (was: Daniel Halperin)

> java8 examples are not running
> --
>
> Key: BEAM-2223
> URL: https://issues.apache.org/jira/browse/BEAM-2223
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
> Fix For: 2.0.0
>
>
> Could not run java8 examples any more with:
> mvn compile exec:java 
> -Dexec.mainClass=org.apache.beam.examples.complete.game.UserScore 
> -Dexec.args="--project= --dataset= 
> --tempLocation="
> Fails with:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/hamcrest/Matcher
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2703)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2904)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2913)
>   at java.lang.Class.getMethods(Class.java:1617)
>   at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
>   at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:127)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:371)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:606)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:544)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.initializeRegistry(PipelineOptionsFactory.java:570)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.(PipelineOptionsFactory.java:519)
>   at 
> org.apache.beam.examples.complete.game.UserScore.main(UserScore.java:226)
>   ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 35 more
> cc: [~kenn][~tgroh][~vikasrk][~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2224) maptask_executor_runner_test fails in windows

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001804#comment-16001804
 ] 

ASF GitHub Bot commented on BEAM-2224:
--

GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2978

[BEAM-2224] Cherry-pick #2965 onto release-2.0.0 

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam cp-2965

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2978


commit f56b06aae992c7bef7b956618f5ac085c997df4f
Author: Robert Bradshaw 
Date:   2017-05-08T21:54:17Z

[BEAM-2224] Fix temp file management on Windows.




> maptask_executor_runner_test fails in windows
> -
>
> Key: BEAM-2224
> URL: https://issues.apache.org/jira/browse/BEAM-2224
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
> Fix For: 2.0.0
>
>
> Fails with IOError: [Errno 13] Permission denied: 
> 'c:\\windows\\temp\\tmpdvmtve'.
> Likely due to https://bugs.python.org/issue14243; we'll have to work around.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2223) java8 examples are not running

2017-05-08 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001807#comment-16001807
 ] 

Daniel Halperin commented on BEAM-2223:
---

Can you give a commit ID that worked on Friday?

> java8 examples are not running
> --
>
> Key: BEAM-2223
> URL: https://issues.apache.org/jira/browse/BEAM-2223
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Ahmet Altay
>Assignee: Daniel Halperin
> Fix For: 2.0.0
>
>
> Could not run java8 examples any more with:
> mvn compile exec:java 
> -Dexec.mainClass=org.apache.beam.examples.complete.game.UserScore 
> -Dexec.args="--project= --dataset= 
> --tempLocation="
> Fails with:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/hamcrest/Matcher
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2703)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2904)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2913)
>   at java.lang.Class.getMethods(Class.java:1617)
>   at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
>   at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:127)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:371)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:606)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:544)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.initializeRegistry(PipelineOptionsFactory.java:570)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.(PipelineOptionsFactory.java:519)
>   at 
> org.apache.beam.examples.complete.game.UserScore.main(UserScore.java:226)
>   ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 35 more
> cc: [~kenn][~tgroh][~vikasrk][~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2978: [BEAM-2224] Cherry-pick #2965 onto release-2.0.0

2017-05-08 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2978

[BEAM-2224] Cherry-pick #2965 onto release-2.0.0 

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam cp-2965

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2978


commit f56b06aae992c7bef7b956618f5ac085c997df4f
Author: Robert Bradshaw 
Date:   2017-05-08T21:54:17Z

[BEAM-2224] Fix temp file management on Windows.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2224) maptask_executor_runner_test fails in windows

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001798#comment-16001798
 ] 

ASF GitHub Bot commented on BEAM-2224:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2965


> maptask_executor_runner_test fails in windows
> -
>
> Key: BEAM-2224
> URL: https://issues.apache.org/jira/browse/BEAM-2224
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
> Fix For: 2.0.0
>
>
> Fails with IOError: [Errno 13] Permission denied: 
> 'c:\\windows\\temp\\tmpdvmtve'.
> Likely due to https://bugs.python.org/issue14243; we'll have to work around.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2965: [BEAM-2224] Fix temp file management on Windows.

2017-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2965


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-2224] Fix temp file management on Windows.

2017-05-08 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master 152c5bcc7 -> f315cc9ff


[BEAM-2224] Fix temp file management on Windows.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/78c29238
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/78c29238
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/78c29238

Branch: refs/heads/master
Commit: 78c29238aa471138fb55681d8700b095a3606e0d
Parents: 152c5bc
Author: Robert Bradshaw 
Authored: Mon May 8 14:54:17 2017 -0700
Committer: Robert Bradshaw 
Committed: Mon May 8 16:49:04 2017 -0700

--
 .../runners/portability/maptask_executor_runner_test.py   | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/78c29238/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py 
b/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
index b52c73c..aebd2e1 100644
--- 
a/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
+++ 
b/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
@@ -16,6 +16,7 @@
 #
 
 import logging
+import os
 import tempfile
 import unittest
 
@@ -181,12 +182,17 @@ class MapTaskExecutorRunnerTest(unittest.TestCase):
   assert_that(res, equal_to([('a', 1.5), ('b', 3.0)]))
 
   def test_read(self):
-with tempfile.NamedTemporaryFile() as temp_file:
+# Can't use NamedTemporaryFile as a context
+# due to https://bugs.python.org/issue14243
+temp_file = tempfile.NamedTemporaryFile(delete=False)
+try:
   temp_file.write('a\nb\nc')
-  temp_file.flush()
+  temp_file.close()
   with self.create_pipeline() as p:
 assert_that(p | beam.io.ReadFromText(temp_file.name),
 equal_to(['a', 'b', 'c']))
+finally:
+  os.unlink(temp_file.name)
 
   def test_windowing(self):
 with self.create_pipeline() as p:



[2/2] beam git commit: Closes #2965

2017-05-08 Thread robertwb
Closes #2965


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f315cc9f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f315cc9f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f315cc9f

Branch: refs/heads/master
Commit: f315cc9ff649ccee9c9aff18f4af30cfbd58e5e5
Parents: 152c5bc 78c2923
Author: Robert Bradshaw 
Authored: Mon May 8 16:49:05 2017 -0700
Committer: Robert Bradshaw 
Committed: Mon May 8 16:49:05 2017 -0700

--
 .../runners/portability/maptask_executor_runner_test.py   | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)
--




[1/2] beam git commit: This closes #2959

2017-05-08 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 5493c6c8f -> 336b3dc29


This closes #2959


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/336b3dc2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/336b3dc2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/336b3dc2

Branch: refs/heads/release-2.0.0
Commit: 336b3dc29b5de0f8affc71fbe9886a3e0ee2a0fe
Parents: 5493c6c 392ed60
Author: Thomas Groh 
Authored: Mon May 8 16:41:25 2017 -0700
Committer: Thomas Groh 
Committed: Mon May 8 16:41:25 2017 -0700

--
 .../FlinkStreamingTransformTranslators.java |  54 +-
 .../wrappers/streaming/io/DedupingOperator.java | 187 +++
 .../streaming/io/UnboundedSourceWrapper.java|  15 +-
 .../flink/streaming/DedupingOperatorTest.java   | 131 +
 .../streaming/UnboundedSourceWrapperTest.java   |  29 +--
 5 files changed, 393 insertions(+), 23 deletions(-)
--




[2/2] beam git commit: [BEAM-1723] deduplication of UnboundedSource in Flink runner

2017-05-08 Thread tgroh
[BEAM-1723] deduplication of UnboundedSource in Flink runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/392ed601
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/392ed601
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/392ed601

Branch: refs/heads/release-2.0.0
Commit: 392ed601392dbf5ace32577c3a4dee13488cedc4
Parents: 5493c6c
Author: JingsongLi 
Authored: Wed Apr 19 19:42:59 2017 +0800
Committer: Thomas Groh 
Committed: Mon May 8 16:41:25 2017 -0700

--
 .../FlinkStreamingTransformTranslators.java |  54 +-
 .../wrappers/streaming/io/DedupingOperator.java | 187 +++
 .../streaming/io/UnboundedSourceWrapper.java|  15 +-
 .../flink/streaming/DedupingOperatorTest.java   | 131 +
 .../streaming/UnboundedSourceWrapperTest.java   |  29 +--
 5 files changed, 393 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/392ed601/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
--
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
index 615eaea..9a93205 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
@@ -44,6 +44,7 @@ import 
org.apache.beam.runners.flink.translation.wrappers.streaming.SplittableDo
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WindowDoFnOperator;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WorkItemKeySelector;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.BoundedSourceWrapper;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.DedupingOperator;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.KvCoder;
@@ -73,12 +74,16 @@ import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.sdk.values.PCollectionView;
 import org.apache.beam.sdk.values.PValue;
 import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.ValueWithRecordId;
 import org.apache.beam.sdk.values.WindowingStrategy;
 import org.apache.flink.api.common.functions.FlatMapFunction;
 import org.apache.flink.api.common.functions.MapFunction;
 import org.apache.flink.api.common.functions.RichFlatMapFunction;
 import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.functions.KeySelector;
 import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.typeutils.GenericTypeInfo;
+import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
 import org.apache.flink.streaming.api.collector.selector.OutputSelector;
 import org.apache.flink.streaming.api.datastream.DataStream;
 import org.apache.flink.streaming.api.datastream.DataStreamSource;
@@ -148,20 +153,37 @@ class FlinkStreamingTransformTranslators {
 FlinkStreamingTranslationContext context) {
   PCollection output = context.getOutput(transform);
 
+  DataStream source;
+  DataStream> nonDedupSource;
   TypeInformation outputTypeInfo =
   context.getTypeInfo(context.getOutput(transform));
 
-  DataStream source;
+  Coder coder = context.getOutput(transform).getCoder();
+
+  TypeInformation> withIdTypeInfo =
+  new CoderTypeInformation<>(WindowedValue.getFullCoder(
+  ValueWithRecordId.ValueWithRecordIdCoder.of(coder),
+  output.getWindowingStrategy().getWindowFn().windowCoder()));
+
   try {
+
 UnboundedSourceWrapper sourceWrapper =
 new UnboundedSourceWrapper<>(
 context.getCurrentTransform().getFullName(),
 context.getPipelineOptions(),
 transform.getSource(),
 context.getExecutionEnvironment().getParallelism());
-source = context
+nonDedupSource = context
 .getExecutionEnvironment()
-
.addSource(sourceWrapper).name(transform.getName()).returns(outputTypeInfo);
+
.addSource(sourceWrapper).name(transform.getName()).returns(withIdTypeInfo);
+
+if (transform.getSource().requiresDeduping()) {
+  source = nonDedupSource.keyBy(
+  new 

[jira] [Commented] (BEAM-2223) java8 examples are not running

2017-05-08 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001789#comment-16001789
 ] 

Daniel Halperin commented on BEAM-2223:
---

I'm taking a look. Nothing relevant seems to have changed in the pom.xml 
recently, but it also doesn't look like it should have ever worked.

> java8 examples are not running
> --
>
> Key: BEAM-2223
> URL: https://issues.apache.org/jira/browse/BEAM-2223
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Ahmet Altay
>Assignee: Daniel Halperin
> Fix For: 2.0.0
>
>
> Could not run java8 examples any more with:
> mvn compile exec:java 
> -Dexec.mainClass=org.apache.beam.examples.complete.game.UserScore 
> -Dexec.args="--project= --dataset= 
> --tempLocation="
> Fails with:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/hamcrest/Matcher
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2703)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2904)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2913)
>   at java.lang.Class.getMethods(Class.java:1617)
>   at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
>   at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:127)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:371)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:606)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:544)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.initializeRegistry(PipelineOptionsFactory.java:570)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.(PipelineOptionsFactory.java:519)
>   at 
> org.apache.beam.examples.complete.game.UserScore.main(UserScore.java:226)
>   ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 35 more
> cc: [~kenn][~tgroh][~vikasrk][~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-2223) java8 examples are not running

2017-05-08 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-2223:
-

Assignee: Daniel Halperin  (was: Davor Bonaci)

> java8 examples are not running
> --
>
> Key: BEAM-2223
> URL: https://issues.apache.org/jira/browse/BEAM-2223
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Ahmet Altay
>Assignee: Daniel Halperin
> Fix For: 2.0.0
>
>
> Could not run java8 examples any more with:
> mvn compile exec:java 
> -Dexec.mainClass=org.apache.beam.examples.complete.game.UserScore 
> -Dexec.args="--project= --dataset= 
> --tempLocation="
> Fails with:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/hamcrest/Matcher
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2703)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2904)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2913)
>   at java.lang.Class.getMethods(Class.java:1617)
>   at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
>   at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:127)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:371)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:606)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:544)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.initializeRegistry(PipelineOptionsFactory.java:570)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.(PipelineOptionsFactory.java:519)
>   at 
> org.apache.beam.examples.complete.game.UserScore.main(UserScore.java:226)
>   ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.hamcrest.Matcher
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 35 more
> cc: [~kenn][~tgroh][~vikasrk][~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2222) Clean up readme files

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001785#comment-16001785
 ] 

ASF GitHub Bot commented on BEAM-:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/233


> Clean up readme files
> -
>
> Key: BEAM-
> URL: https://issues.apache.org/jira/browse/BEAM-
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
> Fix For: 2.0.0
>
>
> Move content from readme.md's to website.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam-site pull request #233: [BEAM-2222] Remove README.md references from th...

2017-05-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/233


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/3] beam-site git commit: This closes #233

2017-05-08 Thread altay
This closes #233


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/6f771784
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/6f771784
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/6f771784

Branch: refs/heads/asf-site
Commit: 6f77178418df3b61ab810d9cdef32e936491813d
Parents: 84468bd 35d4d4f
Author: Ahmet Altay 
Authored: Mon May 8 16:26:38 2017 -0700
Committer: Ahmet Altay 
Committed: Mon May 8 16:26:38 2017 -0700

--
 content/documentation/programming-guide/index.html | 4 ++--
 content/get-started/quickstart-py/index.html   | 2 +-
 content/get-started/wordcount-example/index.html   | 6 +++---
 src/get-started/quickstart-py.md   | 2 +-
 src/get-started/wordcount-example.md   | 6 +++---
 5 files changed, 10 insertions(+), 10 deletions(-)
--




[2/3] beam-site git commit: Regenerate website

2017-05-08 Thread altay
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/35d4d4f3
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/35d4d4f3
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/35d4d4f3

Branch: refs/heads/asf-site
Commit: 35d4d4f35af03418d1b9609aebeed005f0cfd729
Parents: d56bbc8
Author: Ahmet Altay 
Authored: Mon May 8 16:26:38 2017 -0700
Committer: Ahmet Altay 
Committed: Mon May 8 16:26:38 2017 -0700

--
 content/documentation/programming-guide/index.html | 4 ++--
 content/get-started/quickstart-py/index.html   | 2 +-
 content/get-started/wordcount-example/index.html   | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/35d4d4f3/content/documentation/programming-guide/index.html
--
diff --git a/content/documentation/programming-guide/index.html 
b/content/documentation/programming-guide/index.html
index bc71346..1e6e9a7 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -244,7 +244,7 @@
 
 
 import apache_beam as beam
-from apache_beam.utils.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import PipelineOptions
 
 p = beam.Pipeline(options=PipelineOptions())
 
@@ -268,7 +268,7 @@
 
 
 import apache_beam as beam
-from apache_beam.utils.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import PipelineOptions
 
 p = beam.Pipeline(options=PipelineOptions())
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/35d4d4f3/content/get-started/quickstart-py/index.html
--
diff --git a/content/get-started/quickstart-py/index.html 
b/content/get-started/quickstart-py/index.html
index 7c8f2d3..ac498d9 100644
--- a/content/get-started/quickstart-py/index.html
+++ b/content/get-started/quickstart-py/index.html
@@ -268,7 +268,7 @@ environment’s directories.
 
 For example, to run wordcount.py, 
run:
 
-python -m apache_beam.examples.wordcount --input 
README.md --output counts
+python -m apache_beam.examples.wordcount --input 
MANIFEST.in --output counts
 
 
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/35d4d4f3/content/get-started/wordcount-example/index.html
--
diff --git a/content/get-started/wordcount-example/index.html 
b/content/get-started/wordcount-example/index.html
index 030c3e2..5cc32f3 100644
--- a/content/get-started/wordcount-example/index.html
+++ b/content/get-started/wordcount-example/index.html
@@ -210,7 +210,7 @@
 
 Minimal WordCount demonstrates a simple pipeline that can read from a text 
file, apply transforms to tokenize and count the words, and write the data to 
an output text file. This example hard-codes the locations for its input and 
output files and doesn’t perform any error checking; it is intended to only 
show you the “bare bones” of creating a Beam pipeline. This lack of 
parameterization makes this particular pipeline less portable across different 
runners than standard Beam pipelines. In later examples, we will parameterize 
the pipeline’s input and output sources and show other best practices.
 
-To run this example, follow the instructions in the https://github.com/apache/beam/blob/master/examples/java/README.md#building-and-running;>Beam
 Examples README. To view the full code, see https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java;>MinimalWordCount.
+To run this example, follow the instructions in the Quickstart java or python. To view the full code, see 
https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java;>MinimalWordCount.
 
 Key Concepts:
 
@@ -380,7 +380,7 @@ Figure 1: The pipeline data flow.
 
 This section assumes that you have a good understanding of the basic 
concepts in building a pipeline. If you feel that you aren’t at that point 
yet, read the above section, Minimal 
WordCount.
 
-To run this example, follow the instructions in the https://github.com/apache/beam/blob/master/examples/java/README.md#building-and-running;>Beam
 Examples README. To view the full code, see https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/WordCount.java;>WordCount.
+To run this example, follow the instructions in the Quickstart java or python. To view the full code, see 
https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/WordCount.java;>WordCount.
 
 New 

[GitHub] beam pull request #2977: Remove old imports meant for backwards compatibilit...

2017-05-08 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2977

Remove old imports meant for backwards compatibility

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam 
BEAM-remove-backward-compatibility-stuff

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2977.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2977


commit 0d342db8b1feb2743a794cd2ce92b039576808c1
Author: Sourabh Bajaj 
Date:   2017-05-08T23:13:55Z

Remove old imports meant for backwards compatibility




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #2974: Cherry pick #2950 to release-2.0.0

2017-05-08 Thread jkff
This closes #2974: Cherry pick #2950 to release-2.0.0


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5493c6c8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5493c6c8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5493c6c8

Branch: refs/heads/release-2.0.0
Commit: 5493c6c8fbd3b1b6a2b6276cf083a9a82d6c6d4e
Parents: 8f18344 8f14c18
Author: Eugene Kirpichov 
Authored: Mon May 8 16:19:59 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 8 16:19:59 2017 -0700

--
 .../main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java  | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--




[1/2] beam git commit: This closes #2950

2017-05-08 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 8f1834432 -> 5493c6c8f


This closes #2950


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8f14c18d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8f14c18d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8f14c18d

Branch: refs/heads/release-2.0.0
Commit: 8f14c18d21439754071711c93c32ba757b0a0776
Parents: 8f18344
Author: Eugene Kirpichov 
Authored: Mon May 8 13:45:25 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 8 16:19:28 2017 -0700

--
 .../main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java  | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/8f14c18d/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
index ac6cb44..048fded 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubIO.java
@@ -435,7 +435,7 @@ public class PubsubIO {
* messages will only contain a {@link PubsubMessage#getPayload() payload}, 
but no {@link
* PubsubMessage#getAttributeMap() attributes}.
*/
-  public static Read readPubsubMessagesWithoutAttributes() {
+  public static Read readMessages() {
 return new AutoValue_PubsubIO_Read.Builder()
 .setCoder(PubsubMessagePayloadOnlyCoder.of())
 .setParseFn(new IdentityMessageFn())
@@ -448,7 +448,7 @@ public class PubsubIO {
* messages will contain both a {@link PubsubMessage#getPayload() payload} 
and {@link
* PubsubMessage#getAttributeMap() attributes}.
*/
-  public static Read readPubsubMessagesWithAttributes() {
+  public static Read readMessagesWithAttributes() {
 return new AutoValue_PubsubIO_Read.Builder()
 .setCoder(PubsubMessageWithAttributesCoder.of())
 .setParseFn(new IdentityMessageFn())
@@ -495,7 +495,7 @@ public class PubsubIO {
   }
 
   /** Returns A {@link PTransform} that writes to a Google Cloud Pub/Sub 
stream. */
-  public static Write writePubsubMessages() {
+  public static Write writeMessages() {
 return PubsubIO.write().withFormatFn(new 
IdentityMessageFn());
   }
 



[GitHub] beam pull request #2976: Actually run after-count tests.

2017-05-08 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2976

Actually run after-count tests.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam run-pipeline

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2976.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2976


commit 01bad9930199656b2f98c5f2166d0c3b647dccbd
Author: Robert Bradshaw 
Date:   2017-05-08T23:18:20Z

Actually run after-count tests.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2122) Writing to partitioned BigQuery tables from Dataflow is causing errors

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001773#comment-16001773
 ] 

ASF GitHub Bot commented on BEAM-2122:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2970


> Writing to partitioned BigQuery tables from Dataflow is causing errors
> --
>
> Key: BEAM-2122
> URL: https://issues.apache.org/jira/browse/BEAM-2122
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
> Environment: Running with Beam 0.7.0-SNAPSHOT version 48 for 
> beam-sdks-java-io-google-cloud-platform, 49 for beam-sdks-java-core and 
> beam-runners-google-cloud-dataflow-java in Eclipse using Dataflow service.
>Reporter: Matthias Baetens
>Assignee: Reuven Lax
>
> Using the latest Beam SNAPSHOT which has a new BigQuery connector and trying 
> to write to partitioned tables according to the docs (or this Stackoverflow 
> question 
> http://stackoverflow.com/questions/43505534/writing-different-values-to-different-bigquery-tables-in-apache-beam/43655461#43655461):
>   static class PartitionedTableGeneration
>   implements 
> SerializableFunction {
>   @ProcessElement
>   public TableDestination apply(ValueInSingleWindow 
> value) {
>   // String dayString =
>   // 
> DateTimeFormat.forPattern("_MM_dd").withZone(DateTimeZone.UTC)
>   String dayString = 
> DateTimeFormat.forPattern("MMdd").withZone(DateTimeZone.UTC)
>   .print(((IntervalWindow) 
> value.getWindow()).start());
>   TableDestination td = new TableDestination(
>   "projecet:dataset.table + '$' 
> dayString, "");
>   return td;
>   }
>   }
> causes the following issues when running (depending on the specification of 
> the dayString):
> 1. "Invalid table ID \"partitioned_sample$20150905\". Table IDs must be 
> alphanumeric (plus underscores) and must be at most 1024 characters long. 
> Also, Table decorators cannot be used.",
>  2. java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: 
> java.lang.RuntimeException: Failed to create load job with id prefix 
> ...
> "errorResult" : {
>   "message" : "Invalid date partitioned table suffix: 2015_11_26",
>   "reason" : "invalid"
> }
> Writing to sharded tables (without the '$'-sign) is working fine.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #2970: Cherry pick #2953 to release-2.0.0

2017-05-08 Thread jkff
This closes #2970: Cherry pick #2953 to release-2.0.0


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8f183443
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8f183443
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8f183443

Branch: refs/heads/release-2.0.0
Commit: 8f18344329d3af5882886a50a841d01062d408ef
Parents: 7dfc455 40bdbcb
Author: Eugene Kirpichov 
Authored: Mon May 8 16:06:15 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 8 16:06:15 2017 -0700

--
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java|  7 +++
 .../bigquery/DynamicDestinationsHelpers.java|  3 ++-
 .../sdk/io/gcp/bigquery/TableDestination.java   | 13 +---
 .../io/gcp/bigquery/TableDestinationCoder.java  | 19 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 21 ++--
 5 files changed, 36 insertions(+), 27 deletions(-)
--




[1/2] beam git commit: This closes #2953

2017-05-08 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 7dfc45563 -> 8f1834432


This closes #2953


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/40bdbcb2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/40bdbcb2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/40bdbcb2

Branch: refs/heads/release-2.0.0
Commit: 40bdbcb28d5507c601d0791059c3f6ac9581a8f8
Parents: 7dfc455
Author: Eugene Kirpichov 
Authored: Mon May 8 13:41:01 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 8 16:00:33 2017 -0700

--
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java|  7 +++
 .../bigquery/DynamicDestinationsHelpers.java|  3 ++-
 .../sdk/io/gcp/bigquery/TableDestination.java   | 13 +---
 .../io/gcp/bigquery/TableDestinationCoder.java  | 19 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 21 ++--
 5 files changed, 36 insertions(+), 27 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/40bdbcb2/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
index 304864a..8fb05ff 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
@@ -641,7 +641,6 @@ public class BigQueryIO {
   public static  Write write() {
 return new AutoValue_BigQueryIO_Write.Builder()
 .setValidate(true)
-.setTableDescription("")
 .setBigQueryServices(new BigQueryServicesImpl())
 .setCreateDisposition(Write.CreateDisposition.CREATE_IF_NEEDED)
 .setWriteDisposition(Write.WriteDisposition.WRITE_EMPTY)
@@ -684,7 +683,7 @@ public class BigQueryIO {
 abstract CreateDisposition getCreateDisposition();
 abstract WriteDisposition getWriteDisposition();
 /** Table description. Default is empty. */
-abstract String getTableDescription();
+@Nullable abstract String getTableDescription();
 /** An option to indicate if table validation is desired. Default is true. 
*/
 abstract boolean getValidate();
 abstract BigQueryServices getBigQueryServices();
@@ -1027,8 +1026,8 @@ public class BigQueryIO {
   .withLabel("Table WriteDisposition"))
   .addIfNotDefault(DisplayData.item("validation", getValidate())
   .withLabel("Validation Enabled"), true)
-  .addIfNotDefault(DisplayData.item("tableDescription", 
getTableDescription())
-  .withLabel("Table Description"), "");
+  .addIfNotNull(DisplayData.item("tableDescription", 
getTableDescription())
+  .withLabel("Table Description"));
 }
 
 /**

http://git-wip-us.apache.org/repos/asf/beam/blob/40bdbcb2/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java
index 72a3314..530e2b6 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java
@@ -40,9 +40,10 @@ class DynamicDestinationsHelpers {
*/
   static class ConstantTableDestinations extends DynamicDestinations {
 private final ValueProvider tableSpec;
+@Nullable
 private final String tableDescription;
 
-ConstantTableDestinations(ValueProvider tableSpec, String 
tableDescription) {
+ConstantTableDestinations(ValueProvider tableSpec, @Nullable 
String tableDescription) {
   this.tableSpec = tableSpec;
   this.tableDescription = tableDescription;
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/40bdbcb2/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
 

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2730

2017-05-08 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2975: Don't log full exception traceback in direct runner...

2017-05-08 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2975

Don't log full exception traceback in direct runner.

It will be logged later if not caught.

This reduces duplication for failing tests, and noise for tests
expecting failure.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam py-exceptions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2975.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2975


commit 3f5833e511972ba391f21d15a74b0d0961b22ca1
Author: Robert Bradshaw 
Date:   2017-05-08T23:06:30Z

Don't log full exception traceback in direct runner.

It will be logged later if not caught.

This reduces duplication for failing tests, and noise for tests
expecting failure.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-831) ParDo Chaining

2017-05-08 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001739#comment-16001739
 ] 

Davor Bonaci commented on BEAM-831:
---

I think everything is cherry-picked into the release branch. Resolving.

> ParDo Chaining
> --
>
> Key: BEAM-831
> URL: https://issues.apache.org/jira/browse/BEAM-831
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex
>Reporter: Thomas Weise
>Assignee: Chinmay Kolhatkar
> Fix For: 2.0.0
>
>
> Current state of Apex runner creates a plan that will place each operator in 
> a separate container (which would be processes when running on a YARN 
> cluster). Often the ParDo operators can be collocated in same thread or 
> container. Use Apex affinity/stream locality attributes for more efficient 
> execution plan.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-831) ParDo Chaining

2017-05-08 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci resolved BEAM-831.
---
Resolution: Fixed

> ParDo Chaining
> --
>
> Key: BEAM-831
> URL: https://issues.apache.org/jira/browse/BEAM-831
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex
>Reporter: Thomas Weise
>Assignee: Chinmay Kolhatkar
> Fix For: 2.0.0
>
>
> Current state of Apex runner creates a plan that will place each operator in 
> a separate container (which would be processes when running on a YARN 
> cluster). Often the ParDo operators can be collocated in same thread or 
> container. Use Apex affinity/stream locality attributes for more efficient 
> execution plan.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2210) PubsubIO.readPubsubMessagesWithoutAttributes is awkward

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001738#comment-16001738
 ] 

ASF GitHub Bot commented on BEAM-2210:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2974

Cherrypick #2950 into release branch

Cherrypick of #2950 [BEAM-2210] Shorten awkward function names

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2950

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2974


commit 008783cb00a25169261704a2a20104bf6a901993
Author: Eugene Kirpichov 
Date:   2017-05-08T20:45:25Z

This closes #2950




> PubsubIO.readPubsubMessagesWithoutAttributes is awkward
> ---
>
> Key: BEAM-2210
> URL: https://issues.apache.org/jira/browse/BEAM-2210
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
> Fix For: 2.0.0
>
>
> This is the default mode of using PubSub, but the function name is awkward. 
> We should change this to simply readMessages.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2076) DirectRunner: minimal transitive API surface

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001737#comment-16001737
 ] 

ASF GitHub Bot commented on BEAM-2076:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2973

[BEAM-2076] Shade Additional Direct Runner Dependencies

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Add a DirectRunner API Surface Test.

Repackaging runners-core appears to cause the direct runner to hang.
Investigations in progress.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam direct_runner_deps_less_core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2973


commit 6ce163197d5da03d4750fc89bf617863b385d071
Author: Thomas Groh 
Date:   2017-05-03T17:53:12Z

Shade Additional Direct Runner Dependencies

Add a DirectRunner API Surface Test.

Repackaging runners-core appears to cause the direct runner to hang.
Investigations in progress.




> DirectRunner: minimal transitive API surface
> 
>
> Key: BEAM-2076
> URL: https://issues.apache.org/jira/browse/BEAM-2076
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
> Fix For: 2.0.0
>
>
> The {{DirectRunner}} is likely to accidentally be on many users' classpath 
> when they are running on other runners. As such, it should have a minimal 
> transitive API surface, shading things it needs directly and need not expose.
> My base assumption is that {{runners-core}} should be shaded. There may be 
> others tho -- this merits a bit of a look.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2974: Cherrypick #2950 into release branch

2017-05-08 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2974

Cherrypick #2950 into release branch

Cherrypick of #2950 [BEAM-2210] Shorten awkward function names

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2950

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2974.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2974


commit 008783cb00a25169261704a2a20104bf6a901993
Author: Eugene Kirpichov 
Date:   2017-05-08T20:45:25Z

This closes #2950




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2973: [BEAM-2076] Shade Additional Direct Runner Dependen...

2017-05-08 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2973

[BEAM-2076] Shade Additional Direct Runner Dependencies

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Add a DirectRunner API Surface Test.

Repackaging runners-core appears to cause the direct runner to hang.
Investigations in progress.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam direct_runner_deps_less_core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2973


commit 6ce163197d5da03d4750fc89bf617863b385d071
Author: Thomas Groh 
Date:   2017-05-03T17:53:12Z

Shade Additional Direct Runner Dependencies

Add a DirectRunner API Surface Test.

Repackaging runners-core appears to cause the direct runner to hang.
Investigations in progress.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2972: [BEAM-2211] Move PathValidator into GCP-Core

2017-05-08 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2972

[BEAM-2211] Move PathValidator into GCP-Core

For now, this is not a Beam concept.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam b2211-path-validator

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2972


commit cd6dc1c4cfc940095ee1dd4c7c9d6a080d425d25
Author: Dan Halperin 
Date:   2017-05-08T22:37:29Z

[BEAM-2211] DataflowRunner: remove validation of file read/write paths

Now that users can implement and register custom FileSystems,
we can no longer really effectively validate filesystems they
can read or write files from. They can even register file://
to point to some HDFS path, e.g.,

commit b75596b2b75404ed695ec49bb9793b4b1048129e
Author: Dan Halperin 
Date:   2017-05-08T23:01:40Z

[BEAM-2211] Move PathValidator into GCP-Core

For now, this does not need to be a Beam concept




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2184) OutputTimeFn is not a Fn: in Python, rename to TimestampCombiner

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001732#comment-16001732
 ] 

ASF GitHub Bot commented on BEAM-2184:
--

GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2971

[BEAM-2184] Rename OutputTimeFn to TimestampCombiner.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam jira-2184

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2971






> OutputTimeFn is not a Fn: in Python, rename to TimestampCombiner
> 
>
> Key: BEAM-2184
> URL: https://issues.apache.org/jira/browse/BEAM-2184
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model-runner-api, sdk-py
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
> Fix For: 2.0.0
>
>
> See also https://issues.apache.org/jira/browse/BEAM-1327



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2971: [BEAM-2184] Rename OutputTimeFn to TimestampCombine...

2017-05-08 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2971

[BEAM-2184] Rename OutputTimeFn to TimestampCombiner.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam jira-2184

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2971






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2970: TableDescription is allowed to be null.

2017-05-08 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2970

TableDescription is allowed to be null.

https://github.com/apache/beam/pull/2953 BEAM-2122 Allow table descriptions 
to be null

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2953

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2970






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2122) Writing to partitioned BigQuery tables from Dataflow is causing errors

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001730#comment-16001730
 ] 

ASF GitHub Bot commented on BEAM-2122:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2970

TableDescription is allowed to be null.

https://github.com/apache/beam/pull/2953 BEAM-2122 Allow table descriptions 
to be null

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2953

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2970






> Writing to partitioned BigQuery tables from Dataflow is causing errors
> --
>
> Key: BEAM-2122
> URL: https://issues.apache.org/jira/browse/BEAM-2122
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
> Environment: Running with Beam 0.7.0-SNAPSHOT version 48 for 
> beam-sdks-java-io-google-cloud-platform, 49 for beam-sdks-java-core and 
> beam-runners-google-cloud-dataflow-java in Eclipse using Dataflow service.
>Reporter: Matthias Baetens
>Assignee: Reuven Lax
>
> Using the latest Beam SNAPSHOT which has a new BigQuery connector and trying 
> to write to partitioned tables according to the docs (or this Stackoverflow 
> question 
> http://stackoverflow.com/questions/43505534/writing-different-values-to-different-bigquery-tables-in-apache-beam/43655461#43655461):
>   static class PartitionedTableGeneration
>   implements 
> SerializableFunction {
>   @ProcessElement
>   public TableDestination apply(ValueInSingleWindow 
> value) {
>   // String dayString =
>   // 
> DateTimeFormat.forPattern("_MM_dd").withZone(DateTimeZone.UTC)
>   String dayString = 
> DateTimeFormat.forPattern("MMdd").withZone(DateTimeZone.UTC)
>   .print(((IntervalWindow) 
> value.getWindow()).start());
>   TableDestination td = new TableDestination(
>   "projecet:dataset.table + '$' 
> dayString, "");
>   return td;
>   }
>   }
> causes the following issues when running (depending on the specification of 
> the dayString):
> 1. "Invalid table ID \"partitioned_sample$20150905\". Table IDs must be 
> alphanumeric (plus underscores) and must be at most 1024 characters long. 
> Also, Table decorators cannot be used.",
>  2. java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: 
> java.lang.RuntimeException: Failed to create load job with id prefix 
> ...
> "errorResult" : {
>   "message" : "Invalid date partitioned table suffix: 2015_11_26",
>   "reason" : "invalid"
> }
> Writing to sharded tables (without the '$'-sign) is working fine.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2122) Writing to partitioned BigQuery tables from Dataflow is causing errors

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001727#comment-16001727
 ] 

ASF GitHub Bot commented on BEAM-2122:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2969

Cherrypick #2953 into release branch

https://github.com/apache/beam/pull/2953 BEAM-2122 Allow table descriptions 
to be null

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2953

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2969.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2969


commit 56262d8f3064031606e01801eb36322cb3a38b95
Author: Davor Bonaci 
Date:   2017-05-05T22:53:46Z

Update version number from 0.7.0-SNAPSHOT to 2.0.0-SNAPSHOT

commit 1a77e208e440ef34b7d0f7e1104d5c8c5ee04474
Author: Dan Halperin 
Date:   2017-05-05T23:32:54Z

This cherry-picks #2926

commit b9c8cfe227d7a6bcb258b93717969a78a31dac07
Author: Dan Halperin 
Date:   2017-05-06T00:24:12Z

This closes #2932

commit d2fa51b78892f7ebf13da1a5fc7bb45755440a5f
Author: Davor Bonaci 
Date:   2017-05-05T23:17:43Z

Cherry-pick pull request #2911 into release-2.0.0

commit 67ea7ae4d2144525582c7de03b17d06daa9f35bb
Author: Davor Bonaci 
Date:   2017-05-05T23:20:28Z

Cherry-pick pull request #2907 into release-2.0.0 branch

commit 96aeb97cc41b4a93bec7d72cff4887e9f358eef2
Author: Dan Halperin 
Date:   2017-05-06T00:26:41Z

This closes #2931

commit 1ad3f84c68235eeae6927f283180829de2f0aa33
Author: Davor Bonaci 
Date:   2017-05-06T00:36:09Z

Set Dataflow runner's worker container image for version 2.0.0

commit f97e52b3b677a5a35bac7a2012366837b7bb15cb
Author: Davor Bonaci 
Date:   2017-05-06T01:46:31Z

[maven-release-plugin] prepare release v2.0.0-RC1

commit 3b7a62301fba55f3caf68d8876b39c53e80f171a
Author: Davor Bonaci 
Date:   2017-05-07T02:16:30Z

[maven-release-plugin] rollback changes from release preparation of 
v2.0.0-RC1

commit 6eab5c9465bda3da4d8a1ea9f73a74e9c8faec85
Author: chinmaykolhatkar 
Date:   2017-03-01T11:29:46Z

[BEAM-831] ParDo Fusion of Apex Runner

commit 3f5282d515fa53516fda6d0376cc912560fd6d85
Author: Thomas Weise 
Date:   2017-05-05T13:45:34Z

[BEAM-831] Fix chaining, add test.
closes #2216

commit bec30b3beaa241483814e859d781f1e04479394b
Author: Ahmet Altay 
Date:   2017-05-08T06:31:42Z

Cherry-pick pull request #2946 in 2.0.0 release branch.
Fix typo in datastore_wordcount.py.

commit 021468e03e5a5b0851e21f333ebc07060dc471cd
Author: Ahmet Altay 
Date:   2017-05-08T17:25:40Z

This closes #2955

commit 72241117cbf2d9682054a69ea895e4c6f6a93146
Author: Sourabh Bajaj 
Date:   2017-05-07T20:55:20Z

[BEAM-2206] Move pipelineOptions into options modules

commit c4f234c8cfb349d877eeb5c62eec7d80e844be07
Author: Sourabh Bajaj 
Date:   2017-05-08T01:08:49Z

Only cythonize files within apache_beam

commit 741bf7442d88a9e30064bd132046e5db55e7a740
Author: Ahmet Altay 
Date:   2017-05-08T21:02:54Z

This closes #2964

commit 25cda3abb0f6442f02ab6f31f0a4850be66d09d9
Author: Sourabh Bajaj 
Date:   2017-05-06T02:15:48Z

Update python dataflow worker

commit 265405bc85f9f705776e88680e3af26fab4e7de3
Author: Ahmet Altay 
Date:   2017-05-08T21:06:17Z

This closes #2943

commit e0faeeef80211ddbc632e622ecefc1e005c5ca29
Author: Dan Halperin 
Date:   2017-05-06T02:06:03Z

[BEAM-2212] FileBasedSource: improve message when logging.

ValueProvider should not be printed, rather the string instead.

commit bff819a9858c79c6c3232b4c03f262421d325c00
Author: Dan Halperin 
Date:   2017-05-08T16:59:16Z

[BEAM-2212] FileBasedSource: refactor to remove uses of 
fileOrPatternSpec.get()

Makes it less likely to have errors from printing ValueProviders instead of 
runtime values

commit 94d104064cc0e209fa54dd63f4cbe99cd6f2d591
Author: Dan Halperin 
Date:   2017-05-08T21:15:48Z

This closes #2958

commit 4ec11de1a8b876a8263c95c21b7ce830fb4e962b
Author: Dan Halperin 
Date:   2017-05-06T00:16:34Z

[BEAM-2190] pom.xml: do a better job of dependency management

Even if Beam appears to have the correct dependencies, we cannot
guarantee that modules that depend on us transitively get the right
dependencies. For 

[GitHub] beam pull request #2969: Cherrypick #2953 into release branch

2017-05-08 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2969

Cherrypick #2953 into release branch

https://github.com/apache/beam/pull/2953 BEAM-2122 Allow table descriptions 
to be null

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam cp-2953

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2969.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2969


commit 56262d8f3064031606e01801eb36322cb3a38b95
Author: Davor Bonaci 
Date:   2017-05-05T22:53:46Z

Update version number from 0.7.0-SNAPSHOT to 2.0.0-SNAPSHOT

commit 1a77e208e440ef34b7d0f7e1104d5c8c5ee04474
Author: Dan Halperin 
Date:   2017-05-05T23:32:54Z

This cherry-picks #2926

commit b9c8cfe227d7a6bcb258b93717969a78a31dac07
Author: Dan Halperin 
Date:   2017-05-06T00:24:12Z

This closes #2932

commit d2fa51b78892f7ebf13da1a5fc7bb45755440a5f
Author: Davor Bonaci 
Date:   2017-05-05T23:17:43Z

Cherry-pick pull request #2911 into release-2.0.0

commit 67ea7ae4d2144525582c7de03b17d06daa9f35bb
Author: Davor Bonaci 
Date:   2017-05-05T23:20:28Z

Cherry-pick pull request #2907 into release-2.0.0 branch

commit 96aeb97cc41b4a93bec7d72cff4887e9f358eef2
Author: Dan Halperin 
Date:   2017-05-06T00:26:41Z

This closes #2931

commit 1ad3f84c68235eeae6927f283180829de2f0aa33
Author: Davor Bonaci 
Date:   2017-05-06T00:36:09Z

Set Dataflow runner's worker container image for version 2.0.0

commit f97e52b3b677a5a35bac7a2012366837b7bb15cb
Author: Davor Bonaci 
Date:   2017-05-06T01:46:31Z

[maven-release-plugin] prepare release v2.0.0-RC1

commit 3b7a62301fba55f3caf68d8876b39c53e80f171a
Author: Davor Bonaci 
Date:   2017-05-07T02:16:30Z

[maven-release-plugin] rollback changes from release preparation of 
v2.0.0-RC1

commit 6eab5c9465bda3da4d8a1ea9f73a74e9c8faec85
Author: chinmaykolhatkar 
Date:   2017-03-01T11:29:46Z

[BEAM-831] ParDo Fusion of Apex Runner

commit 3f5282d515fa53516fda6d0376cc912560fd6d85
Author: Thomas Weise 
Date:   2017-05-05T13:45:34Z

[BEAM-831] Fix chaining, add test.
closes #2216

commit bec30b3beaa241483814e859d781f1e04479394b
Author: Ahmet Altay 
Date:   2017-05-08T06:31:42Z

Cherry-pick pull request #2946 in 2.0.0 release branch.
Fix typo in datastore_wordcount.py.

commit 021468e03e5a5b0851e21f333ebc07060dc471cd
Author: Ahmet Altay 
Date:   2017-05-08T17:25:40Z

This closes #2955

commit 72241117cbf2d9682054a69ea895e4c6f6a93146
Author: Sourabh Bajaj 
Date:   2017-05-07T20:55:20Z

[BEAM-2206] Move pipelineOptions into options modules

commit c4f234c8cfb349d877eeb5c62eec7d80e844be07
Author: Sourabh Bajaj 
Date:   2017-05-08T01:08:49Z

Only cythonize files within apache_beam

commit 741bf7442d88a9e30064bd132046e5db55e7a740
Author: Ahmet Altay 
Date:   2017-05-08T21:02:54Z

This closes #2964

commit 25cda3abb0f6442f02ab6f31f0a4850be66d09d9
Author: Sourabh Bajaj 
Date:   2017-05-06T02:15:48Z

Update python dataflow worker

commit 265405bc85f9f705776e88680e3af26fab4e7de3
Author: Ahmet Altay 
Date:   2017-05-08T21:06:17Z

This closes #2943

commit e0faeeef80211ddbc632e622ecefc1e005c5ca29
Author: Dan Halperin 
Date:   2017-05-06T02:06:03Z

[BEAM-2212] FileBasedSource: improve message when logging.

ValueProvider should not be printed, rather the string instead.

commit bff819a9858c79c6c3232b4c03f262421d325c00
Author: Dan Halperin 
Date:   2017-05-08T16:59:16Z

[BEAM-2212] FileBasedSource: refactor to remove uses of 
fileOrPatternSpec.get()

Makes it less likely to have errors from printing ValueProviders instead of 
runtime values

commit 94d104064cc0e209fa54dd63f4cbe99cd6f2d591
Author: Dan Halperin 
Date:   2017-05-08T21:15:48Z

This closes #2958

commit 4ec11de1a8b876a8263c95c21b7ce830fb4e962b
Author: Dan Halperin 
Date:   2017-05-06T00:16:34Z

[BEAM-2190] pom.xml: do a better job of dependency management

Even if Beam appears to have the correct dependencies, we cannot
guarantee that modules that depend on us transitively get the right
dependencies. For example, even though grpc-protobuf-lite has
protobuf-lite excluded, and the Maven Enforcer banned-dependencies
check passes... if a user happens to get a transitive dependency on
grpc-all first, they may pull in grpc-protobuf from 

  1   2   3   4   >