Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3581

2017-05-01 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992406#comment-15992406
 ] 

ASF GitHub Bot commented on BEAM-59:


GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2818

[BEAM-59] Delete old restrictions on output file paths

These predate Apache Beam and are no longer relevant now that Text and Avro 
are implemented
in the SDK.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam remove-old-validation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2818.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2818


commit 50d7ff5380dc038a0797a313414fd56ab291cabb
Author: Dan Halperin 
Date:   2017-05-02T06:36:57Z

[BEAM-59] Delete old restrictions on output file paths

These predate Apache Beam and are no longer relevant now that Text and Avro 
are implemented
in the SDK




> Switch from IOChannelFactory to FileSystems
> -------------------------------------------
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs
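[Editor's note] The issue above argues for per-scheme, independently configured backends instead of one global IOChannelFactory. The following is a minimal illustrative sketch of that dispatch-by-URI-scheme idea; the names (SchemeRegistrySketch, FileSystemBackend, resolve) are hypothetical and are not Beam's actual FileSystems API.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/**
 * Toy sketch of scheme-based filesystem dispatch: each backend
 * ("gs", "s3", "hdfs", ...) registers independently, so each can carry
 * its own credentials and configuration rather than sharing global state.
 */
public class SchemeRegistrySketch {
  /** Minimal stand-in for a pluggable filesystem backend. */
  interface FileSystemBackend {
    String describe(String path);
  }

  private final Map<String, FileSystemBackend> backends = new HashMap<>();

  /** Register a backend for a URI scheme. */
  public void register(String scheme, FileSystemBackend backend) {
    backends.put(scheme, backend);
  }

  /** Dispatch a path spec to the backend registered for its scheme. */
  public String resolve(String spec) {
    String scheme = URI.create(spec).getScheme();
    if (scheme == null) {
      scheme = "file"; // treat scheme-less paths as local files
    }
    FileSystemBackend backend = backends.get(scheme);
    if (backend == null) {
      throw new IllegalArgumentException("No backend for scheme: " + scheme);
    }
    return backend.describe(spec);
  }
}
```

Adding a new backend then means registering one more scheme, with no change to the core dispatch code, which is the extensibility property the issue asks for.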



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2818: [BEAM-59] Delete old restrictions on output file pa...

2017-05-01 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2818

[BEAM-59] Delete old restrictions on output file paths

These predate Apache Beam and are no longer relevant now that Text and Avro 
are implemented
in the SDK.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam remove-old-validation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2818.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2818


commit 50d7ff5380dc038a0797a313414fd56ab291cabb
Author: Dan Halperin 
Date:   2017-05-02T06:36:57Z

[BEAM-59] Delete old restrictions on output file paths

These predate Apache Beam and are no longer relevant now that Text and Avro 
are implemented
in the SDK




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2063) type DATE/TIME/TIMESTAMP support

2017-05-01 Thread James Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992403#comment-15992403
 ] 

James Xu commented on BEAM-2063:


Can I take this one?

> type DATE/TIME/TIMESTAMP support
> --------------------------------
>
> Key: BEAM-2063
> URL: https://issues.apache.org/jira/browse/BEAM-2063
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>
> Support data type DATE/TIME/TIMESTAMP, which could be used in aggregation.
> New data type support includes these changes:
> 1. type mapping in {{BeamSQLRow}}, and coder support;
> 2. {{BeamSqlPrimitive}} for static values;
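[Editor's note] As a rough illustration of the "type mapping" change described above, the sketch below maps the SQL DATE/TIME/TIMESTAMP type codes to the Java classes a row field could hold. The class name SqlDateTypeMappingSketch and the exact class choices are assumptions for illustration, not the actual BeamSQLRow code.

```java
import java.sql.Types;

/** Toy mapping from java.sql.Types codes to Java field classes. */
public class SqlDateTypeMappingSketch {
  /** Return the Java class a row field of the given SQL type would hold. */
  public static Class<?> javaTypeFor(int sqlType) {
    switch (sqlType) {
      case Types.DATE:      return java.sql.Date.class;      // calendar date
      case Types.TIME:      return java.sql.Time.class;      // time of day
      case Types.TIMESTAMP: return java.sql.Timestamp.class; // date + time
      default:
        throw new UnsupportedOperationException("Unsupported type: " + sqlType);
    }
  }
}
```

A coder for these types would then serialize the chosen classes, and BeamSqlPrimitive would validate that a static value's class matches the declared SQL type.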





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3580

2017-05-01 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3579

2017-05-01 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_JDBC #168

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Fn API support for Python.

[robertwb] Rename OutputValue to TaggedOutput.

[lcwik] Copy CloudObject to the Dataflow Module

[lcwik] [BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions

[kirpichov] Fixes javadoc of TextIO to not point to AvroIO

[kirpichov] Moves AvroSink to upper level

[kirpichov] Removes AvroIO.Read.Bound

[kirpichov] Adds AvroIO.readGenericRecords()

[kirpichov] Converts AvroIO.Read to AutoValue

[kirpichov] Removes AvroIO.Write.Bound

[kirpichov] Moves AvroIO.Read.withSchema into read()

[kirpichov] Converts AvroIO.Write to AutoValue; adds writeGenericRecords()

[kirpichov] Moves AvroIO.write().withSchema into write()

[kirpichov] Scattered minor improvements per review comments

[dhalperi] maptask_executor_runner_test: build fix

[lcwik] [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base

[lcwik] [BEAM-2005] Fix build break, ignore test due to change in

[lcwik] Remove aggregators from DoFn contexts and internal SDK usage

[dhalperi] [BEAM-59] Core tests: stop using gs:// paths

[dhalperi] [BEAM-59] TFRecordIOTest: cleanup

[dhalperi] [BEAM-59] DataflowRunnerTests: configure FileSystems in test

[dhalperi] [BEAM-59] AvroIOTest: use absolute paths for display data

--
[...truncated 750.47 KB...]
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.doWork(DataflowWorker.java:272)
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:244)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:127)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:94)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
(be1c279faf6e2425): java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: org.postgresql.util.PSQLException: The connection attempt failed.
at com.google.cloud.dataflow.worker.runners.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:289)
at com.google.cloud.dataflow.worker.runners.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:261)
at com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:55)
at com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:43)
at com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:78)
at com.google.cloud.dataflow.worker.runners.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:152)
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.doWork(DataflowWorker.java:272)
at com.google.cloud.dataflow.worker.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:244)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:127)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
at com.google.cloud.dataflow.worker.runners.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:94)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: org.postgresql.util.PSQLException: The connection attempt failed.
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at org.apache.beam.sdk.io.jdbc.JdbcIO$Read$ReadFn$auxiliary$i25EH9aw.invokeSetup(Unknown Source)
at com.google.cloud.dataflow.worker.runners.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:66)
at com.google.cloud.dataflow.worker.runners.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:48)
at com.google.cloud.dataflow.worker.runners.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:102)
at com.google.cloud.dataflow.worker.runners.worker.Defau

Build failed in Jenkins: beam_PerformanceTests_Dataflow #362

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Fn API support for Python.

[robertwb] Rename OutputValue to TaggedOutput.

[lcwik] Copy CloudObject to the Dataflow Module

[lcwik] [BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions

[kirpichov] Fixes javadoc of TextIO to not point to AvroIO

[kirpichov] Moves AvroSink to upper level

[kirpichov] Removes AvroIO.Read.Bound

[kirpichov] Adds AvroIO.readGenericRecords()

[kirpichov] Converts AvroIO.Read to AutoValue

[kirpichov] Removes AvroIO.Write.Bound

[kirpichov] Moves AvroIO.Read.withSchema into read()

[kirpichov] Converts AvroIO.Write to AutoValue; adds writeGenericRecords()

[kirpichov] Moves AvroIO.write().withSchema into write()

[kirpichov] Scattered minor improvements per review comments

[dhalperi] maptask_executor_runner_test: build fix

[lcwik] [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base

[lcwik] [BEAM-2005] Fix build break, ignore test due to change in

[lcwik] Remove aggregators from DoFn contexts and internal SDK usage

[dhalperi] [BEAM-59] Core tests: stop using gs:// paths

[dhalperi] [BEAM-59] TFRecordIOTest: cleanup

[dhalperi] [BEAM-59] DataflowRunnerTests: configure FileSystems in test

[dhalperi] [BEAM-59] AvroIOTest: use absolute paths for display data

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam5 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/* +refs/pull/*:refs/remotes/origin/pr/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision e92ead58cd661ff7ee952278f2983a578bdd2412 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e92ead58cd661ff7ee952278f2983a578bdd2412
 > git rev-list b8131fe935d00db7f760b6410a4eb2239db1d25a # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson7353765744138912579.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson5378899830644152793.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson3104503862056589697.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in /usr/local/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied (use --upgrade to upgrade): colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.11 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in /home/jenkins/.local/lib/python2.7/site-packages (from -r PerfKitBenchmarker/requirements.txt (line 23))
Requir

[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992366#comment-15992366
 ] 

ASF GitHub Bot commented on BEAM-775:
--------------------------------------

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2817

[BEAM-775] Remove Aggregators from StatefulDoFn runner

This is a clone of #2744 rebased on master to see if it's green. Will close
and merge #2744 if it passes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam merge-pr-2744

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2817.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2817


commit 31247c6510ff00563ba1869dd1d3b7abb186f5b6
Author: Pablo 
Date:   2017-04-27T16:43:41Z

Remove Aggregators from StatefulDoFn runner

commit 388b84c959595ee877ee8bb4745dac6d510dd374
Author: Pablo 
Date:   2017-04-27T23:10:48Z

Removing Aggregator from core runner code

commit 4d6d878a77f1f2a754607bba8da285438a39d96f
Author: Pablo 
Date:   2017-04-28T04:53:16Z

Remove accumulators from DoFn tester.




> Remove Aggregators from the Java SDK
> ------------------------------------
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>






[GitHub] beam pull request #2817: [BEAM-775] Remove Aggregators from StatefulDoFn run...

2017-05-01 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2817

[BEAM-775] Remove Aggregators from StatefulDoFn runner

This is a clone of #2744 rebased on master to see if it's green. Will close
and merge #2744 if it passes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam merge-pr-2744

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2817.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2817


commit 31247c6510ff00563ba1869dd1d3b7abb186f5b6
Author: Pablo 
Date:   2017-04-27T16:43:41Z

Remove Aggregators from StatefulDoFn runner

commit 388b84c959595ee877ee8bb4745dac6d510dd374
Author: Pablo 
Date:   2017-04-27T23:10:48Z

Removing Aggregator from core runner code

commit 4d6d878a77f1f2a754607bba8da285438a39d96f
Author: Pablo 
Date:   2017-04-28T04:53:16Z

Remove accumulators from DoFn tester.






[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992362#comment-15992362
 ] 

ASF GitHub Bot commented on BEAM-59:


Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2814


> Switch from IOChannelFactory to FileSystems
> -------------------------------------------
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs





[GitHub] beam pull request #2814: [BEAM-59] Minor changes that can be separated out f...

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2814




[3/5] beam git commit: [BEAM-59] Core tests: stop using gs:// paths

2017-05-01 Thread dhalperi
[BEAM-59] Core tests: stop using gs:// paths

Local filesystem is just as good an example, and in the
future these may break because GCS filesystem is not present
in sdks/java/core.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e83e5626
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e83e5626
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e83e5626

Branch: refs/heads/master
Commit: e83e56264d4862d9cfe499f865cd49c24accb02a
Parents: 87cf88a
Author: Dan Halperin 
Authored: Mon May 1 18:43:23 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 22:43:54 2017 -0700

--
 .../test/java/org/apache/beam/sdk/io/AvroIOTest.java| 12 ++--
 .../test/java/org/apache/beam/sdk/io/TextIOTest.java|  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e83e5626/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
index d14d9b2..790555a 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
@@ -102,8 +102,8 @@ public class AvroIOTest {
 
   @Test
   public void testAvroIOGetName() {
-assertEquals("AvroIO.Read", 
AvroIO.read(String.class).from("gs://bucket/foo*/baz").getName());
-assertEquals("AvroIO.Write", 
AvroIO.write(String.class).to("gs://bucket/foo/baz").getName());
+assertEquals("AvroIO.Read", 
AvroIO.read(String.class).from("/tmp/foo*/baz").getName());
+assertEquals("AvroIO.Write", 
AvroIO.write(String.class).to("/tmp/foo/baz").getName());
   }
 
   @DefaultCoder(AvroCoder.class)
@@ -403,14 +403,14 @@ public class AvroIOTest {
   @Test
   public void testWriteWithDefaultCodec() throws Exception {
 AvroIO.Write write = AvroIO.write(String.class)
-.to("gs://bucket/foo/baz");
+.to("/tmp/foo/baz");
 assertEquals(CodecFactory.deflateCodec(6).toString(), 
write.getCodec().toString());
   }
 
   @Test
   public void testWriteWithCustomCodec() throws Exception {
 AvroIO.Write write = AvroIO.write(String.class)
-.to("gs://bucket/foo/baz")
+.to("/tmp/foo/baz")
 .withCodec(CodecFactory.snappyCodec());
 assertEquals(SNAPPY_CODEC, write.getCodec().toString());
   }
@@ -419,7 +419,7 @@ public class AvroIOTest {
   @SuppressWarnings("unchecked")
   public void testWriteWithSerDeCustomDeflateCodec() throws Exception {
 AvroIO.Write write = AvroIO.write(String.class)
-.to("gs://bucket/foo/baz")
+.to("/tmp/foo/baz")
 .withCodec(CodecFactory.deflateCodec(9));
 
 assertEquals(
@@ -431,7 +431,7 @@ public class AvroIOTest {
   @SuppressWarnings("unchecked")
   public void testWriteWithSerDeCustomXZCodec() throws Exception {
 AvroIO.Write write = AvroIO.write(String.class)
-.to("gs://bucket/foo/baz")
+.to("/tmp/foo/baz")
 .withCodec(CodecFactory.xzCodec(9));
 
 assertEquals(

http://git-wip-us.apache.org/repos/asf/beam/blob/e83e5626/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java
index 432b801..2ba1797 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOTest.java
@@ -578,9 +578,9 @@ public class TextIOTest {
 
   @Test
   public void testCompressionTypeIsSet() throws Exception {
-TextIO.Read.Bound read = TextIO.Read.from("gs://bucket/test");
+TextIO.Read.Bound read = TextIO.Read.from("/tmp/test");
 assertEquals(AUTO, read.getCompressionType());
-read = TextIO.Read.from("gs://bucket/test").withCompressionType(GZIP);
+read = TextIO.Read.from("/tmp/test").withCompressionType(GZIP);
 assertEquals(GZIP, read.getCompressionType());
   }
 



[2/5] beam git commit: [BEAM-59] DataflowRunnerTests: configure FileSystems in test

2017-05-01 Thread dhalperi
[BEAM-59] DataflowRunnerTests: configure FileSystems in test

This enables the test to use gs:// URIs with the FileSystems API


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/afc39210
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/afc39210
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/afc39210

Branch: refs/heads/master
Commit: afc3921001119e5570a0f6f8e54819f817a171ca
Parents: a77ed33
Author: Dan Halperin 
Authored: Mon May 1 18:49:28 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 22:43:54 2017 -0700

--
 .../beam/runners/dataflow/DataflowPipelineTranslatorTest.java| 4 
 .../org/apache/beam/runners/dataflow/DataflowRunnerTest.java | 4 
 2 files changed, 8 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/afc39210/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
index cf0cae4..343d51b 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
@@ -66,6 +66,7 @@ import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.coders.VarIntCoder;
 import org.apache.beam.sdk.coders.VoidCoder;
 import org.apache.beam.sdk.extensions.gcp.auth.TestCredential;
+import org.apache.beam.sdk.io.FileSystems;
 import org.apache.beam.sdk.io.TextIO;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
 import org.apache.beam.sdk.options.ValueProvider;
@@ -133,6 +134,9 @@ public class DataflowPipelineTranslatorTest implements 
Serializable {
 options.setRunner(DataflowRunner.class);
 Pipeline p = Pipeline.create(options);
 
+// Enable the FileSystems API to know about gs:// URIs in this test.
+FileSystems.setDefaultConfigInWorkers(options);
+
 p.apply("ReadMyFile", TextIO.Read.from("gs://bucket/object"))
  .apply("WriteMyFile", TextIO.Write.to("gs://bucket/object"));
 DataflowRunner runner = DataflowRunner.fromOptions(options);

http://git-wip-us.apache.org/repos/asf/beam/blob/afc39210/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
index c1d3fe6..e3c884b 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
@@ -68,6 +68,7 @@ import org.apache.beam.sdk.coders.BigEndianIntegerCoder;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.extensions.gcp.auth.NoopCredentialFactory;
 import org.apache.beam.sdk.extensions.gcp.auth.TestCredential;
+import org.apache.beam.sdk.io.FileSystems;
 import org.apache.beam.sdk.io.TextIO;
 import org.apache.beam.sdk.io.TextIO.Read;
 import org.apache.beam.sdk.options.PipelineOptions;
@@ -174,6 +175,9 @@ public class DataflowRunnerTest {
 p.apply("ReadMyFile", TextIO.Read.from("gs://bucket/object"))
 .apply("WriteMyFile", TextIO.Write.to("gs://bucket/object"));
 
+// Enable the FileSystems API to know about gs:// URIs in this test.
+FileSystems.setDefaultConfigInWorkers(options);
+
 return p;
   }
 



[1/5] beam git commit: [BEAM-59] AvroIOTest: use absolute paths for display data

2017-05-01 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 87cf88ade -> e92ead58c


[BEAM-59] AvroIOTest: use absolute paths for display data

This is future-proofing against eventual conversion to FileSystems API in which 
all paths
are absolute


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/de36c6e5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/de36c6e5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/de36c6e5

Branch: refs/heads/master
Commit: de36c6e5cd3a89fce276cc455286f62df6871703
Parents: afc3921
Author: Dan Halperin 
Authored: Mon May 1 18:54:40 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 22:43:54 2017 -0700

--
 .../core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java| 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/de36c6e5/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
index 790555a..5991c96 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
@@ -552,7 +552,7 @@ public class AvroIOTest {
   @Test
   public void testWriteDisplayData() {
 AvroIO.Write write = AvroIO.write(GenericClass.class)
-.to("foo")
+.to("/foo")
 .withShardNameTemplate("-SS-of-NN-")
 .withSuffix("bar")
 .withNumShards(100)
@@ -560,7 +560,7 @@ public class AvroIOTest {
 
 DisplayData displayData = DisplayData.from(write);
 
-assertThat(displayData, hasDisplayItem("filePrefix", "foo"));
+assertThat(displayData, hasDisplayItem("filePrefix", "/foo"));
 assertThat(displayData, hasDisplayItem("shardNameTemplate", "-SS-of-NN-"));
 assertThat(displayData, hasDisplayItem("fileSuffix", "bar"));
 assertThat(displayData, hasDisplayItem("schema", GenericClass.class));



[5/5] beam git commit: This closes #2814

2017-05-01 Thread dhalperi
This closes #2814


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e92ead58
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e92ead58
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e92ead58

Branch: refs/heads/master
Commit: e92ead58cd661ff7ee952278f2983a578bdd2412
Parents: 87cf88a de36c6e
Author: Dan Halperin 
Authored: Mon May 1 22:43:57 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 22:43:57 2017 -0700

--
 .../dataflow/DataflowPipelineTranslatorTest.java|  4 
 .../beam/runners/dataflow/DataflowRunnerTest.java   |  4 
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 16 
 .../java/org/apache/beam/sdk/io/TFRecordIOTest.java |  6 +++---
 .../java/org/apache/beam/sdk/io/TextIOTest.java |  4 ++--
 5 files changed, 21 insertions(+), 13 deletions(-)
--




[4/5] beam git commit: [BEAM-59] TFRecordIOTest: cleanup

2017-05-01 Thread dhalperi
[BEAM-59] TFRecordIOTest: cleanup

1. Shrink file size slightly to reduce test time.
2. Switch from relative to absolute local file paths, because in the future
   this distinction will matter.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a77ed33e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a77ed33e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a77ed33e

Branch: refs/heads/master
Commit: a77ed33e91e83e13e4e32e03540a1c310ddbeb8c
Parents: e83e562
Author: Dan Halperin 
Authored: Mon May 1 18:46:24 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 22:43:54 2017 -0700

--
 .../src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java   | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a77ed33e/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java
--
diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java
index ae3a50d..9a9e840 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TFRecordIOTest.java
@@ -93,7 +93,7 @@ public class TFRecordIOTest {
   private static final String[] FOO_BAR_RECORDS = {"foo", "bar"};
 
   private static final Iterable<String> EMPTY = Collections.emptyList();
-  private static final Iterable<String> LARGE = makeLines(5000);
+  private static final Iterable<String> LARGE = makeLines(1000);
 
   private static Path tempFolder;
 
@@ -159,7 +159,7 @@ public class TFRecordIOTest {
   @Test
   public void testWriteDisplayData() {
 TFRecordIO.Write write = TFRecordIO.write()
-.to("foo")
+.to("/foo")
 .withSuffix("bar")
 .withShardNameTemplate("-SS-of-NN-")
 .withNumShards(100)
@@ -167,7 +167,7 @@ public class TFRecordIOTest {
 
 DisplayData displayData = DisplayData.from(write);
 
-assertThat(displayData, hasDisplayItem("filePrefix", "foo"));
+assertThat(displayData, hasDisplayItem("filePrefix", "/foo"));
 assertThat(displayData, hasDisplayItem("fileSuffix", "bar"));
 assertThat(displayData, hasDisplayItem("shardNameTemplate", "-SS-of-NN-"));
 assertThat(displayData, hasDisplayItem("numShards", 100));
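For background on what these tests exercise: the TFRecord on-disk format frames every record as a little-endian uint64 length, a masked CRC32C of the length bytes, the payload, and a masked CRC32C of the payload. A minimal framing sketch follows (assuming Java 9+ for java.util.zip.CRC32C; this is not Beam's actual TFRecordIO implementation):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32C;

public class TFRecordSketch {
    private static final long MASK_DELTA = 0xa282ead8L;

    // TFRecord "masked" CRC32C: rotate right by 15 bits, add a fixed delta (mod 2^32).
    static long maskedCrc32c(byte[] bytes) {
        CRC32C crc = new CRC32C();
        crc.update(bytes, 0, bytes.length);
        long c = crc.getValue();
        return (((c >>> 15) | (c << 17)) + MASK_DELTA) & 0xffffffffL;
    }

    // Frame one record: length (uint64 LE), masked CRC of length, data, masked CRC of data.
    static byte[] encodeRecord(byte[] data) {
        byte[] len = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
            .putLong(data.length).array();
        return ByteBuffer.allocate(8 + 4 + data.length + 4)
            .order(ByteOrder.LITTLE_ENDIAN)
            .put(len)
            .putInt((int) maskedCrc32c(len))
            .put(data)
            .putInt((int) maskedCrc32c(data))
            .array();
    }

    public static void main(String[] args) {
        byte[] rec = encodeRecord("foo".getBytes());
        System.out.println(rec.length); // 19 = 8 + 4 + 3 + 4
    }
}
```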



[GitHub] beam pull request #2816: [BEAM-775] Remove Aggregators from the Java SDK

2017-05-01 Thread dhalperi
Github user dhalperi closed the pull request at:

https://github.com/apache/beam/pull/2816


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992361#comment-15992361
 ] 

ASF GitHub Bot commented on BEAM-775:
-

Github user dhalperi closed the pull request at:

https://github.com/apache/beam/pull/2816


> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3578

2017-05-01 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992357#comment-15992357
 ] 

ASF GitHub Bot commented on BEAM-775:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2718


> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>






[2/2] beam git commit: [BEAM-775] Remove Aggregators from the Java SDK

2017-05-01 Thread lcwik
[BEAM-775] Remove Aggregators from the Java SDK

This closes #2718


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/87cf88ad
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/87cf88ad
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/87cf88ad

Branch: refs/heads/master
Commit: 87cf88ade7459722f9daf902cd76ea56a180a041
Parents: 19ae45b 4253a60
Author: Luke Cwik 
Authored: Mon May 1 22:18:20 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 22:18:20 2017 -0700

--
 .../operators/ApexGroupByKeyOperator.java   |   1 -
 .../apache/beam/runners/core/DoFnAdapters.java  |  18 --
 .../apache/beam/runners/core/DoFnRunners.java   |   6 +-
 .../GroupAlsoByWindowViaOutputBufferDoFn.java   |   1 -
 .../core/GroupAlsoByWindowViaWindowSetDoFn.java |   1 -
 .../GroupAlsoByWindowViaWindowSetNewDoFn.java   |  12 -
 .../core/LateDataDroppingDoFnRunner.java|  20 +-
 ...eBoundedSplittableProcessElementInvoker.java |   8 -
 .../beam/runners/core/ReduceFnRunner.java   |  15 +-
 .../beam/runners/core/SimpleDoFnRunner.java |  24 --
 .../beam/runners/core/SplittableParDo.java  |   8 -
 .../core/LateDataDroppingDoFnRunnerTest.java|  20 +-
 .../beam/runners/core/ReduceFnRunnerTest.java   |  69 --
 .../beam/runners/core/ReduceFnTester.java   |  38 
 .../beam/runners/direct/DirectRunner.java   |  10 +-
 .../GroupAlsoByWindowEvaluatorFactory.java  |   1 -
 .../wrappers/streaming/DoFnOperator.java|   3 +-
 runners/google-cloud-dataflow-java/pom.xml  |   2 +-
 .../SparkGroupAlsoByWindowViaWindowSet.java |   1 -
 ...SparkGroupAlsoByWindowViaOutputBufferFn.java |   1 -
 .../beam/sdk/AggregatorPipelineExtractor.java   |  10 +-
 .../apache/beam/sdk/transforms/Aggregator.java  |  24 --
 .../sdk/transforms/AggregatorRetriever.java |  45 
 .../org/apache/beam/sdk/transforms/DoFn.java| 120 +-
 .../apache/beam/sdk/transforms/DoFnTester.java  |  64 --
 .../org/apache/beam/sdk/transforms/Latest.java  |   3 +-
 .../sdk/AggregatorPipelineExtractorTest.java| 226 ---
 .../apache/beam/sdk/transforms/DoFnTest.java| 162 -
 .../beam/sdk/transforms/DoFnTesterTest.java |  47 +---
 .../beam/sdk/transforms/LatestFnTest.java   |  40 
 30 files changed, 106 insertions(+), 894 deletions(-)
--




[GitHub] beam pull request #2718: [BEAM-775] Remove Aggregators from the Java SDK

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2718




[1/2] beam git commit: Remove aggregators from DoFn contexts and internal SDK usage

2017-05-01 Thread lcwik
dowEvaluatorFactory.java
@@ -178,7 +178,6 @@ class GroupAlsoByWindowEvaluatorFactory implements TransformEvaluatorFactory {
   timerInternals,
   new OutputWindowedValueToBundle<>(bundle),
   new UnsupportedSideInputReader("GroupAlsoByWindow"),
-  droppedDueToClosedWindow,
   reduceFn,
   evaluationContext.getPipelineOptions());
 

http://git-wip-us.apache.org/repos/asf/beam/blob/4253a60f/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
--
diff --git a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
index d3d9078..62d7a9c 100644
--- a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
+++ b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
@@ -299,8 +299,7 @@ public class DoFnOperator
   doFnRunner = DoFnRunners.lateDataDroppingRunner(
   (DoFnRunner) doFnRunner,
   stepContext,
-  windowingStrategy,
-  ((GroupAlsoByWindowViaWindowSetNewDoFn) doFn).getDroppedDueToLatenessAggregator());
+  windowingStrategy);
 } else if (keyCoder != null) {
   // It is a stateful DoFn
 

http://git-wip-us.apache.org/repos/asf/beam/blob/4253a60f/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml b/runners/google-cloud-dataflow-java/pom.xml
index 4b6ca98..4c9d8d3 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170501
+
beam-master-20170501-pr2718
 
1
 
6
   

http://git-wip-us.apache.org/repos/asf/beam/blob/4253a60f/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java
--
diff --git a/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java b/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java
index 029c28a..c59e0e7 100644
--- a/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java
+++ b/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java
@@ -261,7 +261,6 @@ public class SparkGroupAlsoByWindowViaWindowSet {
   timerInternals,
   outputHolder,
-  new UnsupportedSideInputReader("GroupAlsoByWindow"),
-  droppedDueToClosedWindow,
   reduceFn,
   runtimeContext.getPipelineOptions());
 

http://git-wip-us.apache.org/repos/asf/beam/blob/4253a60f/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkGroupAlsoByWindowViaOutputBufferFn.java
--
diff --git a/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkGroupAlsoByWindowViaOutputBufferFn.java b/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkGroupAlsoByWindowViaOutputBufferFn.java
index ccc0fa3..0a00c45 100644
--- a/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkGroupAlsoByWindowViaOutputBufferFn.java
+++ b/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkGroupAlsoByWindowViaOutputBufferFn.java
@@ -107,7 +107,6 @@ public class SparkGroupAlsoByWindowViaOutputBufferFn

http://git-wip-us.apache.org/repos/asf/beam/blob/4253a60f/sdks/java/core/src/main/java/org/apache/beam/sdk/AggregatorPipelineExtractor.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/AggregatorPipelineExtractor.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/AggregatorPipelineExtractor.java
index 8804f55..eeb9b45 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/AggregatorPipelineExtractor.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/AggregatorPipelineExtractor.java
@@ -25,7 +25,6 @@ import java.util.Map;
 import org.apache.beam.sdk.Pipeline.PipelineVisitor;
 import org.apache.beam.sdk.runners.TransformHierarchy;
 import org.apache.beam.sdk.transforms.Aggregator;
-import org.apache.beam.sdk.transforms.AggregatorRetriever;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.ParDo;
 impor

[jira] [Updated] (BEAM-1641) Support synchronized processing time in Flink runner

2017-05-01 Thread Jingsong Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated BEAM-1641:
---
Fix Version/s: (was: First stable release)

> Support synchronized processing time in Flink runner
> 
>
> Key: BEAM-1641
> URL: https://issues.apache.org/jira/browse/BEAM-1641
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kenneth Knowles
>Assignee: Aljoscha Krettek
>Priority: Blocker
>
> The "continuation trigger" for a processing time trigger is a synchronized 
> processing time trigger. Today, this throws an exception in the FlinkRunner.
> Suppose the pipeline contains the following:
>  - GBK1
>  - GBK2
> When GBK1 fires due to processing time past the first element in the pane and 
> that element arrives at GBK2, it will wait until all the other upstream keys 
> have also processed and emitted corresponding data.
> Sorry for the terseness of explanation - writing quickly so I don't forget.
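A minimal model of the "synchronized" semantics described above (a hypothetical sketch, not runner code): GBK2 may process an element that GBK1 emitted at processing time t only once every upstream key's processing-time progress has reached t.

```java
import java.util.List;

public class SyncProcessingTimeSketch {
    // An element emitted by GBK1 at processing time t may fire at GBK2 only
    // once every upstream key has progressed past t; until then it is held.
    static boolean mayFire(long t, List<Long> upstreamKeyProgress) {
        return upstreamKeyProgress.stream().allMatch(p -> p >= t);
    }
}
```

In this model a runner would hold the element and re-check `mayFire` as upstream keys advance, which is the coordination the FlinkRunner currently lacks.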





beam git commit: [BEAM-2005] Fix build break, ignore test due to change in TestPipeline/FileSystems interaction

2017-05-01 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master a3ba8e71b -> 19ae45b2f


[BEAM-2005] Fix build break, ignore test due to change in 
TestPipeline/FileSystems interaction


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/19ae45b2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/19ae45b2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/19ae45b2

Branch: refs/heads/master
Commit: 19ae45b2fd8d351208205d1eb5cd0ee003fac934
Parents: a3ba8e7
Author: Luke Cwik 
Authored: Mon May 1 22:13:18 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 22:13:18 2017 -0700

--
 .../java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/19ae45b2/sdks/java/io/hdfs/src/test/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java
--
diff --git a/sdks/java/io/hdfs/src/test/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java b/sdks/java/io/hdfs/src/test/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java
index 6cb5326..a5957b5 100644
--- a/sdks/java/io/hdfs/src/test/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java
+++ b/sdks/java/io/hdfs/src/test/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java
@@ -46,6 +46,7 @@ import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hdfs.MiniDFSCluster;
 import org.junit.After;
 import org.junit.Before;
+import org.junit.Ignore;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.ExpectedException;
@@ -210,6 +211,7 @@ public class HadoopFileSystemTest {
   }
 
   @Test
+  @Ignore("TestPipeline needs a way to take in HadoopFileSystemOptions")
   public void testReadPipeline() throws Exception {
 create("testFileA", "testDataA".getBytes());
 create("testFileB", "testDataB".getBytes());



[jira] [Updated] (BEAM-2136) AvroCoderTest.testTwoClassLoaders fails on beam_PostCommit_Java_ValidatesRunner_Dataflow

2017-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2136:
--
Fix Version/s: (was: First stable release)

> AvroCoderTest.testTwoClassLoaders fails on 
> beam_PostCommit_Java_ValidatesRunner_Dataflow
> 
>
> Key: BEAM-2136
> URL: https://issues.apache.org/jira/browse/BEAM-2136
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Kenneth Knowles
>Priority: Minor
>
> Example failure:
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/3003/org.apache.beam$beam-sdks-java-core/testReport/org.apache.beam.sdk.coders/AvroCoderTest/testTwoClassLoaders/
> java.lang.NullPointerException
>   at com.google.common.io.ByteStreams.toByteArray(ByteStreams.java:165)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest$InterceptingUrlClassLoader.loadClass(AvroCoderTest.java:179)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest.testTwoClassLoaders(AvroCoderTest.java:199)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:307)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:202)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:158)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:81)
>   at 
> org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuitesInProcess(InPluginVMSurefireStarter.java:84)
>   at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1060)
>   at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChec
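The NullPointerException at the top of the trace is consistent with a class-resource lookup returning null inside the test's custom classloader. A hypothetical sketch of such an "intercepting" loader (names and structure are illustrative, not the actual AvroCoderTest code; assumes Java 9+ for InputStream.readAllBytes):

```java
import java.io.IOException;
import java.io.InputStream;

public class InterceptingLoaderSketch extends ClassLoader {
    private final String intercepted;

    public InterceptingLoaderSketch(ClassLoader parent, String intercepted) {
        super(parent);
        this.intercepted = intercepted;
    }

    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        if (!name.equals(intercepted)) {
            return super.loadClass(name); // delegate everything else
        }
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                // A null stream here is the failure mode the trace suggests:
                // the class bytes could not be found as a resource.
                throw new ClassNotFoundException("resource missing: " + resource);
            }
            byte[] bytes = in.readAllBytes();
            // Define the class again in THIS loader, so it is a distinct
            // Class object from the parent's copy of the same class.
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Loading the same class through two such loaders yields two distinct `Class` objects, which is the scenario `testTwoClassLoaders` exercises against AvroCoder.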

[jira] [Created] (BEAM-2137) Hadoop: Handle multiple Hadoop configurations within Hadoop FileSystem

2017-05-01 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2137:
---

 Summary: Hadoop: Handle multiple Hadoop configurations within 
Hadoop FileSystem
 Key: BEAM-2137
 URL: https://issues.apache.org/jira/browse/BEAM-2137
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions
Reporter: Luke Cwik


This requires adding a FileSystem implementation which multiplexes calls to 
underlying Hadoop FileSystem instances based upon using the fs.defaultFS as a 
prefix.
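One way to sketch the proposed multiplexing (illustrative only: class and method names are invented, and a real implementation would route to Beam FileSystem instances rather than strings):

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: pick a registered delegate by the URI scheme of the path. */
public class MultiplexingFileSystemSketch {
    private final Map<String, String> delegates = new HashMap<>();

    public void register(String scheme, String delegateName) {
        delegates.put(scheme, delegateName);
    }

    /** Returns the delegate responsible for the given path. */
    public String resolve(String path) {
        String scheme = URI.create(path).getScheme();
        // Scheme-less paths fall through to a "file" default.
        String delegate = delegates.get(scheme == null ? "file" : scheme);
        if (delegate == null) {
            throw new IllegalArgumentException("No filesystem registered for: " + path);
        }
        return delegate;
    }
}
```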





[jira] [Resolved] (BEAM-2030) Implement beam FileSystem's copy()

2017-05-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2030.
-
Resolution: Fixed

> Implement beam FileSystem's copy()
> --
>
> Key: BEAM-2030
> URL: https://issues.apache.org/jira/browse/BEAM-2030
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Beam's FileSystem has a copy() command, however I can't find a good analog in 
> Hadoop's FileSystem. 
> https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html
>  shows lots of copy to/from local files, but no "copy between these two 
> arbitrary paths".
> cc [~davor] [~dhalp...@google.com] did either of you have thoughts about 
> this? I don't think that it makes sense to have beam stream data from one 
> node just so it can write it back to another node. (it could be an extension 
> method, but I'd want to make it obvious that it's the inefficient version of 
> things)
> My default answer here is to throw an unimplemented exception on copy or to 
> remove it from the BFS interface altogether.
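The "inefficient version" described above amounts to streaming every byte through the local process. A sketch of that fallback copy (illustrative, not Beam code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopySketch {
    // Fallback copy: every byte passes through this process's buffer, so a
    // node copies remote data just to write it back to another remote path.
    // A server-side copy API would avoid this round trip entirely.
    public static long streamCopy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```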





[jira] [Resolved] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-05-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2005.
-
Resolution: Fixed

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, azure, s3, etc..)
> I'm investigating this now.





[jira] [Resolved] (BEAM-2032) Implement delete

2017-05-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2032.
-
Resolution: Fixed

> Implement delete
> 
>
> Key: BEAM-2032
> URL: https://issues.apache.org/jira/browse/BEAM-2032
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> This seems like the simplest method to implement, so might be a good place to 
> start.





[jira] [Resolved] (BEAM-2033) Implement ResourceIds for HadoopFileSystem

2017-05-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2033.
-
Resolution: Fixed

> Implement ResourceIds for HadoopFileSystem
> --
>
> Key: BEAM-2033
> URL: https://issues.apache.org/jira/browse/BEAM-2033
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> There's an empty class for HadoopResourceId already - we will implement it.
> This blocks most of the other implementation since we can't do much without 
> ResourceIds
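A ResourceId is essentially a wrapped, resolvable path. A hypothetical sketch of the shape such a class could take, using java.net.URI in place of Hadoop's Path (the actual HadoopResourceId is different):

```java
import java.net.URI;

/** Hypothetical sketch: a ResourceId as a resolvable wrapped URI. */
public final class ResourceIdSketch {
    private final URI uri;

    public ResourceIdSketch(URI uri) {
        this.uri = uri;
    }

    /** Resolve a child name against a directory-style ResourceId. */
    public ResourceIdSketch resolve(String other) {
        // "hdfs://host/data/" resolved with "part-0" -> "hdfs://host/data/part-0"
        return new ResourceIdSketch(uri.resolve(other));
    }

    @Override
    public String toString() {
        return uri.toString();
    }
}
```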





[jira] [Resolved] (BEAM-2070) Implement match for HadoopFileSystem

2017-05-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2070.
-
Resolution: Fixed

> Implement match for HadoopFileSystem
> 
>
> Key: BEAM-2070
> URL: https://issues.apache.org/jira/browse/BEAM-2070
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>






[jira] [Resolved] (BEAM-2031) Hadoop FileSystem needs to receive Hadoop Configuration

2017-05-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2031.
-
Resolution: Fixed

> Hadoop FileSystem needs to receive Hadoop Configuration
> ---
>
> Key: BEAM-2031
> URL: https://issues.apache.org/jira/browse/BEAM-2031
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Since Beam FileSystem objects are configured via PipelineOptions, we need to 
> pass a Hadoop Configuration through PipelineOptions. I think that's very 
> solvable, but it does seem semi-complicated.
> cc [~pei...@gmail.com] I believe you mentioned in the past that you had an 
> answer to this - is that written down anywhere?





[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992322#comment-15992322
 ] 

ASF GitHub Bot commented on BEAM-2005:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2777


> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, azure, s3, etc..)
> I'm investigating this now.





[GitHub] beam pull request #2777: [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2...

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2777




[2/2] beam git commit: [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base implementation of HadoopFileSystem.

2017-05-01 Thread lcwik
[BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base 
implementation of HadoopFileSystem.

This closes #2777


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a3ba8e71
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a3ba8e71
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a3ba8e71

Branch: refs/heads/master
Commit: a3ba8e71bf73c81f253e27e6d680e36108e68159
Parents: 7fa0064 23d16f7
Author: Luke Cwik 
Authored: Mon May 1 21:34:03 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 21:34:03 2017 -0700

--
 sdks/java/io/hdfs/pom.xml   |  44 
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 179 +-
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  27 +-
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  46 +++-
 .../io/hdfs/HadoopFileSystemRegistrarTest.java  |  39 ++-
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  | 244 +++
 6 files changed, 555 insertions(+), 24 deletions(-)
--




[1/2] beam git commit: [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base implementation of HadoopFileSystem.

2017-05-01 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 7fa00647d -> a3ba8e71b


[BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base 
implementation of HadoopFileSystem.

TODO:
* Add multiplexing FileSystem that is able to route requests based upon the 
base URI when configured for multiple file systems.
* Take a look at copy/rename again to see if we can do better than moving all 
the bytes through the local machine.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/23d16f7d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/23d16f7d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/23d16f7d

Branch: refs/heads/master
Commit: 23d16f7deb8130457d7fadab498cebc3c994ad76
Parents: 7fa0064
Author: Luke Cwik 
Authored: Fri Apr 28 19:07:59 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 21:33:16 2017 -0700

--
 sdks/java/io/hdfs/pom.xml   |  44 
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 179 +-
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  27 +-
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  46 +++-
 .../io/hdfs/HadoopFileSystemRegistrarTest.java  |  39 ++-
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  | 244 +++
 6 files changed, 555 insertions(+), 24 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/23d16f7d/sdks/java/io/hdfs/pom.xml
--
diff --git a/sdks/java/io/hdfs/pom.xml b/sdks/java/io/hdfs/pom.xml
index 46cf8cf..daa3b26 100644
--- a/sdks/java/io/hdfs/pom.xml
+++ b/sdks/java/io/hdfs/pom.xml
@@ -44,6 +44,37 @@
 
   
 
+  <properties>
+    <hadoop.version>2.7.3</hadoop.version>
+  </properties>
+
+  <dependencyManagement>
+    <dependencies>
+      <dependency>
+        <groupId>org.apache.hadoop</groupId>
+        <artifactId>hadoop-hdfs</artifactId>
+        <type>tests</type>
+        <version>${hadoop.version}</version>
+      </dependency>
+      <dependency>
+        <groupId>org.apache.hadoop</groupId>
+        <artifactId>hadoop-minicluster</artifactId>
+        <version>${hadoop.version}</version>
+      </dependency>
+    </dependencies>
+  </dependencyManagement>
+
   
 
   org.apache.beam
@@ -131,6 +162,19 @@
 
 
 
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-minicluster</artifactId>
+      <scope>test</scope>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-hdfs</artifactId>
+      <type>tests</type>
+      <scope>test</scope>
+    </dependency>
+
   org.apache.beam
   beam-runners-direct-java
   test

http://git-wip-us.apache.org/repos/asf/beam/blob/23d16f7d/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
--
diff --git a/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java b/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
index a8bdd44..154a818 100644
--- a/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
+++ b/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
@@ -17,65 +17,224 @@
  */
 package org.apache.beam.sdk.io.hdfs;
 
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.collect.ImmutableList;
 import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.nio.ByteBuffer;
+import java.nio.channels.Channels;
 import java.nio.channels.ReadableByteChannel;
+import java.nio.channels.SeekableByteChannel;
 import java.nio.channels.WritableByteChannel;
+import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
 import org.apache.beam.sdk.io.FileSystem;
 import org.apache.beam.sdk.io.fs.CreateOptions;
 import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.MatchResult.Metadata;
+import org.apache.beam.sdk.io.fs.MatchResult.Status;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileUtil;
+import org.apache.hadoop.fs.Path;
 
 /**
  * Adapts {@link org.apache.hadoop.fs.FileSystem} connectors to be used as
  * Apache Beam {@link FileSystem FileSystems}.
+ *
+ * <p>The following HDFS FileSystem(s) are known to be unsupported:
+ * <ul>
+ *   <li>FTPFileSystem: Missing seek support within FTPInputStream</li>
+ * </ul>
+ *
+ * <p>This implementation assumes that the underlying Hadoop {@link FileSystem} is seek
+ * efficient when reading. The source code for the following {@link FSInputStream} implementations
+ * (as of Hadoop 2.7.1) do provide seek implementations:
+ * <ul>
+ *   <li>HarFsInputStream</li>
+ *   <li>S3InputStream</li>
+ *   <li>DFSInputStream</li>
+ *   <li>SwiftNativeInputStream</li>
+ *   <li>NativeS3FsInputStream</li>
+ *   <li>LocalFSFileInputStream</li>
+ *   <li>NativeAzureFsInputStream</li>
+ *   <li>S3AInputStream</li>
+ * </ul>
  */
 class HadoopFileSystem extends FileSystem {
+  @VisibleForTesting
+  final org.apache.hadoop.fs.FileSystem fileSystem;
 
-  HadoopFileSystem() {}
+  HadoopFileSystem(Configuration configuration) throws IOExc
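The diff above is truncated mid-signature, but its javadoc stresses that the adapter only works when the underlying Hadoop stream supports cheap seeks. A dependency-free sketch of the seek-then-read access pattern such an adapter relies on (hypothetical helper name, with an in-memory buffer standing in for an FSDataInputStream):

```python
import io

def read_at(stream, offset, length):
    """Seek to `offset` and read `length` bytes - the random-access
    pattern a seek-efficient FSInputStream must support cheaply."""
    stream.seek(offset)
    return stream.read(length)

# An in-memory buffer stands in for a real Hadoop input stream.
data = io.BytesIO(b'hello hadoop filesystem')
assert read_at(data, 6, 6) == b'hadoop'
```

Filesystems whose streams cannot seek (the FTPFileSystem noted above) cannot satisfy this pattern, which is why they are listed as unsupported.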

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3005

2017-05-01 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #3577

2017-05-01 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Python_Verify #2077

2017-05-01 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-981) Not possible to directly submit a pipeline on spark cluster

2017-05-01 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992305#comment-15992305
 ] 

holdenk commented on BEAM-981:
--

I can take a look at this later on this week if no one else is.

> Not possible to directly submit a pipeline on spark cluster
> ---
>
> Key: BEAM-981
> URL: https://issues.apache.org/jira/browse/BEAM-981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.6.0
>Reporter: Jean-Baptiste Onofré
>Assignee: Kobi Salant
> Fix For: First stable release
>
>
> It's not possible to directly run a pipeline on the spark runner (for 
> instance using {{mvn exec:java}}. It fails with:
> {code}
> [appclient-register-master-threadpool-0] INFO 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to 
> master spark://10.200.118.197:7077...
> [shuffle-client-0] ERROR org.apache.spark.network.client.TransportClient - 
> Failed to send RPC 6813731522650020739 to /10.200.118.197:7077: 
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
> at 
> io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:820)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1101)
> at 
> io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1148)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1090)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.safeExecute(SingleThreadEventExecutor.java:451)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:401)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
> at java.lang.Thread.run(Thread.java:745)
> [appclient-register-master-threadpool-0] WARN 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect 
> to master 10.200.118.197:7077
> java.io.IOException: Failed to send RPC 6813731522650020739 to 
> /10.200.118.197:7077: java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:507)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:427)
> at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
> at 
> io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:845)
> at 
> io.netty.chann

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2598

2017-05-01 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2816: Remove aggregators from DoFn contexts and internal ...

2017-05-01 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2816

Remove aggregators from DoFn contexts and internal SDK usage

This is #2718 rebased on master to see if build is fixed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam fixup-2718

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2816.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2816


commit 4a124f6c199a3c61731d3b4e3ac6232a81e7583f
Author: Pablo 
Date:   2017-04-26T17:55:53Z

Remove aggregators from DoFn contexts and internal SDK usage




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2815: maptask_executor_runner_test: build fix

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2815




[2/2] beam git commit: This closes #2815

2017-05-01 Thread dhalperi
This closes #2815


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7fa00647
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7fa00647
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7fa00647

Branch: refs/heads/master
Commit: 7fa00647d5506281f456b4b9974ee9efc7484dba
Parents: 034565c 47569f5
Author: Dan Halperin 
Authored: Mon May 1 20:18:44 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 20:18:44 2017 -0700

--
 .../runners/portability/maptask_executor_runner_test.py  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--




[1/2] beam git commit: maptask_executor_runner_test: build fix

2017-05-01 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 034565c68 -> 7fa00647d


maptask_executor_runner_test: build fix

OutputValue was renamed to TaggedOutput in #2810, but this was missed
or merge conflicted.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/47569f56
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/47569f56
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/47569f56

Branch: refs/heads/master
Commit: 47569f56cc795a5068e8eadfcb1c34cddf9829db
Parents: 034565c
Author: Dan Halperin 
Authored: Mon May 1 20:16:03 2017 -0700
Committer: Dan Halperin 
Committed: Mon May 1 20:16:03 2017 -0700

--
 .../runners/portability/maptask_executor_runner_test.py  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/47569f56/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
--
diff --git a/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py b/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
index 6e13e73..b52c73c 100644
--- a/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
+++ b/sdks/python/apache_beam/runners/portability/maptask_executor_runner_test.py
@@ -102,7 +102,7 @@ class MapTaskExecutorRunnerTest(unittest.TestCase):
 def tee(elem, *tags):
   for tag in tags:
 if tag in elem:
-  yield beam.pvalue.OutputValue(tag, elem)
+  yield beam.pvalue.TaggedOutput(tag, elem)
 with self.create_pipeline() as p:
   xy = (p
 | 'Create' >> beam.Create(['x', 'y', 'xy'])
@@ -113,7 +113,7 @@ class MapTaskExecutorRunnerTest(unittest.TestCase):
   def test_pardo_side_and_main_outputs(self):
 def even_odd(elem):
   yield elem
-  yield beam.pvalue.OutputValue('odd' if elem % 2 else 'even', elem)
+  yield beam.pvalue.TaggedOutput('odd' if elem % 2 else 'even', elem)
 with self.create_pipeline() as p:
   ints = p | beam.Create([1, 2, 3])
   named = ints | 'named' >> beam.FlatMap(

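The `tee` helper in the diff above routes each element to every tag whose name it contains. A dependency-free sketch of the same routing logic, with plain tuples standing in for `beam.pvalue.TaggedOutput` so it can run without a Beam installation:

```python
def tee(elem, *tags):
    """Yield (tag, elem) for every tag that occurs in elem.

    Plain tuples stand in for beam.pvalue.TaggedOutput so the
    routing logic can be exercised outside a pipeline.
    """
    for tag in tags:
        if tag in elem:
            yield (tag, elem)

# Mirror the test's Create(['x', 'y', 'xy']) input: 'xy' matches
# both tags, so it is emitted once under each.
routed = [out for elem in ['x', 'y', 'xy'] for out in tee(elem, 'x', 'y')]
# routed == [('x', 'x'), ('y', 'y'), ('x', 'xy'), ('y', 'xy')]
```

In the real pipeline each yielded tag selects a separate output PCollection of the FlatMap; the rename in #2810 changed only the wrapper class name, not this behavior.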


[GitHub] beam pull request #2815: maptask_executor_runner_test: build fix

2017-05-01 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2815

maptask_executor_runner_test: build fix

OutputValue was renamed to TaggedOutput in #2810, but this was missed
or merge conflicted.

R: @robertwb @aaltay 

This is for historical purposes, will merge quickly to fix build.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam fix-py

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2815.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2815


commit 47569f56cc795a5068e8eadfcb1c34cddf9829db
Author: Dan Halperin 
Date:   2017-05-02T03:16:03Z

maptask_executor_runner_test: build fix

OutputValue was renamed to TaggedOutput in #2810, but this was missed
or merge conflicted.






[jira] [Commented] (BEAM-2136) AvroCoderTest.testTwoClassLoaders fails on beam_PostCommit_Java_ValidatesRunner_Dataflow

2017-05-01 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992124#comment-15992124
 ] 

Luke Cwik commented on BEAM-2136:
-

Yes, I marked it minor and not part of first stable release for those reasons.

> AvroCoderTest.testTwoClassLoaders fails on 
> beam_PostCommit_Java_ValidatesRunner_Dataflow
> 
>
> Key: BEAM-2136
> URL: https://issues.apache.org/jira/browse/BEAM-2136
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: First stable release
>
>
> Example failure:
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/3003/org.apache.beam$beam-sdks-java-core/testReport/org.apache.beam.sdk.coders/AvroCoderTest/testTwoClassLoaders/
> java.lang.NullPointerException
>   at com.google.common.io.ByteStreams.toByteArray(ByteStreams.java:165)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest$InterceptingUrlClassLoader.loadClass(AvroCoderTest.java:179)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest.testTwoClassLoaders(AvroCoderTest.java:199)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:307)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:202)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:158)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:81)
>   at 
> org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuitesInProcess(InPluginVMSurefireStarter.java:84)
>   at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractS

Build failed in Jenkins: beam_PostCommit_Python_Verify #2076

2017-05-01 Thread Apache Jenkins Server
See 


--
[...truncated 470.95 KB...]
test_with_http_error_that_should_not_be_retried 
(apache_beam.utils.retry_test.RetryTest) ... ok
test_with_no_retry_decorator (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_real_clock (apache_beam.utils.retry_test.RetryTest) ... ok
test_basic_test_stream (apache_beam.utils.test_stream_test.TestStreamTest) ... 
ok
test_test_stream_errors (apache_beam.utils.test_stream_test.TestStreamTest) ... 
ok
test_arithmetic (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_of (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_precision (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_str (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_arithmetic (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_of (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_precision (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_str (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_utc_timestamp (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_runtime_value_provider_keyword_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_runtime_value_provider_positional_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_set_runtime_option 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_keyword_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_positional_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_type_cast 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_equality (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_hash (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_pickle (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_timestamps (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_with_value (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_element (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_no_tag (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_tagged (apache_beam.pipeline_test.DoFnTest) ... ok
test_timestamp_param (apache_beam.pipeline_test.DoFnTest) ... ok
test_window_param (apache_beam.pipeline_test.DoFnTest) ... ok
test_attribute_setting (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_defaults (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_dir (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_flag_parsing (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_keyword_parsing (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_view_as (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_aggregator_empty_input (apache_beam.pipeline_test.PipelineTest) ... ok
test_apply_custom_transform (apache_beam.pipeline_test.PipelineTest) ... ok
test_create (apache_beam.pipeline_test.PipelineTest) ... ok
test_create_singleton_pcollection (apache_beam.pipeline_test.PipelineTest) ... 
ok
test_flatmap_builtin (apache_beam.pipeline_test.PipelineTest) ... ok
test_memory_usage (apache_beam.pipeline_test.PipelineTest) ... ok
test_metrics_in_source (apache_beam.pipeline_test.PipelineTest) ... ok
test_pipeline_as_context (apache_beam.pipeline_test.PipelineTest) ... 
:132:
 UserWarning: Using fallback coder for typehint: .
  warnings.warn('Using fallback coder for typehint: %r.' % typehint)
ok
test_read (apache_beam.pipeline_test.PipelineTest) ... ok
test_reuse_cloned_custom_transform_instance 
(apache_beam.pipeline_test.PipelineTest) ... ok
test_reuse_custom_transform_instance (apache_beam.pipeline_test.PipelineTest) 
... 
:196:
 DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  cm.exception.message,
ok
test_transform_no_super_init (apache_beam.pipeline_test.PipelineTest) ... ok
test_visit_entire_graph (apache_beam.pipeline_test.PipelineTest) ... ok
test_simple (apache_beam.pipeline_test.RunnerApiTest)
Tests serializing, deserializing, and running a simple pipeline. ... ok
test_pvalue_expected_arguments (apache_beam.pvalue_test.PValueTest) ... ok
test_append_extra_options (apache_beam.test_pipeline_test.TestPipelineTest) ... 
ok
test_append_verifier_in_extra_opt 
(apache_beam.test_pipeline_test.TestPipelineTest) ... ok
test_create_test_pipeline_options 
(apache_beam.test_pipeline_test.TestPipelineTest) ... ok
tes

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3004

2017-05-01 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2597

2017-05-01 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992113#comment-15992113
 ] 

ASF GitHub Bot commented on BEAM-775:
-

GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2744

[BEAM-775] Remove Aggregators from StatefulDoFn runner

This PR depends on PR 2718.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam remove-statefuldofn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2744.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2744


commit b04749182674e267b9ca8a169e587cfb360243e6
Author: Pablo 
Date:   2017-04-26T17:55:53Z

Remove aggregators from DoFn contexts and internal SDK usage

commit 197b8b096a8c0c938758bdeec83c30248b6fd548
Author: Pablo 
Date:   2017-04-27T00:31:20Z

Fix apex runner issue

commit f05b0b4df6d6004391e25ad1f05514e15fad2795
Author: Pablo 
Date:   2017-04-27T23:06:34Z

Addressing comments

commit a5f9236bb725a70be9c309658c28518d10dce906
Author: Pablo 
Date:   2017-05-02T01:47:11Z

Setting container ID

commit 7819139e2b72ba19dca7749b3b1e24cf91ff
Author: Pablo 
Date:   2017-04-27T16:43:41Z

Remove Aggregators from StatefulDoFn runner

commit 9376cd7e2491d6bc1cca1b6f425eb9bc098f3507
Author: Pablo 
Date:   2017-04-27T23:10:48Z

Removing Aggregator from core runner code

commit 374d92efc6549b0d951ee7cc5eeed83a51f1d55d
Author: Pablo 
Date:   2017-04-28T04:53:16Z

Remove accumulators from DoFn tester.




> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992112#comment-15992112
 ] 

ASF GitHub Bot commented on BEAM-775:
-

Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/2744


> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>






[GitHub] beam pull request #2744: [BEAM-775] Remove Aggregators from StatefulDoFn run...

2017-05-01 Thread pabloem
Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/2744




[GitHub] beam pull request #2744: [BEAM-775] Remove Aggregators from StatefulDoFn run...

2017-05-01 Thread pabloem
GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2744

[BEAM-775] Remove Aggregators from StatefulDoFn runner

This PR depends on PR 2718.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam remove-statefuldofn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2744.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2744


commit b04749182674e267b9ca8a169e587cfb360243e6
Author: Pablo 
Date:   2017-04-26T17:55:53Z

Remove aggregators from DoFn contexts and internal SDK usage

commit 197b8b096a8c0c938758bdeec83c30248b6fd548
Author: Pablo 
Date:   2017-04-27T00:31:20Z

Fix apex runner issue

commit f05b0b4df6d6004391e25ad1f05514e15fad2795
Author: Pablo 
Date:   2017-04-27T23:06:34Z

Addressing comments

commit a5f9236bb725a70be9c309658c28518d10dce906
Author: Pablo 
Date:   2017-05-02T01:47:11Z

Setting container ID

commit 7819139e2b72ba19dca7749b3b1e24cf91ff
Author: Pablo 
Date:   2017-04-27T16:43:41Z

Remove Aggregators from StatefulDoFn runner

commit 9376cd7e2491d6bc1cca1b6f425eb9bc098f3507
Author: Pablo 
Date:   2017-04-27T23:10:48Z

Removing Aggregator from core runner code

commit 374d92efc6549b0d951ee7cc5eeed83a51f1d55d
Author: Pablo 
Date:   2017-04-28T04:53:16Z

Remove accumulators from DoFn tester.






[jira] [Commented] (BEAM-1641) Support synchronized processing time in Flink runner

2017-05-01 Thread Jingsong Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992110#comment-15992110
 ] 

Jingsong Lee commented on BEAM-1641:


There are some differences between the handling of event time and 
synchronized processing time in {{DirectRunner}}. The Source just emits 
{{BoundedWindow.TIMESTAMP_MAX_VALUE}} as the synchronizedProcessingTime, and 
downstream operators use {{min(clock.now(), 
synchronizedProcessingInputWatermark.get())}} to generate the 
synchronizedProcessingTime.
But I think, from a fundamental point of view, ingestion time and synchronized 
processing time produce almost the same effect. So I think we can use 
ingestion time and let Flink track ingestion and event time at the same time.

> Support synchronized processing time in Flink runner
> 
>
> Key: BEAM-1641
> URL: https://issues.apache.org/jira/browse/BEAM-1641
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kenneth Knowles
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: First stable release
>
>
> The "continuation trigger" for a processing time trigger is a synchronized 
> processing time trigger. Today, this throws an exception in the FlinkRunner.
> The supports the following:
>  - GBK1
>  - GBK2
> When GBK1 fires due to processing time past the first element in the pane and 
> that element arrives at GBK2, it will wait until all the other upstream keys 
> have also processed and emitted corresponding data.
> Sorry for the terseness of explanation - writing quickly so I don't forget.





[jira] [Updated] (BEAM-2136) AvroCoderTest.testTwoClassLoaders fails on beam_PostCommit_Java_ValidatesRunner_Dataflow

2017-05-01 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2136:
--
Fix Version/s: First stable release

> AvroCoderTest.testTwoClassLoaders fails on 
> beam_PostCommit_Java_ValidatesRunner_Dataflow
> 
>
> Key: BEAM-2136
> URL: https://issues.apache.org/jira/browse/BEAM-2136
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: First stable release
>
>
> Example failure:
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/3003/org.apache.beam$beam-sdks-java-core/testReport/org.apache.beam.sdk.coders/AvroCoderTest/testTwoClassLoaders/
> java.lang.NullPointerException
>   at com.google.common.io.ByteStreams.toByteArray(ByteStreams.java:165)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest$InterceptingUrlClassLoader.loadClass(AvroCoderTest.java:179)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest.testTwoClassLoaders(AvroCoderTest.java:199)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:307)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:202)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:158)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:81)
>   at 
> org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuitesInProcess(InPluginVMSurefireStarter.java:84)
>   at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1060)
>   at 
> org.apache.maven.plugin.surefire.AbstractSurefir

[jira] [Commented] (BEAM-2136) AvroCoderTest.testTwoClassLoaders fails on beam_PostCommit_Java_ValidatesRunner_Dataflow

2017-05-01 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992087#comment-15992087
 ] 

Kenneth Knowles commented on BEAM-2136:
---

I presume you marked this minor because this isn't actually a scenario that 
occurs in Dataflow? As long as the other tests pass and nothing else 
catastrophic is happening, I agree and will remove it from FSR.

> AvroCoderTest.testTwoClassLoaders fails on 
> beam_PostCommit_Java_ValidatesRunner_Dataflow
> 
>
> Key: BEAM-2136
> URL: https://issues.apache.org/jira/browse/BEAM-2136
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: First stable release
>
>
> Example failure:
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/3003/org.apache.beam$beam-sdks-java-core/testReport/org.apache.beam.sdk.coders/AvroCoderTest/testTwoClassLoaders/
> java.lang.NullPointerException
>   at com.google.common.io.ByteStreams.toByteArray(ByteStreams.java:165)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest$InterceptingUrlClassLoader.loadClass(AvroCoderTest.java:179)
>   at 
> org.apache.beam.sdk.coders.AvroCoderTest.testTwoClassLoaders(AvroCoderTest.java:199)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:307)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:202)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:158)
>   at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:81)
>   at 
> org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuites
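For reference, the NullPointerException at `ByteStreams.toByteArray` above is the shape of failure you get when a resource-stream lookup silently returns null and the bytes are read without a check. A minimal defensive sketch of that pattern — written in Python for brevity, with a hypothetical `opener` standing in for `ClassLoader.getResourceAsStream`; this is an illustration of the failure mode, not the actual test's fix:

```python
def read_class_bytes(opener, resource_name):
    """Read raw bytes for `resource_name`, failing loudly if it is absent.

    `opener` is a stand-in for ClassLoader.getResourceAsStream: it returns
    a binary stream, or None when the resource does not exist. Reading from
    the result without this None check is the analogue of the NPE above.
    """
    stream = opener(resource_name)
    if stream is None:
        raise FileNotFoundError("resource not found: %s" % resource_name)
    with stream:
        return stream.read()
```

The point of the sketch is only that the lookup and the read are separate steps, and the failure should be reported at the lookup, not as a null dereference inside the byte-copy helper.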

Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3576

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Fixes javadoc of TextIO to not point to AvroIO

[kirpichov] Moves AvroSink to upper level

[kirpichov] Removes AvroIO.Read.Bound

[kirpichov] Adds AvroIO.readGenericRecords()

[kirpichov] Converts AvroIO.Read to AutoValue

[kirpichov] Removes AvroIO.Write.Bound

[kirpichov] Moves AvroIO.Read.withSchema into read()

[kirpichov] Converts AvroIO.Write to AutoValue; adds writeGenericRecords()

[kirpichov] Moves AvroIO.write().withSchema into write()

[kirpichov] Scattered minor improvements per review comments

--
[...truncated 2.12 MB...]
for result in results:
  File 
"
 line 105, in tee
yield beam.pvalue.OutputValue(tag, elem)
AttributeError: 'module' object has no attribute 'OutputValue' [while running 
'FlatMap(tee)/FlatMap(tee)']
 >> begin captured logging << 
root: INFO: Running [(u'Create/Read', 
WorkerRead(source=SourceBundle(weight=1.0, 
source=, 
start_position=None, stop_position=None))), (u'FlatMap(tee)/FlatMap(tee)', 
WorkerDoFn(output_tags=['out', 'out_x', 'out_y'], input=(0, 0))), 
(u'x/WindowInto(WindowIntoFn)', WorkerDoFn(output_tags=['out'], input=(1, 1))), 
(u'x/ToVoidKey', WorkerDoFn(output_tags=['out'], input=(2, 0))), 
(u'x/Group/pair_with_1', WorkerDoFn(output_tags=['out'], input=(3, 0))), 
(u'x/Group/Flatten/Write', WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(4, 0))), (u'y/WindowInto(WindowIntoFn)', 
WorkerDoFn(output_tags=['out'], input=(1, 2))), (u'y/ToVoidKey', 
WorkerDoFn(output_tags=['out'], input=(6, 0))), (u'y/Group/pair_with_1', 
WorkerDoFn(output_tags=['out'], input=(7, 0))), (u'y/Group/Flatten/Write', 
WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(8, 0)))]
root: DEBUG: Starting op 9 
root: DEBUG: Starting op 8 
root: DEBUG: Starting op 7 
root: DEBUG: Starting op 6 
root: DEBUG: Starting op 5 
root: DEBUG: Starting op 4 
root: DEBUG: Starting op 3 
root: DEBUG: Starting op 2 
root: DEBUG: Starting op 1 
root: DEBUG: Starting op 0 , 
start_position=None, stop_position=None)>
- >> end captured logging << -

==
FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
--
Traceback (most recent call last):
  File 
"
 line 41, in test_using_slow_impl
import apache_beam.coders.stream
AssertionError: ImportError not raised

--
Ran 1198 tests in 285.604s

FAILED (failures=1, errors=2, skipped=15)
Test failed: 
error: Test failed: 
ERROR: InvocationError: 
'
 setup.py test'
___ summary 
  docs: commands succeeded
  lint: commands succeeded
ERROR:   py27: commands failed
ERROR:   py27cython: commands failed
ERROR:   py27gcp: commands failed
2017-05-02T02:47:14.441 [ERROR] Command execution failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit 
value: 1)
at 
org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at 
org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:764)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:711)
at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.
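The `AttributeError: 'module' object has no attribute 'OutputValue'` in the truncated log above lines up with the rename of `OutputValue` to `TaggedOutput` (the change list for beam_PostCommit_Python_Verify #2074 later in this digest includes "Rename OutputValue to TaggedOutput"). A hedged sketch of a compatibility lookup that tolerates either name — `resolve_tagged_output` is a hypothetical helper, not a Beam API; only the two class names come from the logs:

```python
def resolve_tagged_output(pvalue_module):
    """Return the tagged-output class under whichever name the module exports.

    Tries the new name first, then the old one, so code written against
    either side of the rename keeps working.
    """
    for name in ("TaggedOutput", "OutputValue"):
        cls = getattr(pvalue_module, name, None)
        if cls is not None:
            return cls
    raise AttributeError(
        "module %r exports neither TaggedOutput nor OutputValue"
        % getattr(pvalue_module, "__name__", pvalue_module))
```

A shim like this is only a stopgap for mixed-version environments; the real fix is updating callers to the new name in lockstep with the SDK.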

[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992024#comment-15992024
 ] 

ASF GitHub Bot commented on BEAM-775:
-

GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2718

[BEAM-775] Remove Aggregators from the Java SDK

Removing Aggregators from internal SDK usage and DoFn contexts so users cannot 
create them.
Some failures in Apex code remain. All else seems to be `mvn clean 
verify`ing well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam rmv-agg-sdk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2718.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2718


commit b04749182674e267b9ca8a169e587cfb360243e6
Author: Pablo 
Date:   2017-04-26T17:55:53Z

Remove aggregators from DoFn contexts and internal SDK usage

commit 197b8b096a8c0c938758bdeec83c30248b6fd548
Author: Pablo 
Date:   2017-04-27T00:31:20Z

Fix apex runner issue

commit f05b0b4df6d6004391e25ad1f05514e15fad2795
Author: Pablo 
Date:   2017-04-27T23:06:34Z

Addressing comments

commit a5f9236bb725a70be9c309658c28518d10dce906
Author: Pablo 
Date:   2017-05-02T01:47:11Z

Setting container ID




> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992023#comment-15992023
 ] 

ASF GitHub Bot commented on BEAM-775:
-

Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/2718


> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>






[GitHub] beam pull request #2718: [BEAM-775] Remove Aggregators from the Java SDK

2017-05-01 Thread pabloem
Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/2718


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2718: [BEAM-775] Remove Aggregators from the Java SDK

2017-05-01 Thread pabloem
GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2718

[BEAM-775] Remove Aggregators from the Java SDK

Removing Aggregators from internal SDK usage and DoFn contexts so users cannot 
create them.
Some failures in Apex code remain. All else seems to be `mvn clean 
verify`ing well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam rmv-agg-sdk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2718.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2718


commit b04749182674e267b9ca8a169e587cfb360243e6
Author: Pablo 
Date:   2017-04-26T17:55:53Z

Remove aggregators from DoFn contexts and internal SDK usage

commit 197b8b096a8c0c938758bdeec83c30248b6fd548
Author: Pablo 
Date:   2017-04-27T00:31:20Z

Fix apex runner issue

commit f05b0b4df6d6004391e25ad1f05514e15fad2795
Author: Pablo 
Date:   2017-04-27T23:06:34Z

Addressing comments

commit a5f9236bb725a70be9c309658c28518d10dce906
Author: Pablo 
Date:   2017-05-02T01:47:11Z

Setting container ID






Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3575

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions

--
[...truncated 2.13 MB...]
for result in results:
  File 
"
 line 105, in tee
yield beam.pvalue.OutputValue(tag, elem)
AttributeError: 'module' object has no attribute 'OutputValue' [while running 
'FlatMap(tee)/FlatMap(tee)']
 >> begin captured logging << 
root: INFO: Running [(u'Create/Read', 
WorkerRead(source=SourceBundle(weight=1.0, 
source=, 
start_position=None, stop_position=None))), (u'FlatMap(tee)/FlatMap(tee)', 
WorkerDoFn(output_tags=['out', 'out_y', 'out_x'], input=(0, 0))), 
(u'x/WindowInto(WindowIntoFn)', WorkerDoFn(output_tags=['out'], input=(1, 2))), 
(u'x/ToVoidKey', WorkerDoFn(output_tags=['out'], input=(2, 0))), 
(u'x/Group/pair_with_1', WorkerDoFn(output_tags=['out'], input=(3, 0))), 
(u'x/Group/Flatten/Write', WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(4, 0))), (u'y/WindowInto(WindowIntoFn)', 
WorkerDoFn(output_tags=['out'], input=(1, 1))), (u'y/ToVoidKey', 
WorkerDoFn(output_tags=['out'], input=(6, 0))), (u'y/Group/pair_with_1', 
WorkerDoFn(output_tags=['out'], input=(7, 0))), (u'y/Group/Flatten/Write', 
WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(8, 0)))]
root: DEBUG: Starting op 9 
root: DEBUG: Starting op 8 
root: DEBUG: Starting op 7 
root: DEBUG: Starting op 6 
root: DEBUG: Starting op 5 
root: DEBUG: Starting op 4 
root: DEBUG: Starting op 3 
root: DEBUG: Starting op 2 
root: DEBUG: Starting op 1 
root: DEBUG: Starting op 0 , 
start_position=None, stop_position=None)>
- >> end captured logging << -

==
FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
--
Traceback (most recent call last):
  File 
"
 line 41, in test_using_slow_impl
import apache_beam.coders.stream
AssertionError: ImportError not raised

--
Ran 1198 tests in 320.951s

FAILED (failures=1, errors=2, skipped=15)
Test failed: 
error: Test failed: 
ERROR: InvocationError: 
'
 setup.py test'
___ summary 
  docs: commands succeeded
  lint: commands succeeded
ERROR:   py27: commands failed
ERROR:   py27cython: commands failed
ERROR:   py27gcp: commands failed
2017-05-02T02:23:00.796 [ERROR] Command execution failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit 
value: 1)
at 
org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at 
org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:764)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:711)
at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at 
org.jvnet.hudson.maven3.launcher.Maven33Launcher.main(Maven33Launcher.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMeth
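The recurring `test_using_slow_impl` failure above asserts that importing the compiled `apache_beam.coders.stream` extension raises ImportError in the pure-Python tox environment; "AssertionError: ImportError not raised" means the compiled module was unexpectedly importable there. The test's pattern can be sketched like this — the module name below is a stand-in chosen to be absent, not Beam's real module:

```python
import unittest


class SlowImplSketch(unittest.TestCase):
    # Sketch of the failing test's pattern: assert that a compiled extension
    # module is *absent* from the current environment, i.e. that the slow
    # pure-Python implementation is actually the one in use.
    def test_compiled_module_absent(self):
        with self.assertRaises(ImportError):
            import nonexistent_extension_module_xyz  # noqa: F401
```

When a test like this fails, the usual culprit is a stale compiled artifact (a leftover `.so`/`.pyd` from a Cython build) on the path of an environment that was supposed to be pure Python.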

Build failed in Jenkins: beam_PostCommit_Python_Verify #2075

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Fixes javadoc of TextIO to not point to AvroIO

[kirpichov] Moves AvroSink to upper level

[kirpichov] Removes AvroIO.Read.Bound

[kirpichov] Adds AvroIO.readGenericRecords()

[kirpichov] Converts AvroIO.Read to AutoValue

[kirpichov] Removes AvroIO.Write.Bound

[kirpichov] Moves AvroIO.Read.withSchema into read()

[kirpichov] Converts AvroIO.Write to AutoValue; adds writeGenericRecords()

[kirpichov] Moves AvroIO.write().withSchema into write()

[kirpichov] Scattered minor improvements per review comments

--
[...truncated 471.81 KB...]
test_with_http_error_that_should_not_be_retried 
(apache_beam.utils.retry_test.RetryTest) ... ok
test_with_no_retry_decorator (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_real_clock (apache_beam.utils.retry_test.RetryTest) ... ok
test_basic_test_stream (apache_beam.utils.test_stream_test.TestStreamTest) ... 
ok
test_test_stream_errors (apache_beam.utils.test_stream_test.TestStreamTest) ... 
ok
test_arithmetic (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_of (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_precision (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_str (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_arithmetic (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_of (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_precision (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_str (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_utc_timestamp (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_runtime_value_provider_keyword_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_runtime_value_provider_positional_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_set_runtime_option 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_keyword_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_positional_argument 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_type_cast 
(apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_equality (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_hash (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_pickle (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_timestamps (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_with_value (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_element (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_no_tag (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_tagged (apache_beam.pipeline_test.DoFnTest) ... ok
test_timestamp_param (apache_beam.pipeline_test.DoFnTest) ... ok
test_window_param (apache_beam.pipeline_test.DoFnTest) ... ok
test_attribute_setting (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_defaults (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_dir (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_flag_parsing (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_keyword_parsing (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_view_as (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_aggregator_empty_input (apache_beam.pipeline_test.PipelineTest) ... ok
test_apply_custom_transform (apache_beam.pipeline_test.PipelineTest) ... ok
test_create (apache_beam.pipeline_test.PipelineTest) ... ok
test_create_singleton_pcollection (apache_beam.pipeline_test.PipelineTest) ... 
ok
test_flatmap_builtin (apache_beam.pipeline_test.PipelineTest) ... ok
test_memory_usage (apache_beam.pipeline_test.PipelineTest) ... ok
test_metrics_in_source (apache_beam.pipeline_test.PipelineTest) ... ok
test_pipeline_as_context (apache_beam.pipeline_test.PipelineTest) ... 
:132:
 UserWarning: Using fallback coder for typehint: .
  warnings.warn('Using fallback coder for typehint: %r.' % typehint)
ok
test_read (apache_beam.pipeline_test.PipelineTest) ... ok
test_reuse_cloned_custom_transform_instance 
(apache_beam.pipeline_test.PipelineTest) ... ok
test_reuse_custom_transform_instance (apache_beam.pipeline_test.PipelineTest) 
... 
:196:
 DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  cm.exception.message,
ok
test_transform_no_super_init (apache_beam.pipeline_test.PipelineTest) ...

[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991974#comment-15991974
 ] 

ASF GitHub Bot commented on BEAM-59:


GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2814

[BEAM-59] Minor changes that can be separated out from big PR

All these are minor changes, correct at HEAD, and standalone from #2779

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam test-cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2814.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2814


commit dbfe24b408c0ebeb003804676484214640ea10df
Author: Dan Halperin 
Date:   2017-05-02T01:43:23Z

[BEAM-59] Core tests: stop using gs:// paths

Local filesystem is just as good an example, and in the
future these may break because GCS filesystem is not present
in sdks/java/core.

commit 6b0fa09eb45dfe51d9d93112deea79ff6578e167
Author: Dan Halperin 
Date:   2017-05-02T01:46:24Z

[BEAM-59] TFRecordIOTest: cleanup

1. Shrink file size slightly to reduce test time.
2. Switch from relative to absolute local file paths, because in the future
   this distinction will matter.

commit 7a02eab9f70f962499101c039a78bcafc2725d3e
Author: Dan Halperin 
Date:   2017-05-02T01:49:28Z

[BEAM-59] DataflowRunnerTests: configure FileSystems in test

This enables the test to use gs:// URIs with the FileSystems API

commit 5ac3cff9b774d1e0c3ce7ee91bfd8aa23cd2dc27
Author: Dan Halperin 
Date:   2017-05-02T01:54:40Z

[BEAM-59] AvroIOTest: use absolute paths for display data

This is future-proofing against the eventual conversion to the FileSystems 
API, in which all paths are absolute.




> Switch from IOChannelFactory to FileSystems
> ---
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs
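The goals listed above (per-source/per-sink credentials, user-registered backends like 's3://' or 'hdfs://') amount to replacing one global factory with a scheme-to-filesystem registry. A minimal sketch of that shape — not Beam's actual FileSystems API, just an illustration of the design direction, with all names hypothetical:

```python
class FileSystemRegistry:
    """Map URI schemes to independently configured filesystem factories."""

    def __init__(self):
        self._schemes = {}

    def register(self, scheme, factory):
        """Register a factory for a scheme (e.g. 'gs', 'hdfs', 's3').

        Each factory carries its own configuration (credentials, etc.),
        so two sources using the same scheme need not share settings.
        """
        self._schemes[scheme.lower()] = factory

    def for_path(self, path):
        """Resolve a path like 'gs://bucket/obj' to a filesystem instance."""
        scheme = path.split("://", 1)[0] if "://" in path else "file"
        try:
            return self._schemes[scheme]()
        except KeyError:
            raise ValueError("no filesystem registered for scheme %r" % scheme)
```

Dispatching on the scheme at the call site is what lets third-party backends plug in without touching core code, which is the "supported APIs" concern in the issue description.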





Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3574

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[lcwik] Copy CloudObject to the Dataflow Module

--
[...truncated 2.12 MB...]
for result in results:
  File 
"
 line 105, in tee
yield beam.pvalue.OutputValue(tag, elem)
AttributeError: 'module' object has no attribute 'OutputValue' [while running 
'FlatMap(tee)/FlatMap(tee)']
 >> begin captured logging << 
root: INFO: Running [(u'Create/Read', 
WorkerRead(source=SourceBundle(weight=1.0, 
source=, 
start_position=None, stop_position=None))), (u'FlatMap(tee)/FlatMap(tee)', 
WorkerDoFn(output_tags=['out', 'out_y', 'out_x'], input=(0, 0))), 
(u'x/WindowInto(WindowIntoFn)', WorkerDoFn(output_tags=['out'], input=(1, 2))), 
(u'x/ToVoidKey', WorkerDoFn(output_tags=['out'], input=(2, 0))), 
(u'x/Group/pair_with_1', WorkerDoFn(output_tags=['out'], input=(3, 0))), 
(u'x/Group/Flatten/Write', WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(4, 0))), (u'y/WindowInto(WindowIntoFn)', 
WorkerDoFn(output_tags=['out'], input=(1, 1))), (u'y/ToVoidKey', 
WorkerDoFn(output_tags=['out'], input=(6, 0))), (u'y/Group/pair_with_1', 
WorkerDoFn(output_tags=['out'], input=(7, 0))), (u'y/Group/Flatten/Write', 
WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(8, 0)))]
root: DEBUG: Starting op 9 
root: DEBUG: Starting op 8 
root: DEBUG: Starting op 7 
root: DEBUG: Starting op 6 
root: DEBUG: Starting op 5 
root: DEBUG: Starting op 4 
root: DEBUG: Starting op 3 
root: DEBUG: Starting op 2 
root: DEBUG: Starting op 1 
root: DEBUG: Starting op 0 , 
start_position=None, stop_position=None)>
- >> end captured logging << -

==
FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
--
Traceback (most recent call last):
  File 
"
 line 41, in test_using_slow_impl
import apache_beam.coders.stream
AssertionError: ImportError not raised

--
Ran 1198 tests in 285.196s

FAILED (failures=1, errors=2, skipped=15)
Test failed: 
error: Test failed: 
ERROR: InvocationError: 
'
 setup.py test'
___ summary 
  docs: commands succeeded
  lint: commands succeeded
ERROR:   py27: commands failed
ERROR:   py27cython: commands failed
ERROR:   py27gcp: commands failed
2017-05-02T01:56:25.086 [ERROR] Command execution failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit 
value: 1)
at 
org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at 
org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:764)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:711)
at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at 
org.jvnet.hudson.maven3.launcher.Maven33Launcher.main(Maven33Launcher.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(Delegat

[GitHub] beam pull request #2814: [BEAM-59] Minor changes that can be separated out f...

2017-05-01 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2814

[BEAM-59] Minor changes that can be separated out from big PR

All these are minor changes, correct at HEAD, and standalone from #2779

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam test-cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2814.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2814


commit dbfe24b408c0ebeb003804676484214640ea10df
Author: Dan Halperin 
Date:   2017-05-02T01:43:23Z

[BEAM-59] Core tests: stop using gs:// paths

The local filesystem is just as good an example, and in the
future these tests may break because the GCS filesystem is not
present in sdks/java/core.

commit 6b0fa09eb45dfe51d9d93112deea79ff6578e167
Author: Dan Halperin 
Date:   2017-05-02T01:46:24Z

[BEAM-59] TFRecordIOTest: cleanup

1. Shrink file size slightly to reduce test time.
2. Switch from relative to absolute local file paths, because in the future
   this distinction will matter.
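The second point above can be illustrated with java.nio: resolving a relative test path against the working directory yields the absolute form that the FileSystems API expects. A minimal sketch (the path names are illustrative, not the ones TFRecordIOTest actually uses):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class AbsolutePathDemo {
    public static void main(String[] args) {
        // A relative path like the ones the test previously passed around.
        Path relative = Paths.get("target", "records.tfrecord");
        // Resolved against the current working directory; under the
        // FileSystems API the distinction between relative and absolute
        // local paths will matter.
        Path absolute = relative.toAbsolutePath();
        System.out.println(absolute.isAbsolute()); // prints "true"
    }
}
```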

commit 7a02eab9f70f962499101c039a78bcafc2725d3e
Author: Dan Halperin 
Date:   2017-05-02T01:49:28Z

[BEAM-59] DataflowRunnerTests: configure FileSystems in test

This enables the test to use gs:// URIs with the FileSystems API

commit 5ac3cff9b774d1e0c3ce7ee91bfd8aa23cd2dc27
Author: Dan Halperin 
Date:   2017-05-02T01:54:40Z

[BEAM-59] AvroIOTest: use absolute paths for display data

This is future-proofing against the eventual conversion to the
FileSystems API, in which all paths are absolute.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3572

2017-05-01 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #2074

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Rename OutputValue to TaggedOutput.

[lcwik] Copy CloudObject to the Dataflow Module

[lcwik] [BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions

--
[...truncated 473.24 KB...]
test_with_http_error_that_should_be_retried (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_http_error_that_should_not_be_retried (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_no_retry_decorator (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_real_clock (apache_beam.utils.retry_test.RetryTest) ... ok
test_basic_test_stream (apache_beam.utils.test_stream_test.TestStreamTest) ... ok
test_test_stream_errors (apache_beam.utils.test_stream_test.TestStreamTest) ... ok
test_arithmetic (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_of (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_precision (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_str (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_arithmetic (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_of (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_precision (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_str (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_utc_timestamp (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_runtime_value_provider_keyword_argument (apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_runtime_value_provider_positional_argument (apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_set_runtime_option (apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_keyword_argument (apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_positional_argument (apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_static_value_provider_type_cast (apache_beam.utils.value_provider_test.ValueProviderTests) ... ok
test_equality (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_hash (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_pickle (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_timestamps (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_with_value (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_element (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_no_tag (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_tagged (apache_beam.pipeline_test.DoFnTest) ... ok
test_timestamp_param (apache_beam.pipeline_test.DoFnTest) ... ok
test_window_param (apache_beam.pipeline_test.DoFnTest) ... ok
test_attribute_setting (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_defaults (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_dir (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_flag_parsing (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_keyword_parsing (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_view_as (apache_beam.pipeline_test.PipelineOptionsTest) ... ok
test_aggregator_empty_input (apache_beam.pipeline_test.PipelineTest) ... ok
test_apply_custom_transform (apache_beam.pipeline_test.PipelineTest) ... ok
test_create (apache_beam.pipeline_test.PipelineTest) ... ok
test_create_singleton_pcollection (apache_beam.pipeline_test.PipelineTest) ... ok
test_flatmap_builtin (apache_beam.pipeline_test.PipelineTest) ... ok
test_memory_usage (apache_beam.pipeline_test.PipelineTest) ... ok
test_metrics_in_source (apache_beam.pipeline_test.PipelineTest) ... ok
test_pipeline_as_context (apache_beam.pipeline_test.PipelineTest) ... :132: UserWarning: Using fallback coder for typehint: .
  warnings.warn('Using fallback coder for typehint: %r.' % typehint)
ok
test_read (apache_beam.pipeline_test.PipelineTest) ... ok
test_reuse_cloned_custom_transform_instance (apache_beam.pipeline_test.PipelineTest) ... ok
test_reuse_custom_transform_instance (apache_beam.pipeline_test.PipelineTest) ... :196: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  cm.exception.message,
ok
test_transform_no_super_init (apache_beam.pipeline_test.PipelineTest) ... ok
test_visit_entire_graph (apache_beam.pipeline_test.PipelineTest) ... ok
test_simple (apache_beam.pipeline_test.RunnerApiTest)
Tests serializing, deserializing, and running a simple pipeline. ... ok
test_pvalue_expected_arguments (apache_beam.pvalue_test

[jira] [Commented] (BEAM-1402) Make TextIO and AvroIO use best-practice types.

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991962#comment-15991962
 ] 

ASF GitHub Bot commented on BEAM-1402:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2778


> Make TextIO and AvroIO use best-practice types.
> ---
>
> Key: BEAM-1402
> URL: https://issues.apache.org/jira/browse/BEAM-1402
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>  Labels: backward-incompatible, starter
> Fix For: First stable release
>
>
> Replace static Read/Write classes with type-instantiated classes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[08/11] beam git commit: Converts AvroIO.Write to AutoValue; adds writeGenericRecords()

2017-05-01 Thread jkff
Converts AvroIO.Write to AutoValue; adds writeGenericRecords()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e0d74750
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e0d74750
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e0d74750

Branch: refs/heads/master
Commit: e0d74750da73658a067e7522f18c23c5e622fb2f
Parents: abb4916
Author: Eugene Kirpichov 
Authored: Fri Apr 28 19:21:15 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../beam/runners/spark/io/AvroPipelineTest.java |   2 +-
 .../java/org/apache/beam/sdk/io/AvroIO.java | 355 +--
 .../java/org/apache/beam/sdk/io/AvroIOTest.java |  19 +-
 .../apache/beam/sdk/io/AvroIOTransformTest.java |   4 +-
 4 files changed, 105 insertions(+), 275 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e0d74750/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
index c58d81e..7188dc5 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
@@ -75,7 +75,7 @@ public class AvroPipelineTest {
 Pipeline p = pipelineRule.createPipeline();
 PCollection input = p.apply(
 AvroIO.readGenericRecords(schema).from(inputFile.getAbsolutePath()));
-
input.apply(AvroIO.write().to(outputDir.getAbsolutePath()).withSchema(schema));
+
input.apply(AvroIO.writeGenericRecords(schema).to(outputDir.getAbsolutePath()));
 p.run().waitUntilFinish();
 
 List records = readGenericFile();

http://git-wip-us.apache.org/repos/asf/beam/blob/e0d74750/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 08fc8a9..8cdd4e7 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -154,7 +154,30 @@ public class AvroIO {
* pattern).
*/
   public static  Write write() {
-return new Write<>(null);
+return new AutoValue_AvroIO_Write.Builder()
+.setFilenameSuffix("")
+.setNumShards(0)
+.setShardTemplate(Write.DEFAULT_SHARD_TEMPLATE)
+.setCodec(Write.DEFAULT_CODEC)
+.setMetadata(ImmutableMap.of())
+.setWindowedWrites(false)
+.build();
+  }
+
+  /** Writes Avro records of the specified schema. */
+  public static Write writeGenericRecords(Schema schema) {
+return AvroIO.write()
+.toBuilder()
+.setRecordClass(GenericRecord.class)
+.setSchema(schema)
+.build();
+  }
+
+  /**
+   * Like {@link #writeGenericRecords(Schema)} but the schema is specified as a JSON-encoded string.
+   */
+  public static Write writeGenericRecords(String schema) {
+return writeGenericRecords(new Schema.Parser().parse(schema));
   }
 
   /** Implementation of {@link #read}. */
@@ -229,7 +252,8 @@ public class AvroIO {
   /
 
   /** Implementation of {@link #write}. */
-  public static class Write extends PTransform, PDone> {
+  @AutoValue
+  public abstract static class Write extends PTransform, PDone> {
 /**
  * A {@link PTransform} that writes a bounded {@link PCollection} to an 
Avro file (or
  * multiple Avro files matching a sharding pattern).
@@ -242,80 +266,38 @@ public class AvroIO {
 // This should be a multiple of 4 to not get a partial encoded byte.
 private static final int METADATA_BYTES_MAX_LENGTH = 40;
 
-/** The filename to write to. */
-@Nullable
-final String filenamePrefix;
-/** Suffix to use for each filename. */
-final String filenameSuffix;
-/** Requested number of shards. 0 for automatic. */
-final int numShards;
-/** Shard template string. */
-final String shardTemplate;
-/** The class type of the records. */
-final Class type;
-/** The schema of the output file. */
-@Nullable
-final Schema schema;
-final boolean windowedWrites;
-FileBasedSink.FilenamePolicy filenamePolicy;
-
+@Nullable abstract String getFilenamePrefix();
+abstract String getFilenameSuffix();
+abstract int getNumShards();
+abstract String getShardTemplate();
+abstract Class getRecordClass();
+@Nullable abstract
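The hunk above replaces AvroIO.Write's hand-written fields with AutoValue-generated properties seeded from a builder. The shape of that pattern can be sketched in self-contained plain Java (hand-rolled here because AutoValue generates this code; `WriteConfig` and its fields are illustrative stand-ins, not Beam's API):

```java
// Illustrative stand-in for an AutoValue'd transform config:
// immutable fields, a builder that carries the defaults, and
// toBuilder() so fluent setters return modified copies.
final class WriteConfig {
    private final String filenameSuffix;
    private final int numShards;
    private final boolean windowedWrites;

    private WriteConfig(String filenameSuffix, int numShards, boolean windowedWrites) {
        this.filenameSuffix = filenameSuffix;
        this.numShards = numShards;
        this.windowedWrites = windowedWrites;
    }

    String getFilenameSuffix() { return filenameSuffix; }
    int getNumShards() { return numShards; }
    boolean getWindowedWrites() { return windowedWrites; }

    // Mirrors the write() factory: seed the builder with defaults.
    static WriteConfig create() {
        return new Builder()
            .setFilenameSuffix("")
            .setNumShards(0)            // 0 means runner-chosen sharding
            .setWindowedWrites(false)
            .build();
    }

    Builder toBuilder() {
        return new Builder()
            .setFilenameSuffix(filenameSuffix)
            .setNumShards(numShards)
            .setWindowedWrites(windowedWrites);
    }

    // Fluent "wither": returns a modified copy, never mutates this.
    WriteConfig withNumShards(int n) {
        return toBuilder().setNumShards(n).build();
    }

    static final class Builder {
        private String filenameSuffix;
        private int numShards;
        private boolean windowedWrites;
        Builder setFilenameSuffix(String s) { this.filenameSuffix = s; return this; }
        Builder setNumShards(int n) { this.numShards = n; return this; }
        Builder setWindowedWrites(boolean w) { this.windowedWrites = w; return this; }
        WriteConfig build() { return new WriteConfig(filenameSuffix, numShards, windowedWrites); }
    }
}
```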

[09/11] beam git commit: Adds AvroIO.readGenericRecords()

2017-05-01 Thread jkff
Adds AvroIO.readGenericRecords()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ff7a1d42
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ff7a1d42
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ff7a1d42

Branch: refs/heads/master
Commit: ff7a1d42f2902bebdf998d3f00b2b268ba150058
Parents: 1499d25
Author: Eugene Kirpichov 
Authored: Fri Apr 28 18:36:20 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../beam/runners/spark/io/AvroPipelineTest.java |  2 +-
 .../java/org/apache/beam/sdk/io/AvroIO.java | 31 
 .../java/org/apache/beam/sdk/io/AvroIOTest.java |  8 ++---
 .../apache/beam/sdk/io/AvroIOTransformTest.java |  8 ++---
 4 files changed, 19 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ff7a1d42/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
index e3a44d2..62db14f 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
@@ -74,7 +74,7 @@ public class AvroPipelineTest {
 
 Pipeline p = pipelineRule.createPipeline();
 PCollection input = p.apply(
-AvroIO.read().from(inputFile.getAbsolutePath()).withSchema(schema));
+AvroIO.readGenericRecords(schema).from(inputFile.getAbsolutePath()));
 
input.apply(AvroIO.Write.to(outputDir.getAbsolutePath()).withSchema(schema));
 p.run().waitUntilFinish();
 

http://git-wip-us.apache.org/repos/asf/beam/blob/ff7a1d42/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index abde9cb..ed172d1 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -133,6 +133,18 @@ public class AvroIO {
 return new Read<>();
   }
 
+  /** Reads Avro file(s) containing records of the specified schema. */
+  public static Read readGenericRecords(Schema schema) {
+return new Read<>(null, null, GenericRecord.class, schema);
+  }
+
+  /**
+   * Like {@link #readGenericRecords(Schema)} but the schema is specified as a JSON-encoded string.
+   */
+  public static Read readGenericRecords(String schema) {
+return readGenericRecords(new Schema.Parser().parse(schema));
+  }
+
   /** Implementation of {@link #read}. */
   public static class Read extends PTransform> {
 /** The filepattern to read from. */
@@ -178,25 +190,6 @@ public class AvroIO {
   return new Read<>(name, filepattern, type, ReflectData.get().getSchema(type));
 }
 
-/**
- * Returns a new {@link PTransform} that's like this one but
- * that reads Avro file(s) containing records of the specified schema.
- */
-public Read withSchema(Schema schema) {
-  return new Read<>(name, filepattern, GenericRecord.class, schema);
-}
-
-/**
- * Returns a new {@link PTransform} that's like this one but
- * that reads Avro file(s) containing records of the specified schema
- * in a JSON-encoded string form.
- *
- * Does not modify this object.
- */
-public Read withSchema(String schema) {
-  return withSchema((new Schema.Parser()).parse(schema));
-}
-
 @Override
 public PCollection expand(PBegin input) {
   if (filepattern == null) {

http://git-wip-us.apache.org/repos/asf/beam/blob/ff7a1d42/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
index 6d842b3..2144b0d 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
@@ -282,10 +282,6 @@ public class AvroIOTest {
 p.run();
   }
 
-  private TimestampedValue newValue(GenericClass element, 
Duration duration) {
-return TimestampedValue.of(element, new Instant(0).plus(duration));
-  }
-
   private static class WindowedFilenamePolicy extends FilenamePolicy {
 String outputFilePrefix;
 
@@ -550,8 +546,8 @@ public class AvroIOTest {
   public void testPrimitiveReadDisplayData() {
 D

[GitHub] beam pull request #2778: [BEAM-1402] AvroIO should comply with PTransform st...

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2778




[03/11] beam git commit: Converts AvroIO.Read to AutoValue

2017-05-01 Thread jkff
Converts AvroIO.Read to AutoValue


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/439f2ca0
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/439f2ca0
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/439f2ca0

Branch: refs/heads/master
Commit: 439f2ca03c0d8994e5736b9493f61d9cb4267cb2
Parents: ff7a1d4
Author: Eugene Kirpichov 
Authored: Fri Apr 28 18:37:49 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIO.java | 67 +---
 1 file changed, 29 insertions(+), 38 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/439f2ca0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index ed172d1..2f1d917 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -19,6 +19,7 @@ package org.apache.beam.sdk.io;
 
 import static com.google.common.base.Preconditions.checkArgument;
 
+import com.google.auto.value.AutoValue;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Maps;
 import com.google.common.io.BaseEncoding;
@@ -130,12 +131,15 @@ public class AvroIO {
* The schema must be specified using one of the {@code withSchema} 
functions.
*/
   public static  Read read() {
-return new Read<>();
+return new AutoValue_AvroIO_Read.Builder().build();
   }
 
   /** Reads Avro file(s) containing records of the specified schema. */
   public static Read readGenericRecords(Schema schema) {
-return new Read<>(null, null, GenericRecord.class, schema);
+return new AutoValue_AvroIO_Read.Builder()
+.setRecordClass(GenericRecord.class)
+.setSchema(schema)
+.build();
   }
 
   /**
@@ -146,26 +150,21 @@ public class AvroIO {
   }
 
   /** Implementation of {@link #read}. */
-  public static class Read extends PTransform> {
-/** The filepattern to read from. */
-@Nullable
-final String filepattern;
-/** The class type of the records. */
-@Nullable
-final Class type;
-/** The schema of the input file. */
-@Nullable
-final Schema schema;
-
-Read() {
-  this(null, null, null, null);
-}
+  @AutoValue
+  public abstract static class Read extends PTransform> {
+@Nullable abstract String getFilepattern();
+@Nullable abstract Class getRecordClass();
+@Nullable abstract Schema getSchema();
+
+abstract Builder toBuilder();
 
-Read(String name, String filepattern, Class type, Schema schema) {
-  super(name);
-  this.filepattern = filepattern;
-  this.type = type;
-  this.schema = schema;
+@AutoValue.Builder
+abstract static class Builder {
+  abstract Builder setFilepattern(String filepattern);
+  abstract Builder setRecordClass(Class recordClass);
+  abstract Builder setSchema(Schema schema);
+
+  abstract Read build();
 }
 
 /**
@@ -178,7 +177,7 @@ public class AvroIO {
  * Filesystem glob patterns ("*", "?", "[..]") are supported.
  */
 public Read from(String filepattern) {
-  return new Read<>(name, filepattern, type, schema);
+  return toBuilder().setFilepattern(filepattern).build();
 }
 
 /**
@@ -187,26 +186,26 @@ public class AvroIO {
  * specified Avro-generated class.
  */
 public Read withSchema(Class type) {
-  return new Read<>(name, filepattern, type, 
ReflectData.get().getSchema(type));
+  return 
toBuilder().setRecordClass(type).setSchema(ReflectData.get().getSchema(type)).build();
 }
 
 @Override
 public PCollection expand(PBegin input) {
-  if (filepattern == null) {
+  if (getFilepattern() == null) {
 throw new IllegalStateException(
 "need to set the filepattern of an AvroIO.Read transform");
   }
-  if (schema == null) {
+  if (getSchema() == null) {
 throw new IllegalStateException("need to set the schema of an 
AvroIO.Read transform");
   }
 
   @SuppressWarnings("unchecked")
   Bounded read =
-  type == GenericRecord.class
+  getRecordClass() == GenericRecord.class
   ? (Bounded) org.apache.beam.sdk.io.Read.from(
-  AvroSource.from(filepattern).withSchema(schema))
+  AvroSource.from(getFilepattern()).withSchema(getSchema()))
   : org.apache.beam.sdk.io.Read.from(
-  AvroSource.from(filepattern).withSchema(type));
+  
AvroSource.from(getFilepattern()).withSchema(getRecordClass()));
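Note how expand() keeps the checks the old constructors implied: because from() and withSchema() may be called in any order, a missing filepattern or schema is only detected when the pipeline is built. A self-contained sketch of that deferred-validation style (`ReadSketch` is illustrative, not Beam's class):

```java
final class ReadSketch {
    private final String filepattern;
    private final String schema;

    ReadSketch(String filepattern, String schema) {
        this.filepattern = filepattern;
        this.schema = schema;
    }

    // Each fluent setter returns a modified copy; no validation yet.
    ReadSketch from(String filepattern) { return new ReadSketch(filepattern, schema); }
    ReadSketch withSchema(String schema) { return new ReadSketch(filepattern, schema); }

    // Stand-in for expand(PBegin): configuration errors surface only when
    // the pipeline is constructed, not when the setters run.
    String expand() {
        if (filepattern == null) {
            throw new IllegalStateException(
                "need to set the filepattern of an AvroIO.Read transform");
        }
        if (schema == null) {
            throw new IllegalStateException(
                "need to set the schema of an AvroIO.Read transform");
        }
        return "read " + filepattern + " as " + schema;
    }
}
```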
 

[01/11] beam git commit: Scattered minor improvements per review comments

2017-05-01 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 6d443bc39 -> 034565c68


Scattered minor improvements per review comments


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/caf2faeb
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/caf2faeb
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/caf2faeb

Branch: refs/heads/master
Commit: caf2faeb5e0b173f4e40f4af70c14d1d5d4244e4
Parents: 27d7462
Author: Eugene Kirpichov 
Authored: Mon May 1 17:00:46 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIO.java | 81 +---
 .../java/org/apache/beam/sdk/io/AvroSink.java   | 14 +---
 .../java/org/apache/beam/sdk/io/AvroSource.java |  4 +-
 .../beam/sdk/testing/SourceTestUtils.java   |  5 +-
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 44 ++-
 5 files changed, 69 insertions(+), 79 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/caf2faeb/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 6b66a98..755cdb9 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -33,6 +33,7 @@ import org.apache.avro.reflect.ReflectData;
 import org.apache.beam.sdk.coders.AvroCoder;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.VoidCoder;
+import org.apache.beam.sdk.io.FileBasedSink.FilenamePolicy;
 import org.apache.beam.sdk.io.Read.Bounded;
 import org.apache.beam.sdk.runners.PipelineRunner;
 import org.apache.beam.sdk.transforms.PTransform;
@@ -45,11 +46,9 @@ import org.apache.beam.sdk.values.PDone;
 /**
  * {@link PTransform}s for reading and writing Avro files.
  *
- * To read a {@link PCollection} from one or more Avro files, use
- * {@code AvroIO.read()}, specifying {@link AvroIO.Read#from} to specify
- * the path of the file(s) to read from (e.g., a local filename or
- * filename pattern if running locally, or a Google Cloud Storage
- * filename or filename pattern of the form {@code 
"gs:///"}).
+ * To read a {@link PCollection} from one or more Avro files, use {@code 
AvroIO.read()},
+ * specifying {@link AvroIO.Read#from} to specify the filename or filepattern 
to read from.
+ * See {@link FileSystems} for information on supported file systems and 
filepatterns.
  *
  * To read specific records, such as Avro-generated classes, use {@link 
#read(Class)}.
  * To read {@link GenericRecord GenericRecords}, use {@link 
#readGenericRecords(Schema)} which takes
@@ -72,13 +71,12 @@ import org.apache.beam.sdk.values.PDone;
  *.from("gs://my_bucket/path/to/records-*.avro"));
  * } 
  *
- * To write a {@link PCollection} to one or more Avro files, use
- * {@link AvroIO.Write}, specifying {@code AvroIO.write().to(String)} to 
specify
- * the path of the file to write to (e.g., a local filename or sharded
- * filename pattern if running locally, or a Google Cloud Storage
- * filename or sharded filename pattern of the form
- * {@code "gs:///"}). {@code 
AvroIO.write().to(FileBasedSink.FilenamePolicy)}
- * can also be used to specify a custom file naming policy.
+ * To write a {@link PCollection} to one or more Avro files, use {@link 
AvroIO.Write}, specifying
+ * {@code AvroIO.write().to(String)} to specify the filename or sharded 
filepattern to write to.
+ * See {@link FileSystems} for information on supported file systems and 
{@link ShardNameTemplate}
+ * for information on naming of output files. You can also use {@code 
AvroIO.write()} with
+ * {@link Write#to(FileBasedSink.FilenamePolicy)} to
+ * specify a custom file naming policy.
  *
  * By default, all input is put into the global window before writing. If 
per-window writes are
  * desired - for example, when using a streaming runner -
@@ -140,7 +138,8 @@ public class AvroIO {
   }
 
   /**
-   * Like {@link #readGenericRecords(Schema)} but the schema is specified as a JSON-encoded string.
+   * Reads Avro file(s) containing records of the specified schema. The schema is specified as a
+   * JSON-encoded string.
*/
   public static Read readGenericRecords(String schema) {
 return readGenericRecords(new Schema.Parser().parse(schema));
@@ -165,6 +164,13 @@ public class AvroIO {
 .build();
   }
 
+  /**
+   * Writes Avro records of the specified schema. The schema is specified as a JSON-encoded string.
+   */
+  public static Write writeGenericRecords(String schema) {
+return writeGenericRecords(new Schema.Parser().parse(schema));
+  }
+
   private stat

[07/11] beam git commit: Removes AvroIO.Write.Bound

2017-05-01 Thread jkff
Removes AvroIO.Write.Bound


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d1dfd4e2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d1dfd4e2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d1dfd4e2

Branch: refs/heads/master
Commit: d1dfd4e2a8b82451f28f1f0e6f261eae0d51bb5b
Parents: 439f2ca
Author: Eugene Kirpichov 
Authored: Fri Apr 28 18:59:03 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../beam/runners/spark/io/AvroPipelineTest.java |   2 +-
 .../java/org/apache/beam/sdk/io/AvroIO.java | 910 ---
 .../java/org/apache/beam/sdk/io/AvroIOTest.java |  37 +-
 .../apache/beam/sdk/io/AvroIOTransformTest.java |  12 +-
 4 files changed, 385 insertions(+), 576 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d1dfd4e2/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
index 62db14f..c58d81e 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
@@ -75,7 +75,7 @@ public class AvroPipelineTest {
 Pipeline p = pipelineRule.createPipeline();
 PCollection input = p.apply(
 AvroIO.readGenericRecords(schema).from(inputFile.getAbsolutePath()));
-
input.apply(AvroIO.Write.to(outputDir.getAbsolutePath()).withSchema(schema));
+
input.apply(AvroIO.write().to(outputDir.getAbsolutePath()).withSchema(schema));
 p.run().waitUntilFinish();
 
 List records = readGenericFile();

http://git-wip-us.apache.org/repos/asf/beam/blob/d1dfd4e2/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 2f1d917..4bde6ec 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -70,24 +70,24 @@ import org.apache.beam.sdk.values.PDone;
  * // A Read from a GCS file (runs locally and using remote execution):
  * Schema schema = new Schema.Parser().parse(new File("schema.avsc"));
  * PCollection records =
- * p.apply(AvroIO.Read
+ * p.apply(AvroIO.read()
  *.from("gs://my_bucket/path/to/records-*.avro")
  *.withSchema(schema));
  * } 
  *
  * To write a {@link PCollection} to one or more Avro files, use
- * {@link AvroIO.Write}, specifying {@link AvroIO.Write#to(String)} to specify
+ * {@link AvroIO.Write}, specifying {@code AvroIO.write().to(String)} to 
specify
  * the path of the file to write to (e.g., a local filename or sharded
  * filename pattern if running locally, or a Google Cloud Storage
  * filename or sharded filename pattern of the form
- * {@code "gs:///"}). {@link 
AvroIO.Write#to(FileBasedSink.FilenamePolicy)}
+ * {@code "gs:///"}). {@code 
AvroIO.write().to(FileBasedSink.FilenamePolicy)}
  * can also be used to specify a custom file naming policy.
  *
  * By default, all input is put into the global window before writing. If 
per-window writes are
  * desired - for example, when using a streaming runner -
- * {@link AvroIO.Write.Bound#withWindowedWrites()} will cause windowing and 
triggering to be
+ * {@link AvroIO.Write#withWindowedWrites()} will cause windowing and 
triggering to be
  * preserved. When producing windowed writes, the number of output shards must 
be set explicitly
- * using {@link AvroIO.Write.Bound#withNumShards(int)}; some runners may set 
this for you to a
+ * using {@link AvroIO.Write#withNumShards(int)}; some runners may set this 
for you to a
  * runner-chosen value, so you may need not set it yourself. A
  * {@link FileBasedSink.FilenamePolicy} must be set, and unique windows and 
triggers must produce
  * unique filenames.
@@ -103,13 +103,13 @@ import org.apache.beam.sdk.values.PDone;
  *  {@code
  * // A simple Write to a local file (only runs locally):
  * PCollection records = ...;
- * records.apply(AvroIO.Write.to("/path/to/file.avro")
+ * records.apply(AvroIO.write().to("/path/to/file.avro")
  *   .withSchema(AvroAutoGenClass.class));
  *
  * // A Write to a sharded GCS file (runs locally and using remote execution):
  * Schema schema = new Schema.Parser().parse(new File("schema.avsc"));
  * PCollection records = ...;
- * records.apply("WriteToAvro", AvroIO.Write
+ * records.apply("Writ

[11/11] beam git commit: This closes #2778

2017-05-01 Thread jkff
This closes #2778


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/034565c6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/034565c6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/034565c6

Branch: refs/heads/master
Commit: 034565c6811833fa1143362fb44f94672cef1e30
Parents: 6d443bc caf2fae
Author: Eugene Kirpichov 
Authored: Mon May 1 18:43:45 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:45 2017 -0700

--
 .../beam/runners/spark/io/AvroPipelineTest.java |4 +-
 .../java/org/apache/beam/sdk/io/AvroIO.java | 1193 +-
 .../java/org/apache/beam/sdk/io/AvroSink.java   |  142 +++
 .../java/org/apache/beam/sdk/io/AvroSource.java |4 +-
 .../java/org/apache/beam/sdk/io/TextIO.java |4 +-
 .../beam/sdk/testing/SourceTestUtils.java   |5 +-
 .../java/org/apache/beam/sdk/io/AvroIOTest.java |  108 +-
 .../apache/beam/sdk/io/AvroIOTransformTest.java |   30 +-
 8 files changed, 521 insertions(+), 969 deletions(-)
--




[10/11] beam git commit: Moves AvroIO.Read.withSchema into read()

2017-05-01 Thread jkff
Moves AvroIO.Read.withSchema into read()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/abb4916c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/abb4916c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/abb4916c

Branch: refs/heads/master
Commit: abb4916ce2fa8d4a5caf783b66cc5541053ea83c
Parents: d1dfd4e
Author: Eugene Kirpichov 
Authored: Fri Apr 28 19:03:25 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIO.java | 35 
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 24 ++
 .../apache/beam/sdk/io/AvroIOTransformTest.java |  4 +--
 3 files changed, 25 insertions(+), 38 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/abb4916c/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 4bde6ec..08fc8a9 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -46,16 +46,15 @@ import org.apache.beam.sdk.values.PDone;
  * {@link PTransform}s for reading and writing Avro files.
  *
  * To read a {@link PCollection} from one or more Avro files, use
- * {@link AvroIO.Read}, specifying {@link AvroIO.Read#from} to specify
+ * {@code AvroIO.read()}, specifying {@link AvroIO.Read#from} to specify
  * the path of the file(s) to read from (e.g., a local filename or
  * filename pattern if running locally, or a Google Cloud Storage
  * filename or filename pattern of the form {@code 
"gs:///"}).
  *
- * It is required to specify {@link AvroIO.Read#withSchema}. To
- * read specific records, such as Avro-generated classes, provide an
- * Avro-generated class type. To read {@link GenericRecord GenericRecords}, 
provide either
- * a {@link Schema} object or an Avro schema in a JSON-encoded string form.
- * An exception will be thrown if a record doesn't match the specified
+ * To read specific records, such as Avro-generated classes, use {@link 
#read(Class)}.
+ * To read {@link GenericRecord GenericRecords}, use {@link 
#readGenericRecords(Schema)} which takes
+ * a {@link Schema} object, or {@link #readGenericRecords(String)} which takes 
an Avro schema in a
+ * JSON-encoded string form. An exception will be thrown if a record doesn't 
match the specified
  * schema.
  *
  * For example:
@@ -64,15 +63,13 @@ import org.apache.beam.sdk.values.PDone;
  *
  * // A simple Read of a local file (only runs locally):
  * PCollection records =
- * p.apply(AvroIO.read().from("/path/to/file.avro")
- * .withSchema(AvroAutoGenClass.class));
+ * p.apply(AvroIO.read(AvroAutoGenClass.class).from("/path/to/file.avro"));
  *
  * // A Read from a GCS file (runs locally and using remote execution):
  * Schema schema = new Schema.Parser().parse(new File("schema.avsc"));
  * PCollection records =
- * p.apply(AvroIO.read()
- *.from("gs://my_bucket/path/to/records-*.avro")
- *.withSchema(schema));
+ * p.apply(AvroIO.readGenericRecords(schema)
+ *.from("gs://my_bucket/path/to/records-*.avro"));
  * } 
  *
  * To write a {@link PCollection} to one or more Avro files, use
@@ -130,8 +127,11 @@ public class AvroIO {
*
* The schema must be specified using one of the {@code withSchema} 
functions.
*/
-  public static  Read read() {
-return new AutoValue_AvroIO_Read.Builder().build();
+  public static  Read read(Class recordClass) {
+return new AutoValue_AvroIO_Read.Builder()
+.setRecordClass(recordClass)
+.setSchema(ReflectData.get().getSchema(recordClass))
+.build();
   }
 
   /** Reads Avro file(s) containing records of the specified schema. */
@@ -188,15 +188,6 @@ public class AvroIO {
   return toBuilder().setFilepattern(filepattern).build();
 }
 
-/**
- * Returns a new {@link PTransform} that's like this one but
- * that reads Avro file(s) containing records whose type is the
- * specified Avro-generated class.
- */
-public Read withSchema(Class type) {
-  return 
toBuilder().setRecordClass(type).setSchema(ReflectData.get().getSchema(type)).build();
-}
-
 @Override
 public PCollection expand(PBegin input) {
   if (getFilepattern() == null) {

http://git-wip-us.apache.org/repos/asf/beam/blob/abb4916c/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java 
b/s
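The commit above replaces the post-construction `withSchema(Class)` setter with a typed static factory, `read(Class)`, so the record type and its schema are fixed at creation time. Stripped of Beam types, the shape of that refactoring looks like the following sketch — `Read`, `from`, and `schemaFor` here are illustrative stand-ins, not the real Beam API:

```java
// Illustrative sketch of the refactoring in this thread: the record class is
// captured by a typed static factory and the schema is derived eagerly,
// instead of a later withSchema(Class) call on an untyped builder.
// All names are stand-ins, not Beam API.
public final class FactorySketch {
  static final class Read<T> {
    final Class<T> recordClass;   // known from the factory call
    final String schema;          // derived once, at construction
    final String filepattern;     // still configurable afterwards

    private Read(Class<T> recordClass, String schema, String filepattern) {
      this.recordClass = recordClass;
      this.schema = schema;
      this.filepattern = filepattern;
    }

    // Typed factory: callers can no longer forget the schema step.
    static <T> Read<T> read(Class<T> recordClass) {
      return new Read<>(recordClass, schemaFor(recordClass), null);
    }

    Read<T> from(String filepattern) {
      return new Read<>(recordClass, schema, filepattern);
    }
  }

  // Stand-in for ReflectData.get().getSchema(recordClass).
  static String schemaFor(Class<?> c) {
    return "{\"type\":\"record\",\"name\":\"" + c.getSimpleName() + "\"}";
  }

  public static void main(String[] args) {
    Read<String> r = Read.read(String.class).from("/path/to/file.avro");
    System.out.println(r.recordClass.getSimpleName() + " " + r.filepattern);
  }
}
```

The design win is the one the commit message implies: with `read()` alone the transform was an unparameterized `Read<GenericRecord>` until `withSchema` ran, whereas the typed factory makes an unconfigured schema unrepresentable.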

[05/11] beam git commit: Moves AvroSink to upper level

2017-05-01 Thread jkff
Moves AvroSink to upper level


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0166e199
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0166e199
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0166e199

Branch: refs/heads/master
Commit: 0166e19991af956a48ef99310f5f1916225255aa
Parents: 2fa3c34
Author: Eugene Kirpichov 
Authored: Fri Apr 28 18:05:00 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIO.java | 131 
 .../java/org/apache/beam/sdk/io/AvroSink.java   | 150 +++
 2 files changed, 150 insertions(+), 131 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0166e199/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 2031569..75e14d5 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -19,33 +19,24 @@ package org.apache.beam.sdk.io;
 
 import static com.google.common.base.Preconditions.checkArgument;
 
-import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Maps;
 import com.google.common.io.BaseEncoding;
-import java.nio.channels.Channels;
-import java.nio.channels.WritableByteChannel;
 import java.util.Map;
 import java.util.regex.Pattern;
 import javax.annotation.Nullable;
 import org.apache.avro.Schema;
 import org.apache.avro.file.CodecFactory;
-import org.apache.avro.file.DataFileWriter;
-import org.apache.avro.generic.GenericDatumWriter;
 import org.apache.avro.generic.GenericRecord;
-import org.apache.avro.io.DatumWriter;
 import org.apache.avro.reflect.ReflectData;
-import org.apache.avro.reflect.ReflectDatumWriter;
 import org.apache.beam.sdk.coders.AvroCoder;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.VoidCoder;
 import org.apache.beam.sdk.io.Read.Bounded;
-import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.runners.PipelineRunner;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.display.DisplayData;
 import org.apache.beam.sdk.transforms.display.HasDisplayData;
-import org.apache.beam.sdk.util.MimeTypes;
 import org.apache.beam.sdk.values.PBegin;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.sdk.values.PDone;
@@ -952,126 +943,4 @@ public class AvroIO {
 
   /** Disallow construction of utility class. */
   private AvroIO() {}
-
-  /**
-   * A {@link FileBasedSink} for Avro files.
-   */
-  @VisibleForTesting
-  static class AvroSink extends FileBasedSink {
-private final AvroCoder coder;
-private final SerializableAvroCodecFactory codec;
-private final ImmutableMap metadata;
-
-@VisibleForTesting
-AvroSink(
-FilenamePolicy filenamePolicy,
-AvroCoder coder,
-SerializableAvroCodecFactory codec,
-ImmutableMap metadata) {
-  super(filenamePolicy);
-  this.coder = coder;
-  this.codec = codec;
-  this.metadata = metadata;
-}
-
-@VisibleForTesting
-AvroSink(
-String baseOutputFilename,
-String extension,
-String fileNameTemplate,
-AvroCoder coder,
-SerializableAvroCodecFactory codec,
-ImmutableMap metadata) {
-  super(baseOutputFilename, extension, fileNameTemplate);
-  this.coder = coder;
-  this.codec = codec;
-  this.metadata = metadata;
-}
-
-@Override
-public FileBasedSink.FileBasedWriteOperation createWriteOperation() {
-  return new AvroWriteOperation<>(this, coder, codec, metadata);
-}
-
-/**
- * A {@link org.apache.beam.sdk.io.FileBasedSink.FileBasedWriteOperation
- * FileBasedWriteOperation} for Avro files.
- */
-private static class AvroWriteOperation extends 
FileBasedWriteOperation {
-  private final AvroCoder coder;
-  private final SerializableAvroCodecFactory codec;
-  private final ImmutableMap metadata;
-
-  private AvroWriteOperation(AvroSink sink,
- AvroCoder coder,
- SerializableAvroCodecFactory codec,
- ImmutableMap metadata) {
-super(sink);
-this.coder = coder;
-this.codec = codec;
-this.metadata = metadata;
-  }
-
-  @Override
-  public FileBasedWriter createWriter(PipelineOptions options) throws 
Exception {
-return new AvroWriter<>(this, coder, codec, metadata)

[06/11] beam git commit: Removes AvroIO.Read.Bound

2017-05-01 Thread jkff
Removes AvroIO.Read.Bound


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1499d256
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1499d256
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1499d256

Branch: refs/heads/master
Commit: 1499d256c616e34b4416fa202a45aa256ac88d20
Parents: 0166e19
Author: Eugene Kirpichov 
Authored: Fri Apr 28 18:19:21 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../beam/runners/spark/io/AvroPipelineTest.java |   2 +-
 .../java/org/apache/beam/sdk/io/AvroIO.java | 222 +++
 .../java/org/apache/beam/sdk/io/AvroIOTest.java |  24 +-
 .../apache/beam/sdk/io/AvroIOTransformTest.java |  18 +-
 4 files changed, 108 insertions(+), 158 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1499d256/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
index 2a73c28..e3a44d2 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
@@ -74,7 +74,7 @@ public class AvroPipelineTest {
 
 Pipeline p = pipelineRule.createPipeline();
 PCollection input = p.apply(
-AvroIO.Read.from(inputFile.getAbsolutePath()).withSchema(schema));
+AvroIO.read().from(inputFile.getAbsolutePath()).withSchema(schema));
 
input.apply(AvroIO.Write.to(outputDir.getAbsolutePath()).withSchema(schema));
 p.run().waitUntilFinish();
 

http://git-wip-us.apache.org/repos/asf/beam/blob/1499d256/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 75e14d5..abde9cb 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -63,7 +63,7 @@ import org.apache.beam.sdk.values.PDone;
  *
  * // A simple Read of a local file (only runs locally):
  * PCollection records =
- * p.apply(AvroIO.Read.from("/path/to/file.avro")
+ * p.apply(AvroIO.read().from("/path/to/file.avro")
  * .withSchema(AvroAutoGenClass.class));
  *
  * // A Read from a GCS file (runs locally and using remote execution):
@@ -125,15 +125,39 @@ import org.apache.beam.sdk.values.PDone;
  */
 public class AvroIO {
   /**
-   * A root {@link PTransform} that reads from an Avro file (or multiple Avro
-   * files matching a pattern) and returns a {@link PCollection} containing
-   * the decoding of each record.
+   * Reads records of the given type from an Avro file (or multiple Avro files 
matching a pattern).
+   *
+   * The schema must be specified using one of the {@code withSchema} 
functions.
*/
-  public static class Read {
+  public static  Read read() {
+return new Read<>();
+  }
+
+  /** Implementation of {@link #read}. */
+  public static class Read extends PTransform> {
+/** The filepattern to read from. */
+@Nullable
+final String filepattern;
+/** The class type of the records. */
+@Nullable
+final Class type;
+/** The schema of the input file. */
+@Nullable
+final Schema schema;
+
+Read() {
+  this(null, null, null, null);
+}
+
+Read(String name, String filepattern, Class type, Schema schema) {
+  super(name);
+  this.filepattern = filepattern;
+  this.type = type;
+  this.schema = schema;
+}
 
 /**
- * Returns a {@link PTransform} that reads from the file(s)
- * with the given name or pattern. This can be a local filename
+ * Reads from the file(s) with the given name or pattern. This can be a 
local filename
  * or filename pattern (if running locally), or a Google Cloud
  * Storage filename or filename pattern of the form
  * {@code "gs:///"} (if running locally or
@@ -141,162 +165,82 @@ public class AvroIO {
  * http://docs.oracle.com/javase/tutorial/essential/io/find.html";>Java
  * Filesystem glob patterns ("*", "?", "[..]") are supported.
  */
-public static Bound from(String filepattern) {
-  return new Bound<>(GenericRecord.class).from(filepattern);
+public Read from(String filepattern) {
+  return new Read<>(name, filepattern, type, schema);
 }
 
 /**
- * Returns a {@link PTransform} that reads Avro file(s)
- * containing records whose t

[02/11] beam git commit: Moves AvroIO.write().withSchema into write()

2017-05-01 Thread jkff
Moves AvroIO.write().withSchema into write()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/27d74622
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/27d74622
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/27d74622

Branch: refs/heads/master
Commit: 27d74622e877d017aa70feef0ee4cd26a4bece7a
Parents: e0d7475
Author: Eugene Kirpichov 
Authored: Fri Apr 28 19:25:45 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIO.java | 49 +---
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 42 +++--
 .../apache/beam/sdk/io/AvroIOTransformTest.java |  2 +-
 3 files changed, 40 insertions(+), 53 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/27d74622/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 8cdd4e7..6b66a98 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -89,26 +89,23 @@ import org.apache.beam.sdk.values.PDone;
  * {@link FileBasedSink.FilenamePolicy} must be set, and unique windows and 
triggers must produce
  * unique filenames.
  *
- * It is required to specify {@link AvroIO.Write#withSchema}. To
- * write specific records, such as Avro-generated classes, provide an
- * Avro-generated class type. To write {@link GenericRecord GenericRecords}, 
provide either
- * a {@link Schema} object or a schema in a JSON-encoded string form.
- * An exception will be thrown if a record doesn't match the specified
- * schema.
+ * To write specific records, such as Avro-generated classes, use {@link 
#write(Class)}.
+ * To write {@link GenericRecord GenericRecords}, use either {@link 
#writeGenericRecords(Schema)}
+ * which takes a {@link Schema} object, or {@link 
#writeGenericRecords(String)} which takes a schema
+ * in a JSON-encoded string form. An exception will be thrown if a record 
doesn't match the
+ * specified schema.
  *
  * For example:
  *  {@code
  * // A simple Write to a local file (only runs locally):
  * PCollection records = ...;
- * records.apply(AvroIO.write().to("/path/to/file.avro")
- *   .withSchema(AvroAutoGenClass.class));
+ * 
records.apply(AvroIO.write(AvroAutoGenClass.class).to("/path/to/file.avro"));
  *
  * // A Write to a sharded GCS file (runs locally and using remote execution):
  * Schema schema = new Schema.Parser().parse(new File("schema.avsc"));
  * PCollection records = ...;
- * records.apply("WriteToAvro", AvroIO.write()
+ * records.apply("WriteToAvro", AvroIO.writeGenericRecords(schema)
  * .to("gs://my_bucket/path/to/numbers")
- * .withSchema(schema)
  * .withSuffix(".avro"));
  * } 
  *
@@ -153,26 +150,31 @@ public class AvroIO {
* Writes a {@link PCollection} to an Avro file (or multiple Avro files 
matching a sharding
* pattern).
*/
-  public static  Write write() {
-return new AutoValue_AvroIO_Write.Builder()
-.setFilenameSuffix("")
-.setNumShards(0)
-.setShardTemplate(Write.DEFAULT_SHARD_TEMPLATE)
-.setCodec(Write.DEFAULT_CODEC)
-.setMetadata(ImmutableMap.of())
-.setWindowedWrites(false)
+  public static  Write write(Class recordClass) {
+return AvroIO.defaultWriteBuilder()
+.setRecordClass(recordClass)
+.setSchema(ReflectData.get().getSchema(recordClass))
 .build();
   }
 
   /** Writes Avro records of the specified schema. */
   public static Write writeGenericRecords(Schema schema) {
-return AvroIO.write()
-.toBuilder()
+return AvroIO.defaultWriteBuilder()
 .setRecordClass(GenericRecord.class)
 .setSchema(schema)
 .build();
   }
 
+  private static  Write.Builder defaultWriteBuilder() {
+return new AutoValue_AvroIO_Write.Builder()
+.setFilenameSuffix("")
+.setNumShards(0)
+.setShardTemplate(Write.DEFAULT_SHARD_TEMPLATE)
+.setCodec(Write.DEFAULT_CODEC)
+.setMetadata(ImmutableMap.of())
+.setWindowedWrites(false);
+  }
+
   /**
* Like {@link #writeGenericRecords(Schema)} but the schema is specified as 
a JSON-encoded string.
*/
@@ -369,13 +371,6 @@ public class AvroIO {
   return toBuilder().setWindowedWrites(true).build();
 }
 
-/**
- * Writes to Avro file(s) containing records whose type is the specified 
Avro-generated class.
- */
-public Write withSchema(Class type) {
-  return 
toBuilder().setRecordClass(type).setSchema(ReflectData.get().getSchema(type)

[04/11] beam git commit: Fixes javadoc of TextIO to not point to AvroIO

2017-05-01 Thread jkff
Fixes javadoc of TextIO to not point to AvroIO


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2fa3c348
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2fa3c348
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2fa3c348

Branch: refs/heads/master
Commit: 2fa3c348d5a6aab6a7da7c6c62f6b9254feb13af
Parents: 6d443bc
Author: Eugene Kirpichov 
Authored: Sat Apr 29 16:15:24 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon May 1 18:43:38 2017 -0700

--
 sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2fa3c348/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java
index 6b58391..0947702 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java
@@ -85,9 +85,9 @@ import org.apache.beam.sdk.values.PDone;
  *
  * By default, all input is put into the global window before writing. If 
per-window writes are
  * desired - for example, when using a streaming runner -
- * {@link AvroIO.Write.Bound#withWindowedWrites()} will cause windowing and 
triggering to be
+ * {@link TextIO.Write.Bound#withWindowedWrites()} will cause windowing and 
triggering to be
  * preserved. When producing windowed writes, the number of output shards must 
be set explicitly
- * using {@link AvroIO.Write.Bound#withNumShards(int)}; some runners may set 
this for you to a
+ * using {@link TextIO.Write.Bound#withNumShards(int)}; some runners may set 
this for you to a
  * runner-chosen value, so you may need not set it yourself. A {@link 
FilenamePolicy} must be
  * set, and unique windows and triggers must produce unique filenames.
  *



[jira] [Created] (BEAM-2136) AvroCoderTest.testTwoClassLoaders fails on beam_PostCommit_Java_ValidatesRunner_Dataflow

2017-05-01 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2136:
---

 Summary: AvroCoderTest.testTwoClassLoaders fails on 
beam_PostCommit_Java_ValidatesRunner_Dataflow
 Key: BEAM-2136
 URL: https://issues.apache.org/jira/browse/BEAM-2136
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Kenneth Knowles
Priority: Minor


Example failure:
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/3003/org.apache.beam$beam-sdks-java-core/testReport/org.apache.beam.sdk.coders/AvroCoderTest/testTwoClassLoaders/

java.lang.NullPointerException
at com.google.common.io.ByteStreams.toByteArray(ByteStreams.java:165)
at 
org.apache.beam.sdk.coders.AvroCoderTest$InterceptingUrlClassLoader.loadClass(AvroCoderTest.java:179)
at 
org.apache.beam.sdk.coders.AvroCoderTest.testTwoClassLoaders(AvroCoderTest.java:199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:307)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
at 
org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:202)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:158)
at 
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:81)
at 
org.apache.maven.plugin.surefire.InPluginVMSurefireStarter.runSuitesInProcess(InPluginVMSurefireStarter.java:84)
at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1060)
at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:907)
at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:785)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(Mojo
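The NPE at `ByteStreams.toByteArray` inside a custom class loader is consistent with `ClassLoader.getResourceAsStream` returning `null` for a missing class resource and that `null` being passed straight to the stream-reading helper. That is an assumption about the failure mode, not a confirmed diagnosis of this Jenkins run. A minimal stand-alone illustration of the hazard and the usual guard:

```java
// Illustrates one plausible cause of the NPE above: getResourceAsStream
// returns null when a resource is absent, and handing that null to a
// stream-reading utility surfaces as an opaque NullPointerException deep in
// the helper. Guarding at the lookup site fails fast with a useful message.
// This is a guess at the failure mode, not a confirmed diagnosis.
import java.io.InputStream;

public final class ResourceLoadSketch {
  static InputStream open(ClassLoader cl, Class<?> c) {
    String path = c.getName().replace('.', '/') + ".class";
    InputStream in = cl.getResourceAsStream(path);
    if (in == null) {
      // Descriptive failure instead of an NPE inside the byte-copying helper.
      throw new IllegalStateException("resource not found: " + path);
    }
    return in;
  }

  public static void main(String[] args) throws Exception {
    // .class files remain resolvable as resources even on modular JDKs.
    try (InputStream in = open(ClassLoader.getSystemClassLoader(), Object.class)) {
      System.out.println(in.available() >= 0);
    }
  }
}
```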

[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991948#comment-15991948
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2804


> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Before first stable release we need to thin out {{sdk-java-core}} module. 
> Some candidates for removal, but not a non-exhaustive list:
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsoNCoder



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2804: [BEAM-1871] Remove deprecated org.apache.beam.sdk.o...

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2804


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions after updating Dataflow worker

2017-05-01 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 3afd338fd -> 6d443bc39


[BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions after 
updating Dataflow worker

Worker has been migrated to 
org.apache.beam.sdk.extensions.gcp.options.GcsOptions


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a752cfdc
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a752cfdc
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a752cfdc

Branch: refs/heads/master
Commit: a752cfdc8dafa26b86cdf4343a638768e763c736
Parents: 3afd338
Author: Luke Cwik 
Authored: Mon May 1 13:07:11 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 18:31:08 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |   2 +-
 .../org/apache/beam/sdk/options/GcsOptions.java | 110 ---
 .../apache/beam/sdk/options/package-info.java   |  22 
 3 files changed, 1 insertion(+), 133 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a752cfdc/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index e75abbd..4b6ca98 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170430
+
beam-master-20170501
 
1
 
6
   

http://git-wip-us.apache.org/repos/asf/beam/blob/a752cfdc/sdks/java/extensions/gcp-core/src/main/java/org/apache/beam/sdk/options/GcsOptions.java
--
diff --git 
a/sdks/java/extensions/gcp-core/src/main/java/org/apache/beam/sdk/options/GcsOptions.java
 
b/sdks/java/extensions/gcp-core/src/main/java/org/apache/beam/sdk/options/GcsOptions.java
deleted file mode 100644
index 7cf695e..000
--- 
a/sdks/java/extensions/gcp-core/src/main/java/org/apache/beam/sdk/options/GcsOptions.java
+++ /dev/null
@@ -1,110 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.options;
-
-import com.fasterxml.jackson.annotation.JsonIgnore;
-import com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel;
-import java.util.concurrent.ExecutorService;
-import javax.annotation.Nullable;
-import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
-import org.apache.beam.sdk.util.GcsPathValidator;
-import org.apache.beam.sdk.util.GcsUtil;
-import org.apache.beam.sdk.util.PathValidator;
-
-/**
- * Options used to configure Google Cloud Storage.
- */
-@Deprecated
-public interface GcsOptions extends
-ApplicationNameOptions, GcpOptions, PipelineOptions {
-  /**
-   * The GcsUtil instance that should be used to communicate with Google Cloud 
Storage.
-   */
-  @JsonIgnore
-  @Description("The GcsUtil instance that should be used to communicate with 
Google Cloud Storage.")
-  @Default.InstanceFactory(GcsUtil.GcsUtilFactory.class)
-  @Hidden
-  GcsUtil getGcsUtil();
-  void setGcsUtil(GcsUtil value);
-
-  /**
-   * The ExecutorService instance to use to create threads, can be overridden 
to specify an
-   * ExecutorService that is compatible with the users environment. If unset, 
the
-   * default is to create an ExecutorService with an unbounded number of 
threads; this
-   * is compatible with Google AppEngine.
-   */
-  @JsonIgnore
-  @Description("The ExecutorService instance to use to create multiple 
threads. Can be overridden "
-  + "to specify an ExecutorService that is compatible with the users 
environment. If unset, "
-  + "the default is to create an ExecutorService with an unbounded number 
of threads; this "
-  + "is compatible with Google AppEngine.")
-  @Default.InstanceFactory(
-  
org.apache.beam.sdk.extensions.gcp.options.GcsOptions.ExecutorServiceFactory.class)
-  @Hidden
-  ExecutorService getExecutorService();
-  void setExecutorService(ExecutorService val
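The deletion above is the final step of a staged removal: the old `org.apache.beam.sdk.options.GcsOptions` was kept as a deprecated bridge until the Dataflow worker moved to `org.apache.beam.sdk.extensions.gcp.options.GcsOptions`, then dropped. One common way to stage such a removal — not necessarily the exact mechanism Beam used here — is to keep the old name as a deprecated subtype of its replacement, sketched below with purely illustrative names:

```java
// Hypothetical sketch of staged interface removal: the legacy name survives
// briefly as a deprecated alias of the relocated type, so existing callers
// still compile; once no downstream code references it, deletion is safe.
// Type and package names are illustrative, not the real Beam layout.
public final class DeprecatedAliasSketch {
  // New home of the options type.
  interface NewGcsOptions {
    default String bucket() { return "gs://example"; }
  }

  // Old name, kept only as a migration bridge.
  @Deprecated
  interface OldGcsOptions extends NewGcsOptions {}

  public static void main(String[] args) {
    OldGcsOptions legacy = new OldGcsOptions() {};
    NewGcsOptions migrated = legacy;  // legacy references satisfy the new type
    System.out.println(migrated.bucket());
  }
}
```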

[2/2] beam git commit: [BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions after updating Dataflow worker

2017-05-01 Thread lcwik
[BEAM-1871] Remove deprecated org.apache.beam.sdk.options.GcsOptions after 
updating Dataflow worker

This closes #2804


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6d443bc3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6d443bc3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6d443bc3

Branch: refs/heads/master
Commit: 6d443bc3965e42fc8e933a64f719bd64eac3e06a
Parents: 3afd338 a752cfd
Author: Luke Cwik 
Authored: Mon May 1 18:31:53 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 18:31:53 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |   2 +-
 .../org/apache/beam/sdk/options/GcsOptions.java | 110 ---
 .../apache/beam/sdk/options/package-info.java   |  22 
 3 files changed, 1 insertion(+), 133 deletions(-)
--




Jenkins build is unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3003

2017-05-01 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3573

2017-05-01 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Rename OutputValue to TaggedOutput.

--
[...truncated 2.12 MB...]
for result in results:
  File 
"
 line 105, in tee
yield beam.pvalue.OutputValue(tag, elem)
AttributeError: 'module' object has no attribute 'OutputValue' [while running 
'FlatMap(tee)/FlatMap(tee)']
 >> begin captured logging << 
root: INFO: Running [(u'Create/Read', 
WorkerRead(source=SourceBundle(weight=1.0, 
source=, 
start_position=None, stop_position=None))), (u'FlatMap(tee)/FlatMap(tee)', 
WorkerDoFn(output_tags=['out', 'out_x', 'out_y'], input=(0, 0))), 
(u'x/WindowInto(WindowIntoFn)', WorkerDoFn(output_tags=['out'], input=(1, 1))), 
(u'x/ToVoidKey', WorkerDoFn(output_tags=['out'], input=(2, 0))), 
(u'x/Group/pair_with_1', WorkerDoFn(output_tags=['out'], input=(3, 0))), 
(u'x/Group/Flatten/Write', WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(4, 0))), (u'y/WindowInto(WindowIntoFn)', 
WorkerDoFn(output_tags=['out'], input=(1, 2))), (u'y/ToVoidKey', 
WorkerDoFn(output_tags=['out'], input=(6, 0))), (u'y/Group/pair_with_1', 
WorkerDoFn(output_tags=['out'], input=(7, 0))), (u'y/Group/Flatten/Write', 
WorkerInMemoryWrite(output_buffer=GroupingOutput[0], 
write_windowed_values=True, input=(8, 0)))]
root: DEBUG: Starting op 9 
root: DEBUG: Starting op 8 
root: DEBUG: Starting op 7 
root: DEBUG: Starting op 6 
root: DEBUG: Starting op 5 
root: DEBUG: Starting op 4 
root: DEBUG: Starting op 3 
root: DEBUG: Starting op 2 
root: DEBUG: Starting op 1 
root: DEBUG: Starting op 0 , 
start_position=None, stop_position=None)>
- >> end captured logging << -

==
FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
--
Traceback (most recent call last):
  File 
"
 line 41, in test_using_slow_impl
import apache_beam.coders.stream
AssertionError: ImportError not raised

--
Ran 1198 tests in 132.858s

FAILED (failures=1, errors=2, skipped=15)
Test failed: 
error: Test failed: 
ERROR: InvocationError: 
'
 setup.py test'
___ summary 
  docs: commands succeeded
  lint: commands succeeded
ERROR:   py27: commands failed
ERROR:   py27cython: commands failed
ERROR:   py27gcp: commands failed
2017-05-02T01:28:29.954 [ERROR] Command execution failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit 
value: 1)
at 
org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at 
org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:764)
at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:711)
at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at 
org.jvnet.hudson.maven3.launcher.Maven33Launcher.main(Maven33Launcher.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(Delegati

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3571

2017-05-01 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2808: Copy CloudObject to the Dataflow Module

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2808


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Copy CloudObject to the Dataflow Module

2017-05-01 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 7f50ea2e5 -> 3afd338fd


Copy CloudObject to the Dataflow Module

Once migrated on the Dataflow worker, these classes can be removed from
the sdk.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/79b364f7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/79b364f7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/79b364f7

Branch: refs/heads/master
Commit: 79b364f7df6122e438a4ee0f12b5cdc7cb694d91
Parents: 7f50ea2
Author: Thomas Groh 
Authored: Mon May 1 14:31:17 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 18:02:23 2017 -0700

--
 .../runners/dataflow/util/CloudKnownType.java   | 138 ++
 .../beam/runners/dataflow/util/CloudObject.java | 185 +
 .../beam/runners/dataflow/util/Serializer.java  | 262 +++
 .../apache/beam/sdk/util/CloudKnownType.java|   7 +-
 .../org/apache/beam/sdk/util/CloudObject.java   |   3 +
 .../org/apache/beam/sdk/util/CoderUtils.java|  15 +-
 .../org/apache/beam/sdk/util/Serializer.java|   3 +
 .../apache/beam/sdk/util/CoderUtilsTest.java| 104 
 8 files changed, 600 insertions(+), 117 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/79b364f7/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudKnownType.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudKnownType.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudKnownType.java
new file mode 100644
index 000..ce23a1b
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudKnownType.java
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.util;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+/** A utility for manipulating well-known cloud types. */
+enum CloudKnownType {
+  TEXT("http://schema.org/Text", String.class) {
+@Override
+public <T> T parse(Object value, Class<T> clazz) {
+  return clazz.cast(value);
+}
+  },
+  BOOLEAN("http://schema.org/Boolean", Boolean.class) {
+@Override
+public <T> T parse(Object value, Class<T> clazz) {
+  return clazz.cast(value);
+}
+  },
+  INTEGER("http://schema.org/Integer", Long.class, Integer.class) {
+@Override
+public <T> T parse(Object value, Class<T> clazz) {
+  Object result = null;
+  if (value.getClass() == clazz) {
+result = value;
+  } else if (clazz == Long.class) {
+if (value instanceof Integer) {
+  result = ((Integer) value).longValue();
+} else if (value instanceof String) {
+  result = Long.valueOf((String) value);
+}
+  } else if (clazz == Integer.class) {
+if (value instanceof Long) {
+  result = ((Long) value).intValue();
+} else if (value instanceof String) {
+  result = Integer.valueOf((String) value);
+}
+  }
+  return clazz.cast(result);
+}
+  },
+  FLOAT("http://schema.org/Float", Double.class, Float.class) {
+@Override
+public <T> T parse(Object value, Class<T> clazz) {
+  Object result = null;
+  if (value.getClass() == clazz) {
+result = value;
+  } else if (clazz == Double.class) {
+if (value instanceof Float) {
+  result = ((Float) value).doubleValue();
+} else if (value instanceof String) {
+  result = Double.valueOf((String) value);
+}
+  } else if (clazz == Float.class) {
+if (value instanceof Double) {
+  result = ((Double) value).floatValue();
+} else if (value instanceof String) {
+  result = Float.valueOf((String) value);
+}
+  }
+  return clazz.cast(result);
+}
+  };
+
+  private final String u
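The coercion logic quoted above (the Java enum is truncated by the mail archive) can be sketched conceptually in Python. The dictionary and function names below are illustrative, not part of the Beam SDK, and since Python collapses Java's Long/Integer and Double/Float distinctions, the width-conversion branches of the Java `parse()` overrides disappear:

```python
# Hypothetical Python analogue of the CloudKnownType enum quoted above:
# map schema.org type URIs to a canonical Python type, and coerce values
# (including string-encoded numbers) the way the Java parse() overrides do.
CLOUD_KNOWN_TYPES = {
    'http://schema.org/Text': str,
    'http://schema.org/Boolean': bool,
    'http://schema.org/Integer': int,
    'http://schema.org/Float': float,
}


def parse(type_uri, value):
    """Coerce value to the canonical type registered for type_uri."""
    target = CLOUD_KNOWN_TYPES[type_uri]
    if isinstance(value, target):
        return value
    # Fall back to the type's constructor, e.g. int('42') or float(3).
    return target(value)


parse('http://schema.org/Integer', '42')  # 42
parse('http://schema.org/Float', 3)       # 3.0
```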

[2/2] beam git commit: [BEAM-2020] Copy CloudObject to the Dataflow Module

2017-05-01 Thread lcwik
[BEAM-2020] Copy CloudObject to the Dataflow Module

This closes #2808


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3afd338f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3afd338f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3afd338f

Branch: refs/heads/master
Commit: 3afd338fd928b4bbd43c29cd75c48e61aabb2118
Parents: 7f50ea2 79b364f
Author: Luke Cwik 
Authored: Mon May 1 18:03:28 2017 -0700
Committer: Luke Cwik 
Committed: Mon May 1 18:03:28 2017 -0700

--
 .../runners/dataflow/util/CloudKnownType.java   | 138 ++
 .../beam/runners/dataflow/util/CloudObject.java | 185 +
 .../beam/runners/dataflow/util/Serializer.java  | 262 +++
 .../apache/beam/sdk/util/CloudKnownType.java|   7 +-
 .../org/apache/beam/sdk/util/CloudObject.java   |   3 +
 .../org/apache/beam/sdk/util/CoderUtils.java|  15 +-
 .../org/apache/beam/sdk/util/Serializer.java|   3 +
 .../apache/beam/sdk/util/CoderUtilsTest.java| 104 
 8 files changed, 600 insertions(+), 117 deletions(-)
--




[jira] [Created] (BEAM-2135) Rename hdfs module to hadoop-file-system, rename gcp-core to google-cloud-platform-core

2017-05-01 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2135:
---

 Summary: Rename hdfs module to hadoop-file-system, rename gcp-core 
to google-cloud-platform-core
 Key: BEAM-2135
 URL: https://issues.apache.org/jira/browse/BEAM-2135
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions, sdk-java-gcp
Affects Versions: First stable release
Reporter: Luke Cwik
Assignee: Luke Cwik


Rename hdfs module to hadoop-file-system, rename gcp-core to 
google-cloud-platform-core

Similarly rename directories as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3570

2017-05-01 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1949) Rename DoFn.Context#sideOutput to #output

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991906#comment-15991906
 ] 

ASF GitHub Bot commented on BEAM-1949:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2810


> Rename DoFn.Context#sideOutput to #output
> -
>
> Key: BEAM-1949
> URL: https://issues.apache.org/jira/browse/BEAM-1949
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
> Fix For: First stable release
>
>
> Having two methods, both named output, one which takes the "main output type" 
> and one that takes a tag to specify the type more clearly communicates the 
> actual behavior - sideOutput isn't a "special" way to output, it's the same 
> as output(T), just to a specified PCollection. This will help pipeline 
> authors understand the actual behavior of outputting to a tag, and detangle 
> it from "sideInput", which is a special way to receive input. Giving them the 
> same name means that it's not even strange to call output and provide the 
> main output type, which is what we want - it's a more specific way to output, 
> but does not have different restrictions or capabilities.
> This is also a pretty small change within the SDK - it touches about 20 
> files, and the changes are pretty automatic.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
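The behavior described in the ticket — a tagged emission is ordinary output, just routed to a specific PCollection — can be illustrated with a small self-contained sketch. This uses a stand-in TaggedOutput class and a hand-rolled router rather than the real apache_beam.pvalue/runner machinery:

```python
class TaggedOutput(object):
    """Stand-in for apache_beam.pvalue.TaggedOutput: a value paired with a tag."""
    def __init__(self, tag, value):
        self.tag = tag
        self.value = value


def split_words(element, cutoff=3):
    """DoFn-style generator: plain yields go to the main output,
    TaggedOutput yields go to the named additional output."""
    if len(element) > cutoff:
        yield TaggedOutput('above_cutoff_lengths', len(element))
    yield element


def route(outputs, main_tag='main'):
    """Group emitted values by tag, as a runner would route them to PCollections."""
    routed = {}
    for out in outputs:
        if isinstance(out, TaggedOutput):
            routed.setdefault(out.tag, []).append(out.value)
        else:
            routed.setdefault(main_tag, []).append(out)
    return routed


result = route(v for w in ['a', 'music', 'xyz'] for v in split_words(w))
# result == {'main': ['a', 'music', 'xyz'], 'above_cutoff_lengths': [5]}
```

Both paths use the same yield mechanism; the tag is the only difference — which is exactly the point of renaming sideOutput to output.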


[GitHub] beam pull request #2810: [BEAM-1949] Rename OutputValue to TaggedOutput.

2017-05-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2810




[1/2] beam git commit: Rename OutputValue to TaggedOutput.

2017-05-01 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master 7c425b097 -> 7f50ea2e5


Rename OutputValue to TaggedOutput.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9d3aebea
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9d3aebea
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9d3aebea

Branch: refs/heads/master
Commit: 9d3aebeab7891bebdbf13c7567478c6b5fe9b3f4
Parents: 7c425b0
Author: Robert Bradshaw 
Authored: Mon May 1 18:15:32 2017 -0500
Committer: Robert Bradshaw 
Committed: Mon May 1 17:45:48 2017 -0700

--
 .../examples/cookbook/multiple_output_pardo.py  |  6 +++---
 .../apache_beam/examples/snippets/snippets_test.py  |  6 +++---
 sdks/python/apache_beam/pvalue.py   |  6 +++---
 sdks/python/apache_beam/runners/common.py   | 10 +-
 .../direct/consumer_tracking_pipeline_visitor_test.py   |  2 +-
 sdks/python/apache_beam/transforms/core.py  |  4 ++--
 sdks/python/apache_beam/transforms/ptransform_test.py   | 12 ++--
 sdks/python/apache_beam/typehints/typecheck.py  |  4 ++--
 8 files changed, 25 insertions(+), 25 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9d3aebea/sdks/python/apache_beam/examples/cookbook/multiple_output_pardo.py
--
diff --git a/sdks/python/apache_beam/examples/cookbook/multiple_output_pardo.py 
b/sdks/python/apache_beam/examples/cookbook/multiple_output_pardo.py
index b324ed1..9c82df4 100644
--- a/sdks/python/apache_beam/examples/cookbook/multiple_output_pardo.py
+++ b/sdks/python/apache_beam/examples/cookbook/multiple_output_pardo.py
@@ -97,15 +97,15 @@ class SplitLinesToWordsFn(beam.DoFn):
 """
 # yield a count (integer) to the OUTPUT_TAG_CHARACTER_COUNT tagged
 # collection.
-yield pvalue.OutputValue(self.OUTPUT_TAG_CHARACTER_COUNT,
- len(element))
+yield pvalue.TaggedOutput(
+self.OUTPUT_TAG_CHARACTER_COUNT, len(element))
 
 words = re.findall(r'[A-Za-z\']+', element)
 for word in words:
   if len(word) <= 3:
 # yield word as an output to the OUTPUT_TAG_SHORT_WORDS tagged
 # collection.
-yield pvalue.OutputValue(self.OUTPUT_TAG_SHORT_WORDS, word)
+yield pvalue.TaggedOutput(self.OUTPUT_TAG_SHORT_WORDS, word)
   else:
 # yield word to add it to the main collection.
 yield word

http://git-wip-us.apache.org/repos/asf/beam/blob/9d3aebea/sdks/python/apache_beam/examples/snippets/snippets_test.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets_test.py 
b/sdks/python/apache_beam/examples/snippets/snippets_test.py
index a3cdb24..afd7918 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets_test.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets_test.py
@@ -186,11 +186,11 @@ class ParDoTest(unittest.TestCase):
   yield element
 else:
   # Emit this word's long length to the 'above_cutoff_lengths' output.
-  yield pvalue.OutputValue(
+  yield pvalue.TaggedOutput(
   'above_cutoff_lengths', len(element))
 if element.startswith(marker):
   # Emit this word to a different output with the 'marked strings' tag.
-  yield pvalue.OutputValue('marked strings', element)
+  yield pvalue.TaggedOutput('marked strings', element)
 # [END model_pardo_emitting_values_on_tagged_outputs]
 
 words = ['a', 'an', 'the', 'music', 'xyz']
@@ -226,7 +226,7 @@ class ParDoTest(unittest.TestCase):
 
 # [START model_pardo_with_undeclared_outputs]
 def even_odd(x):
-  yield pvalue.OutputValue('odd' if x % 2 else 'even', x)
+  yield pvalue.TaggedOutput('odd' if x % 2 else 'even', x)
   if x % 10 == 0:
 yield x
 

http://git-wip-us.apache.org/repos/asf/beam/blob/9d3aebea/sdks/python/apache_beam/pvalue.py
--
diff --git a/sdks/python/apache_beam/pvalue.py 
b/sdks/python/apache_beam/pvalue.py
index d873669..2242c5a 100644
--- a/sdks/python/apache_beam/pvalue.py
+++ b/sdks/python/apache_beam/pvalue.py
@@ -230,19 +230,19 @@ class DoOutputsTuple(object):
 return pcoll
 
 
-class OutputValue(object):
+class TaggedOutput(object):
   """An object representing a tagged value.
 
   ParDo, Map, and FlatMap transforms can emit values on multiple outputs which
   are distinguished by string tags. The DoFn will return plain values
-  if it wants to emit on the main output and OutputValue objects
+  if it wants to emit on the main output and TaggedOutput objects
   if it wants to emit a value on a specific tagged output.
   """
 
   def __init__
