[jira] [Commented] (BEAM-1346) Drop Late Data in ReduceFnRunner

2017-02-13 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865303#comment-15865303
 ] 

Aljoscha Krettek commented on BEAM-1346:


Ah dammit, you're right, I forgot about BEAM-696. In fact the Flink Runner also 
[disables that 
optimization|https://github.com/apache/beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java#L850].

> Drop Late Data in ReduceFnRunner
> 
>
> Key: BEAM-1346
> URL: https://issues.apache.org/jira/browse/BEAM-1346
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 0.5.0
>Reporter: Aljoscha Krettek
>
> I think these two commits recently broke late-data dropping for the Flink 
> Runner (and maybe for other runners as well):
> - https://github.com/apache/beam/commit/2b26ec8
> - https://github.com/apache/beam/commit/8989473
> It boils down to the {{LateDataDroppingDoFnRunner}} not being used anymore  
> because {{DoFnRunners.lateDataDroppingRunner()}} is not called anymore when a 
> {{DoFn}} is a {{ReduceFnExecutor}} (because that interface was removed).
> Maybe we should think about dropping late data in another place, my 
> suggestion is {{ReduceFnRunner}} but that's open for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: beam_PostCommit_Java_RunnableOnService_Dataflow #2288

2017-02-13 Thread Apache Jenkins Server
See 


Changes:

[jesse] Changed name of ToString.of() to ToString.elements().

[klk] Correct Javadoc on accessing windows in DoFn

--
[...truncated 25886 lines...]
[INFO] 2017-02-14T07:03:15.733Z: (e87f0e1cc9ed): Unzipping flatten 
s15-u22-u27 for input s17-reify-value1-c25
[INFO] 2017-02-14T07:03:15.735Z: (e87f0e1cc2ef): Fusing unzipped copy of 
PAssert$271/GroupGlobally/GroupDummyAndContents/Write, through flatten , into 
producer PAssert$271/GroupGlobally/GroupDummyAndContents/Reify
[INFO] 2017-02-14T07:03:15.738Z: (e87f0e1ccbf1): Fusing consumer 
PAssert$271/GroupGlobally/GroupDummyAndContents/GroupByWindow into 
PAssert$271/GroupGlobally/GroupDummyAndContents/Read
[INFO] 2017-02-14T07:03:15.740Z: (e87f0e1cc4f3): Fusing consumer 
PAssert$271/GroupGlobally/ParDo(Concat) into 
PAssert$271/GroupGlobally/Values/Values/Map
[INFO] 2017-02-14T07:03:15.742Z: (e87f0e1ccdf5): Fusing consumer 
PAssert$271/GetPane/Map into PAssert$271/GroupGlobally/ParDo(Concat)
[INFO] 2017-02-14T07:03:15.744Z: (e87f0e1cc6f7): Fusing consumer 
PAssert$271/GroupGlobally/Values/Values/Map into 
PAssert$271/GroupGlobally/GroupDummyAndContents/GroupByWindow
[INFO] 2017-02-14T07:03:15.747Z: (e87f0e1ccff9): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/Write into 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/Reify
[INFO] 2017-02-14T07:03:15.749Z: (e87f0e1cc8fb): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/ParDo(ReifyTimestampsAndWindows) 
into PAssert$271/GroupGlobally/Window.Into()
[INFO] 2017-02-14T07:03:15.751Z: (e87f0e1cc1fd): Fusing consumer 
PAssert$271/GroupGlobally/RemoveActualsTriggering/Identity into 
PAssert$271/GroupGlobally/KeyForDummy/AddKeys/Map
[INFO] 2017-02-14T07:03:15.753Z: (e87f0e1ccaff): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/Reify into 
PAssert$271/GroupGlobally/GatherAllOutputs/Window.Into()
[INFO] 2017-02-14T07:03:15.756Z: (e87f0e1cc401): Fusing consumer 
PAssert$271/GroupGlobally/KeyForDummy/AddKeys/Map into 
PAssert$271/GroupGlobally/RewindowActuals
[INFO] 2017-02-14T07:03:15.758Z: (e87f0e1ccd03): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/Window.Into() into 
PAssert$271/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map
[INFO] 2017-02-14T07:03:15.760Z: (e87f0e1cc605): Fusing consumer 
PAssert$271/GroupGlobally/RewindowActuals into 
PAssert$271/GroupGlobally/GatherAllOutputs/Values/Values/Map
[INFO] 2017-02-14T07:03:15.763Z: (e87f0e1ccf07): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/Values/Values/Map into 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/GroupByWindow
[INFO] 2017-02-14T07:03:15.765Z: (e87f0e1cc809): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/GroupByWindow into 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/Read
[INFO] 2017-02-14T07:03:15.768Z: (e87f0e1cc10b): Fusing consumer 
KvSwap/KvSwap/Map into Create.Values/Read(CreateSource)
[INFO] 2017-02-14T07:03:15.770Z: (e87f0e1cca0d): Fusing consumer 
PAssert$271/GroupGlobally/Window.Into() into KvSwap/KvSwap/Map
[INFO] 2017-02-14T07:03:15.772Z: (e87f0e1cc30f): Fusing consumer 
PAssert$271/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map into 
PAssert$271/GroupGlobally/GatherAllOutputs/ParDo(ReifyTimestampsAndWindows)
[INFO] 2017-02-14T07:03:15.774Z: (e87f0e1ccc11): Fusing consumer 
PAssert$271/GroupGlobally/NeverTrigger/Identity into 
PAssert$271/GroupGlobally/RemoveDummyTriggering/Identity
[INFO] 2017-02-14T07:03:15.777Z: (e87f0e1cc513): Fusing consumer 
PAssert$271/GroupGlobally/GroupDummyAndContents/Write into 
PAssert$271/GroupGlobally/GroupDummyAndContents/Reify
[INFO] 2017-02-14T07:03:15.779Z: (e87f0e1cce15): Fusing consumer 
PAssert$271/GroupGlobally/GroupDummyAndContents/Reify into 
PAssert$271/GroupGlobally/NeverTrigger/Identity
[INFO] 2017-02-14T07:03:15.781Z: (e87f0e1cc717): Fusing consumer 
PAssert$271/GroupGlobally/WindowIntoDummy into 
PAssert$271/GroupGlobally/Create.Values/Read(CreateSource)
[INFO] 2017-02-14T07:03:15.783Z: (e87f0e1cc019): Fusing consumer 
PAssert$271/GroupGlobally/RemoveDummyTriggering/Identity into 
PAssert$271/GroupGlobally/WindowIntoDummy
[INFO] 2017-02-14T07:03:15.869Z: (e87f0e1ccd63): Adding StepResource setup 
and teardown to workflow graph.
[INFO] 2017-02-14T07:03:15.917Z: S01: (b1ca253ddff49829): Executing operation 
PAssert$271/GroupGlobally/GatherAllOutputs/GroupByKey/Create
[INFO] 2017-02-14T07:03:16.123Z: (ab210bac8153e164): Starting 1 workers...
[INFO] 2017-02-14T07:03:16.147Z: S02: (b1ca253ddff492e8): Executing operation 
Create.Values/Read(CreateSource)+KvSwap/KvSwap/Map+PAssert$271/GroupGlobally/Window.Into()+PAssert$271/GroupGlobally/GatherAllOutputs/ParDo(ReifyTimestampsAndWindows)+PAssert$271/GroupGlobally/GatherAl

[jira] [Commented] (BEAM-1045) Windows OS compatibilities

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865206#comment-15865206
 ] 

ASF GitHub Bot commented on BEAM-1045:
--

Github user peihe closed the pull request at:

https://github.com/apache/beam/pull/1412


> Windows OS compatibilities
> --
>
> Key: BEAM-1045
> URL: https://issues.apache.org/jira/browse/BEAM-1045
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Pei He
>
> One known issue is "*" is not allowed in Windows OS.
> For example, Paths.get("tempDir/*") might throw when code runs in Windows OS.
> http://stackoverflow.com/questions/27522581/asterisks-in-java-path
> This affecting IOChannelFactory.resolve(), toPath(), and match().
> For match(), since it only requires support globs in the final component of a 
> path. (local) FileIOChannelFactory could do things similar as 
> GcsIOChannelFactory:
> First, list all files under the directory path (this won't contain glob, such 
> as *).
> Then, check each returned files whether it matches glob.
> In this way, glob (*) stays within Apache Beam's code.
> From match()'s javadoc:
> """
> Glob handling is dependent on the implementation.  Implementations should
>* all support globs in the final component of a path (eg /foo/bar/*.txt),
>* however they are not required to support globs in the directory paths.
> """



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1352) io/google-cloud-platform should not depend on runners/dataflow for testing

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865208#comment-15865208
 ] 

ASF GitHub Bot commented on BEAM-1352:
--

Github user peihe closed the pull request at:

https://github.com/apache/beam/pull/1864


> io/google-cloud-platform should not depend on runners/dataflow for testing
> --
>
> Key: BEAM-1352
> URL: https://issues.apache.org/jira/browse/BEAM-1352
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
>
> dataflow-runner needs to depends on io/google-cloud-platform to specialize 
> configurations.
> Currently, it is done by putting GcsUtil in the sdk.util.
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java#L189
> It is no longer possible after FileSystem refactoring, given GcsFileSystem 
> and its configuration will be in io/google-cloud-platform.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1864: [BEAM-1352] Remove dataflow-runner dependency from ...

2017-02-13 Thread peihe
Github user peihe closed the pull request at:

https://github.com/apache/beam/pull/1864


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #1412: [BEAM-1045] IOChannelFactory: removes toPath() and ...

2017-02-13 Thread peihe
Github user peihe closed the pull request at:

https://github.com/apache/beam/pull/1412


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #1448: [BEAM-1009] Update to Mockito 2 with mockito-core m...

2017-02-13 Thread peihe
Github user peihe closed the pull request at:

https://github.com/apache/beam/pull/1448


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1009) Upgrade from mockito-all 1 to mockito-core 2

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865173#comment-15865173
 ] 

ASF GitHub Bot commented on BEAM-1009:
--

Github user peihe closed the pull request at:

https://github.com/apache/beam/pull/1448


> Upgrade from mockito-all 1 to mockito-core 2
> 
>
> Key: BEAM-1009
> URL: https://issues.apache.org/jira/browse/BEAM-1009
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
>
> Mockito 2 provides useful features, and the mockito-all module is no longer 
> generated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1483) Support SetState in Flink runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1483:
-

 Summary: Support SetState in Flink runner
 Key: BEAM-1483
 URL: https://issues.apache.org/jira/browse/BEAM-1483
 Project: Beam
  Issue Type: New Feature
  Components: runner-flink
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1482) Support SetState in Spark runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1482:
-

 Summary: Support SetState in Spark runner
 Key: BEAM-1482
 URL: https://issues.apache.org/jira/browse/BEAM-1482
 Project: Beam
  Issue Type: New Feature
  Components: runner-spark
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1481) Support SetState in Gearpump runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1481:
-

 Summary: Support SetState in Gearpump runner
 Key: BEAM-1481
 URL: https://issues.apache.org/jira/browse/BEAM-1481
 Project: Beam
  Issue Type: New Feature
  Components: runner-gearpump
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1478) Support MapState in Gearpump runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1478:
-

 Summary: Support MapState in Gearpump runner
 Key: BEAM-1478
 URL: https://issues.apache.org/jira/browse/BEAM-1478
 Project: Beam
  Issue Type: New Feature
  Components: runner-gearpump
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1479) Support SetState in Dataflow runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1479:
-

 Summary: Support SetState in Dataflow runner
 Key: BEAM-1479
 URL: https://issues.apache.org/jira/browse/BEAM-1479
 Project: Beam
  Issue Type: New Feature
  Components: runner-dataflow
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1480) Support SetState in Apex runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1480:
-

 Summary: Support SetState in Apex runner
 Key: BEAM-1480
 URL: https://issues.apache.org/jira/browse/BEAM-1480
 Project: Beam
  Issue Type: New Feature
  Components: runner-apex
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1476) Support MapState in Flink runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1476:
-

 Summary: Support MapState in Flink runner
 Key: BEAM-1476
 URL: https://issues.apache.org/jira/browse/BEAM-1476
 Project: Beam
  Issue Type: New Feature
  Components: runner-flink
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1477) Support MapState in Spark runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1477:
-

 Summary: Support MapState in Spark runner
 Key: BEAM-1477
 URL: https://issues.apache.org/jira/browse/BEAM-1477
 Project: Beam
  Issue Type: New Feature
  Components: runner-spark
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1475) Support MapState in Apex runner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1475:
-

 Summary: Support MapState in Apex runner
 Key: BEAM-1475
 URL: https://issues.apache.org/jira/browse/BEAM-1475
 Project: Beam
  Issue Type: New Feature
  Components: runner-apex
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1474) Support MapState in DataflowRunner

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1474:
-

 Summary: Support MapState in DataflowRunner
 Key: BEAM-1474
 URL: https://issues.apache.org/jira/browse/BEAM-1474
 Project: Beam
  Issue Type: New Feature
  Components: runner-dataflow
Reporter: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-27) Add user-ready API for interacting with timers

2017-02-13 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-27.
-
   Resolution: Implemented
Fix Version/s: 0.6.0

This is supported in enough runners that we should track only limitations and 
bugs going forward.

> Add user-ready API for interacting with timers
> --
>
> Key: BEAM-27
> URL: https://issues.apache.org/jira/browse/BEAM-27
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 0.6.0
>
>
> Pipeline authors will benefit from a different factorization of interaction 
> with underlying timers. The current APIs are targeted at runner implementers.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-25) Add user-ready API for interacting with state

2017-02-13 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-25?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-25.
-
   Resolution: Implemented
Fix Version/s: 0.6.0

This is now supported by enough runners that we should track only follow-up 
tickets for limitations, not the overall ticket.

> Add user-ready API for interacting with state
> -
>
> Key: BEAM-25
> URL: https://issues.apache.org/jira/browse/BEAM-25
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: State
> Fix For: 0.6.0
>
>
> Our current state API is targeted at runner implementers, not pipeline 
> authors. As such it has many capabilities that are not necessary nor 
> desirable for simple use cases of stateful ParDo (such as dynamic state tag 
> creation). Implement a simple state intended for user access.
> (Details of our current thoughts in forthcoming design doc)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-59) IOChannelFactory rethinking/redesign

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865139#comment-15865139
 ] 

ASF GitHub Bot commented on BEAM-59:


Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1982


> IOChannelFactory rethinking/redesign
> 
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Pei He
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: [BEAM-59] Beam FileSystem: match() and its local implementation.

2017-02-13 Thread pei
Repository: beam
Updated Branches:
  refs/heads/master 49809d1d4 -> bea101a44


[BEAM-59] Beam FileSystem: match() and its local implementation.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d1648c47
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d1648c47
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d1648c47

Branch: refs/heads/master
Commit: d1648c47dd4fef00273ceb46d42d784325b3b1e8
Parents: 49809d1
Author: Pei He 
Authored: Fri Feb 10 21:53:31 2017 -0800
Committer: Pei He 
Committed: Mon Feb 13 22:08:50 2017 -0800

--
 .../java/org/apache/beam/sdk/io/FileSystem.java |  30 
 .../org/apache/beam/sdk/io/LocalFileSystem.java |  74 ++
 .../org/apache/beam/sdk/io/fs/MatchResult.java  | 125 
 .../apache/beam/sdk/io/LocalFileSystemTest.java | 148 +++
 .../beam/sdk/util/FileIOChannelFactoryTest.java |  13 +-
 .../beam/sdk/io/gcp/storage/GcsFileSystem.java  |   6 +
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  |   6 +
 7 files changed, 391 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d1648c47/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
index ecfa29b..001f596 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystem.java
@@ -24,6 +24,7 @@ import java.nio.channels.WritableByteChannel;
 import java.util.Collection;
 import java.util.List;
 import org.apache.beam.sdk.io.fs.CreateOptions;
+import org.apache.beam.sdk.io.fs.MatchResult;
 import org.apache.beam.sdk.io.fs.ResourceId;
 
 /**
@@ -35,6 +36,35 @@ import org.apache.beam.sdk.io.fs.ResourceId;
  * Clients should use {@link FileSystems} utility.
  */
 public abstract class FileSystem {
+  /**
+   * This is the entry point to convert user-provided specs to {@link 
ResourceIdT ResourceIds}.
+   * Callers should use {@link #match} to resolve users specs ambiguities 
before
+   * calling other methods.
+   *
+   * Implementation should handle the following ambiguities of a 
user-provided spec:
+   * 
+   * {@code spec} could be a glob or a uri. {@link #match} should be able 
to tell and
+   * choose efficient implementations.
+   * The user-provided {@code spec} might refer to files or directories. 
It is common that
+   * users that wish to indicate a directory will omit the trailing {@code /}, 
such as in a spec of
+   * {@code "/tmp/dir"}. The {@link FileSystem} should be able to recognize a 
directory with
+   * the trailing {@code /} omitted, but should always return a correct {@link 
ResourceIdT}
+   * (e.g., {@code "/tmp/dir/"} inside the returned {@link MatchResult}.
+   * 
+   *
+   * All {@link FileSystem} implementations should support glob in the 
final hierarchical path
+   * component of {@link ResourceIdT}. This allows SDK libraries to construct 
file system agnostic
+   * spec. {@link FileSystem FileSystems} can support additional patterns for 
user-provided specs.
+   *
+   * @return {@code List} in the same order of the input specs.
+   *
+   * @throws IllegalArgumentException if specs are invalid.
+   * @throws IOException if all specs failed to match due to issues like:
+   * network connection, authorization.
+   * Exception for individual spec need to be deferred until callers retrieve
+   * metadata with {@link MatchResult#metadata()}.
+   */
+  protected abstract List match(List specs) throws 
IOException;
 
   /**
* Returns a write channel for the given {@link ResourceIdT}.

http://git-wip-us.apache.org/repos/asf/beam/blob/d1648c47/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java
index 0e79c9c..fe6b643 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java
@@ -19,6 +19,11 @@ package org.apache.beam.sdk.io;
 
 import static com.google.common.base.Preconditions.checkArgument;
 
+import com.google.common.base.Predicate;
+import com.google.common.base.Predicates;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Iterables;
+import com.google.common.collect.Lists;
 import java.io.BufferedOutputStream;
 import java.io.File;
 import java.io.FileInputStream;
@@ -29,10 +34,16 @@ import java.nio.channels.ReadableByt

[2/2] beam git commit: This closes #1982

2017-02-13 Thread pei
This closes #1982


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bea101a4
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bea101a4
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bea101a4

Branch: refs/heads/master
Commit: bea101a4479645b020198dccefb2c6f51a7cc6f0
Parents: 49809d1 d1648c4
Author: Pei He 
Authored: Mon Feb 13 22:09:59 2017 -0800
Committer: Pei He 
Committed: Mon Feb 13 22:09:59 2017 -0800

--
 .../java/org/apache/beam/sdk/io/FileSystem.java |  30 
 .../org/apache/beam/sdk/io/LocalFileSystem.java |  74 ++
 .../org/apache/beam/sdk/io/fs/MatchResult.java  | 125 
 .../apache/beam/sdk/io/LocalFileSystemTest.java | 148 +++
 .../beam/sdk/util/FileIOChannelFactoryTest.java |  13 +-
 .../beam/sdk/io/gcp/storage/GcsFileSystem.java  |   6 +
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  |   6 +
 7 files changed, 391 insertions(+), 11 deletions(-)
--




[GitHub] beam pull request #1982: [BEAM-59] Beam FileSystem: match() and its local im...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1982


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #1891: Correct Javadoc on accessing windows in DoFn

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1891


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #1891: Correct Javadoc on accessing windows in DoFn

2017-02-13 Thread kenn
This closes #1891: Correct Javadoc on accessing windows in DoFn


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/49809d1d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/49809d1d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/49809d1d

Branch: refs/heads/master
Commit: 49809d1d43c409b23771238af085f9ebcd0d3cb3
Parents: 98d8834 c1c8d83
Author: Kenneth Knowles 
Authored: Mon Feb 13 20:44:21 2017 -0800
Committer: Kenneth Knowles 
Committed: Mon Feb 13 20:44:21 2017 -0800

--
 .../src/main/java/org/apache/beam/sdk/transforms/DoFn.java | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/49809d1d/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
--



[1/2] beam git commit: Correct Javadoc on accessing windows in DoFn

2017-02-13 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 98d8834af -> 49809d1d4


Correct Javadoc on accessing windows in DoFn


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c1c8d838
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c1c8d838
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c1c8d838

Branch: refs/heads/master
Commit: c1c8d8386fc035c65362b5ce36c20b38fe00f9a4
Parents: 0e6b379
Author: Ben Chambers 
Authored: Wed Feb 1 15:08:13 2017 -0800
Committer: Kenneth Knowles 
Committed: Mon Feb 13 20:44:04 2017 -0800

--
 .../src/main/java/org/apache/beam/sdk/transforms/DoFn.java | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c1c8d838/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
index a161919..1ad05bb 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
@@ -532,8 +532,10 @@ public abstract class DoFn implements 
Serializable, HasDisplayD
* href="https://s.apache.org/splittable-do-fn";>splittable {@link 
DoFn} subject to the
* separate requirements described below. Items below are assuming this 
is not a splittable
* {@link DoFn}.
-   * If one of its arguments is {@link BoundedWindow}, this argument 
corresponds to the window
-   * of the current element. If absent, a runner may perform additional 
optimizations.
+   * If one of its arguments is a subtype of {@link BoundedWindow} then it 
will
+   * be passed the window of the current element. When applied by {@link 
ParDo} the subtype
+   * of {@link BoundedWindow} must match the type of windows on the input 
{@link PCollection}.
+   * If the window is not accessed a runner may perform additional 
optimizations.
* It must return {@code void}.
* 
*



[jira] [Commented] (BEAM-1460) Change ToString Method Name

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865051#comment-15865051
 ] 

ASF GitHub Bot commented on BEAM-1460:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1975


> Change ToString Method Name
> ---
>
> Key: BEAM-1460
> URL: https://issues.apache.org/jira/browse/BEAM-1460
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 0.5.0
>Reporter: Jesse Anderson
>Assignee: Jesse Anderson
>
> Need to change ToString's of() method to elements() to comply with the naming 
> guidelines.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1975: [BEAM-1460] Changed name of ToString.of() to elemen...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1975


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Changed name of ToString.of() to ToString.elements().

2017-02-13 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master f32cb3e7a -> 98d8834af


Changed name of ToString.of() to ToString.elements().


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a49acdad
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a49acdad
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a49acdad

Branch: refs/heads/master
Commit: a49acdadac37fd52cc1184e5d29cc30b99742a35
Parents: effca63
Author: Jesse Anderson 
Authored: Fri Feb 10 12:17:21 2017 -0800
Committer: Jesse Anderson 
Committed: Fri Feb 10 12:17:21 2017 -0800

--
 .../src/main/java/org/apache/beam/sdk/transforms/ToString.java   | 4 ++--
 .../core/src/test/java/org/apache/beam/sdk/io/WriteTest.java | 2 +-
 .../test/java/org/apache/beam/sdk/transforms/ToStringTest.java   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a49acdad/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java
index d5c9784..5069a3c 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java
@@ -42,7 +42,7 @@ public final class ToString {
* element of the input {@link PCollection} to a {@link String} using the
* {@link Object#toString} method.
*/
-  public static PTransform, PCollection> of() {
+  public static PTransform, PCollection> elements() {
 return new SimpleToString();
   }
 
@@ -97,7 +97,7 @@ public final class ToString {
* Example of use:
* {@code
* PCollection longs = ...;
-   * PCollection strings = longs.apply(ToString.of());
+   * PCollection strings = longs.apply(ToString.elements());
* }
*
*

http://git-wip-us.apache.org/repos/asf/beam/blob/a49acdad/sdks/java/core/src/test/java/org/apache/beam/sdk/io/WriteTest.java
--
diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/WriteTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/WriteTest.java
index f81cc0c..846d445 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/WriteTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/WriteTest.java
@@ -297,7 +297,7 @@ public class WriteTest {
   @Test
   public void testWriteUnbounded() {
 PCollection unbounded = p.apply(CountingInput.unbounded())
-.apply(ToString.of());
+.apply(ToString.elements());
 
 TestSink sink = new TestSink();
 thrown.expect(IllegalArgumentException.class);

http://git-wip-us.apache.org/repos/asf/beam/blob/a49acdad/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ToStringTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ToStringTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ToStringTest.java
index ab984f1..d2116da 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ToStringTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ToStringTest.java
@@ -48,7 +48,7 @@ public class ToStringTest {
 Integer[] ints = {1, 2, 3, 4, 5};
 String[] strings = {"1", "2", "3", "4", "5"};
 PCollection input = p.apply(Create.of(Arrays.asList(ints)));
-PCollection output = input.apply(ToString.of());
+PCollection output = input.apply(ToString.elements());
 PAssert.that(output).containsInAnyOrder(strings);
 p.run();
   }



[2/2] beam git commit: This closes #1975: Changed name of ToString.of() to ToString.elements().

2017-02-13 Thread kenn
This closes #1975: Changed name of ToString.of() to ToString.elements().


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/98d8834a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/98d8834a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/98d8834a

Branch: refs/heads/master
Commit: 98d8834afba705470a43ebc0349f76334d49f00a
Parents: f32cb3e a49acda
Author: Kenneth Knowles 
Authored: Mon Feb 13 20:29:32 2017 -0800
Committer: Kenneth Knowles 
Committed: Mon Feb 13 20:29:32 2017 -0800

--
 .../src/main/java/org/apache/beam/sdk/transforms/ToString.java   | 4 ++--
 .../core/src/test/java/org/apache/beam/sdk/io/WriteTest.java | 2 +-
 .../test/java/org/apache/beam/sdk/transforms/ToStringTest.java   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
--




Jenkins build is back to stable : beam_PostCommit_Java_RunnableOnService_Dataflow #2285

2017-02-13 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1468) Upgrade datastore dependency the 0.7.0

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864916#comment-15864916
 ] 

ASF GitHub Bot commented on BEAM-1468:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1989


> Upgrade datastore dependency the 0.7.0
> --
>
> Key: BEAM-1468
> URL: https://issues.apache.org/jira/browse/BEAM-1468
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1989: [BEAM-1468] Upgrading datatore dependency to 0.7.0 ...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1989


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Upgrading to datatore 0.7.0 library

2017-02-13 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 4018c835c -> f32cb3e7a


Upgrading to datatore 0.7.0 library


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d8201a93
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d8201a93
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d8201a93

Branch: refs/heads/master
Commit: d8201a93c7776dbd0e77b4f428fb14a9840dd18a
Parents: 4018c83
Author: Ahmet Altay 
Authored: Fri Feb 10 19:02:25 2017 -0800
Committer: Ahmet Altay 
Committed: Mon Feb 13 18:25:38 2017 -0800

--
 .../apache_beam/examples/cookbook/datastore_wordcount.py | 6 +++---
 sdks/python/apache_beam/examples/snippets/snippets.py| 4 ++--
 sdks/python/apache_beam/io/datastore/v1/datastoreio.py   | 2 +-
 sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py  | 4 ++--
 sdks/python/apache_beam/io/datastore/v1/fake_datastore.py| 4 ++--
 sdks/python/apache_beam/io/datastore/v1/helper.py| 6 +++---
 sdks/python/apache_beam/io/datastore/v1/helper_test.py   | 8 
 sdks/python/apache_beam/io/datastore/v1/query_splitter.py| 8 
 .../apache_beam/io/datastore/v1/query_splitter_test.py   | 6 +++---
 sdks/python/setup.py | 5 +++--
 10 files changed, 27 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d8201a93/sdks/python/apache_beam/examples/cookbook/datastore_wordcount.py
--
diff --git a/sdks/python/apache_beam/examples/cookbook/datastore_wordcount.py 
b/sdks/python/apache_beam/examples/cookbook/datastore_wordcount.py
index 282afbf..5d3bef6 100644
--- a/sdks/python/apache_beam/examples/cookbook/datastore_wordcount.py
+++ b/sdks/python/apache_beam/examples/cookbook/datastore_wordcount.py
@@ -35,7 +35,7 @@ The following options must be provided to run this pipeline 
in read-only mode:
 --project YOUR_PROJECT_ID
 --kind YOUR_DATASTORE_KIND
 --output [YOUR_LOCAL_FILE *or* gs://YOUR_OUTPUT_PATH]
---read-only
+--read_only
 ``
 
 Read-write Mode: In this mode, this example reads words from an input file,
@@ -66,8 +66,8 @@ import logging
 import re
 import uuid
 
-from google.datastore.v1 import entity_pb2
-from google.datastore.v1 import query_pb2
+from google.cloud.proto.datastore.v1 import entity_pb2
+from google.cloud.proto.datastore.v1 import query_pb2
 from googledatastore import helper as datastore_helper, PropertyFilter
 
 import apache_beam as beam

http://git-wip-us.apache.org/repos/asf/beam/blob/d8201a93/sdks/python/apache_beam/examples/snippets/snippets.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py 
b/sdks/python/apache_beam/examples/snippets/snippets.py
index 9ba46cd..6f081df 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets.py
@@ -862,8 +862,8 @@ def model_datastoreio():
   """Using a Read and Write transform to read/write to Cloud Datastore."""
 
   import uuid
-  from google.datastore.v1 import entity_pb2
-  from google.datastore.v1 import query_pb2
+  from google.cloud.proto.datastore.v1 import entity_pb2
+  from google.cloud.proto.datastore.v1 import query_pb2
   import googledatastore
   import apache_beam as beam
   from apache_beam.utils.pipeline_options import PipelineOptions

http://git-wip-us.apache.org/repos/asf/beam/blob/d8201a93/sdks/python/apache_beam/io/datastore/v1/datastoreio.py
--
diff --git a/sdks/python/apache_beam/io/datastore/v1/datastoreio.py 
b/sdks/python/apache_beam/io/datastore/v1/datastoreio.py
index 5f6663a..562d88e 100644
--- a/sdks/python/apache_beam/io/datastore/v1/datastoreio.py
+++ b/sdks/python/apache_beam/io/datastore/v1/datastoreio.py
@@ -19,7 +19,7 @@
 
 import logging
 
-from google.datastore.v1 import datastore_pb2
+from google.cloud.proto.datastore.v1 import datastore_pb2
 from googledatastore import helper as datastore_helper
 
 from apache_beam.io.datastore.v1 import helper

http://git-wip-us.apache.org/repos/asf/beam/blob/d8201a93/sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py
--
diff --git a/sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py 
b/sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py
index cdaccb1..3bd4630 100644
--- a/sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py
+++ b/sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py
@@ -17,8 +17,8 @@
 
 import unittest
 
-from google.datastore.v1 import datastore_pb2
-from google.datastore.v1 import query_pb2
+from google.cloud.proto.datastore.v1 impor

[2/2] beam git commit: This closes #1989

2017-02-13 Thread altay
This closes #1989


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f32cb3e7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f32cb3e7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f32cb3e7

Branch: refs/heads/master
Commit: f32cb3e7a26b58e9cc548f2ba42a59bdfe41a131
Parents: 4018c83 d8201a9
Author: Ahmet Altay 
Authored: Mon Feb 13 18:25:43 2017 -0800
Committer: Ahmet Altay 
Committed: Mon Feb 13 18:25:43 2017 -0800

--
 .../apache_beam/examples/cookbook/datastore_wordcount.py | 6 +++---
 sdks/python/apache_beam/examples/snippets/snippets.py| 4 ++--
 sdks/python/apache_beam/io/datastore/v1/datastoreio.py   | 2 +-
 sdks/python/apache_beam/io/datastore/v1/datastoreio_test.py  | 4 ++--
 sdks/python/apache_beam/io/datastore/v1/fake_datastore.py| 4 ++--
 sdks/python/apache_beam/io/datastore/v1/helper.py| 6 +++---
 sdks/python/apache_beam/io/datastore/v1/helper_test.py   | 8 
 sdks/python/apache_beam/io/datastore/v1/query_splitter.py| 8 
 .../apache_beam/io/datastore/v1/query_splitter_test.py   | 6 +++---
 sdks/python/setup.py | 5 +++--
 10 files changed, 27 insertions(+), 26 deletions(-)
--




Jenkins build is back to normal : beam_PostCommit_Java_MavenInstall #2626

2017-02-13 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1310) Add running integration tests for JdbcIO

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864887#comment-15864887
 ] 

ASF GitHub Bot commented on BEAM-1310:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1841


> Add running integration tests for JdbcIO
> 
>
> Key: BEAM-1310
> URL: https://issues.apache.org/jira/browse/BEAM-1310
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
>
> Jdbc IO could use some integration tests! We'd like to have them run against 
> a real list instance of postgres.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1841: BEAM-1310 Add integration tests for JdbcIO

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1841


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #1841

2017-02-13 Thread tgroh
This closes #1841


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4018c835
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4018c835
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4018c835

Branch: refs/heads/master
Commit: 4018c835c421cd6672c563b10d1686685af01079
Parents: 8e0573b b284fb4
Author: Thomas Groh 
Authored: Mon Feb 13 18:12:07 2017 -0800
Committer: Thomas Groh 
Committed: Mon Feb 13 18:12:07 2017 -0800

--
 sdks/java/io/jdbc/pom.xml   |  11 ++
 .../org/apache/beam/sdk/io/jdbc/JdbcIOIT.java   | 175 +++
 .../org/apache/beam/sdk/io/jdbc/JdbcIOTest.java | 107 +---
 .../beam/sdk/io/jdbc/JdbcTestDataSet.java   | 127 ++
 .../beam/sdk/io/jdbc/PostgresTestOptions.java   |  60 +++
 .../kubernetes/postgres-pod-no-vol.yml  |  32 
 .../kubernetes/postgres-service-public.yml  |  27 +++
 .../kubernetes/setup-postgres-service.sh|  20 +++
 8 files changed, 494 insertions(+), 65 deletions(-)
--




[1/2] beam git commit: Add JDBC postgres IT, load script and k8 script

2017-02-13 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 8e0573ba5 -> 4018c835c


Add JDBC postgres IT, load script and k8 script


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b284fb4d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b284fb4d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b284fb4d

Branch: refs/heads/master
Commit: b284fb4dedde188e302be01dd9426d09d7ef0021
Parents: 8e0573b
Author: Stephen Sisk 
Authored: Tue Jan 24 17:56:35 2017 -0800
Committer: Thomas Groh 
Committed: Mon Feb 13 18:12:01 2017 -0800

--
 sdks/java/io/jdbc/pom.xml   |  11 ++
 .../org/apache/beam/sdk/io/jdbc/JdbcIOIT.java   | 175 +++
 .../org/apache/beam/sdk/io/jdbc/JdbcIOTest.java | 107 +---
 .../beam/sdk/io/jdbc/JdbcTestDataSet.java   | 127 ++
 .../beam/sdk/io/jdbc/PostgresTestOptions.java   |  60 +++
 .../kubernetes/postgres-pod-no-vol.yml  |  32 
 .../kubernetes/postgres-service-public.yml  |  27 +++
 .../kubernetes/setup-postgres-service.sh|  20 +++
 8 files changed, 494 insertions(+), 65 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b284fb4d/sdks/java/io/jdbc/pom.xml
--
diff --git a/sdks/java/io/jdbc/pom.xml b/sdks/java/io/jdbc/pom.xml
index 92a3761..23feab6 100644
--- a/sdks/java/io/jdbc/pom.xml
+++ b/sdks/java/io/jdbc/pom.xml
@@ -74,6 +74,11 @@
   2.1.1
 
 
+
+  joda-time
+  joda-time
+
+
 
 
   com.google.auto.value
@@ -120,6 +125,12 @@
   slf4j-jdk14
   test
 
+
+  org.postgresql
+  postgresql
+  9.4.1212.jre7
+  test
+
   
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/b284fb4d/sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOIT.java
--
diff --git 
a/sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOIT.java 
b/sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOIT.java
new file mode 100644
index 000..15206c7
--- /dev/null
+++ b/sdks/java/io/jdbc/src/test/java/org/apache/beam/sdk/io/jdbc/JdbcIOIT.java
@@ -0,0 +1,175 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.jdbc;
+
+import java.sql.Connection;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.sql.Statement;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.beam.sdk.coders.BigEndianIntegerCoder;
+import org.apache.beam.sdk.coders.KvCoder;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+import org.postgresql.ds.PGSimpleDataSource;
+
+
+/**
+ * A test of {@link org.apache.beam.sdk.io.jdbc.JdbcIO} on an independent 
Postgres instance.
+ *
+ * This test requires a running instance of Postgres, and the test dataset 
must exist in the
+ * database. `JdbcTestDataSet` will create the read table.
+ *
+ * You can run just this test by doing the following:
+ * 
+ * mvn test-compile compile failsafe:integration-test -D 
beamTestPipelineOptions='[
+ * "--postgresServerName=1.2.3.4",
+ * "--postgresUsername=postgres",
+ * "--postgresDatabaseName=myfancydb",
+ * "--postgresPassword=yourpassword",
+ * "--postgresSsl=false"
+ * ]' -DskipITs=false -Dit.test=org.apache.beam.sdk.io.jdbc.JdbcIOIT 
-DfailIfNoTests=false
+ * 
+ */
+@RunWith(JUnit4.class)
+public class JdbcIOIT {
+  private static PGSimpleDataSource 

[jira] [Commented] (BEAM-59) IOChannelFactory rethinking/redesign

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864859#comment-15864859
 ] 

ASF GitHub Bot commented on BEAM-59:


GitHub user peihe opened a pull request:

https://github.com/apache/beam/pull/2002

[BEAM-59] Beam GcsFileSystem: port expand() from GcsUtil for glob matching.



Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam gcs-util-refactor-expand

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2002.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2002


commit 7cd3aed6ea9261f4624c9a387f5379ed99de619a
Author: Pei He 
Date:   2017-02-14T01:17:55Z

[BEAM-59] Beam GcsFileSystem: port expand() from GcsUtil for glob matching.




> IOChannelFactory rethinking/redesign
> 
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Pei He
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2002: [BEAM-59] Beam GcsFileSystem: port expand() from Gc...

2017-02-13 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/beam/pull/2002

[BEAM-59] Beam GcsFileSystem: port expand() from GcsUtil for glob matching.



Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam gcs-util-refactor-expand

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2002.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2002


commit 7cd3aed6ea9261f4624c9a387f5379ed99de619a
Author: Pei He 
Date:   2017-02-14T01:17:55Z

[BEAM-59] Beam GcsFileSystem: port expand() from GcsUtil for glob matching.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1471) Make IterableCoder binary compatible across SDKs

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864848#comment-15864848
 ] 

ASF GitHub Bot commented on BEAM-1471:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1996


> Make IterableCoder binary compatible across SDKs
> 
>
> Key: BEAM-1471
> URL: https://issues.apache.org/jira/browse/BEAM-1471
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Affects Versions: 0.5.0
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>
> Ensure IterableCoder across SDKs binary compatible and add tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: Add cross-sdk tests for IterableCoder

2017-02-13 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master b67bd111e -> 8e0573ba5


Add cross-sdk tests for IterableCoder


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cab5e634
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cab5e634
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cab5e634

Branch: refs/heads/master
Commit: cab5e6347f3fcece2cac4819e268488d7ce66830
Parents: b67bd11
Author: Vikas Kedigehalli 
Authored: Mon Feb 13 10:23:28 2017 -0800
Committer: Vikas Kedigehalli 
Committed: Mon Feb 13 15:35:19 2017 -0800

--
 .../org/apache/beam/fn/v1/standard_coders.yaml  | 20 
 .../apache/beam/sdk/coders/CommonCoderTest.java | 12 
 .../apache_beam/coders/standard_coders_test.py  |  2 ++
 .../apache_beam/tests/data/standard_coders.yaml | 20 
 4 files changed, 54 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/cab5e634/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
--
diff --git 
a/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
 
b/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
index 948ac6b..58a2a90 100644
--- 
a/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
+++ 
b/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
@@ -106,3 +106,23 @@ examples:
   
"\u0080\u\u0001\u0053\u0034\u00ec\u0074\u00e8\u0080\u0090\u00fb\u00d3\u0009"
 : {end: 1456881825000, span: 259200}
   "\u007f\u00df\u003b\u0064\u005a\u001c\u00ad\u0076\u00ed\u0002" : {end: 
-9223372036854410, span: 365}
   "\u0080\u0020\u00c4\u009b\u00a5\u00e3\u0053\u00f7\u" : {end: 
9223372036854775, span: 0}
+
+---
+
+coder:
+  urn: "urn:beam:coders:stream:0.1"
+  components: [{urn: "urn:beam:coders:varint:0.1"}]
+examples:
+  "\0\0\0\u0001\0": [0]
+  "\0\0\0\u0004\u0001\n\u00c8\u0001\u00e8\u0007": [1, 10, 200, 1000]
+  "\0\0\0\0": []
+
+---
+
+coder:
+  urn: "urn:beam:coders:stream:0.1"
+  components: [{urn: "urn:beam:coders:bytes:0.1"}]
+examples:
+  "\0\0\0\u0001\u0003abc": ["abc"]
+  "\0\0\0\u0002\u0004ab\0c\u0004de\0f": ["ab\0c", "de\0f"]
+  "\0\0\0\0": []
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/beam/blob/cab5e634/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CommonCoderTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CommonCoderTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CommonCoderTest.java
index ad5d9c3..7eafbe2 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CommonCoderTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CommonCoderTest.java
@@ -70,6 +70,7 @@ public class CommonCoderTest {
   .put("urn:beam:coders:kv:0.1", KvCoder.class)
   .put("urn:beam:coders:varint:0.1", VarLongCoder.class)
   .put("urn:beam:coders:intervalwindow:0.1", IntervalWindowCoder.class)
+  .put("urn:beam:coders:stream:0.1", IterableCoder.class)
   .build();
 
   @AutoValue
@@ -198,6 +199,15 @@ public class CommonCoderTest {
 Duration span = Duration.millis(((Number) 
kvMap.get("span")).longValue());
 return new IntervalWindow(end.minus(span), span);
   }
+  case "urn:beam:coders:stream:0.1":
+Coder elementCoder = ((IterableCoder) coder).getElemCoder();
+List elements = (List) value;
+List convertedElements = new LinkedList<>();
+for (Object element : elements) {
+  convertedElements.add(
+  convertValue(element, coderSpec.getComponents().get(0), 
elementCoder));
+}
+return convertedElements;
   default:
 throw new IllegalStateException("Unknown coder URN: " + 
coderSpec.getUrn());
 }
@@ -217,6 +227,8 @@ public class CommonCoderTest {
 return VarLongCoder.of();
   case "urn:beam:coders:intervalwindow:0.1":
 return IntervalWindowCoder.of();
+  case "urn:beam:coders:stream:0.1":
+return IterableCoder.of(components.get(0));
   default:
 throw new IllegalStateException("Unknown coder URN: " + 
coder.getUrn());
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/cab5e634/sdks/python/apache_beam/coders/standard_coders_test.py
--
diff --git a/sdks/python/apache_beam/coders/standard_coders_test.py 
b/sdks/python/apache_beam/coders/standard_coders_test.py
index e66ec7b..d4179eb 100644
--- a/sdks/python/apache_beam/coders/standard_coders_test.py
+++ b/sdks/python/apache_beam/coders/standard_coders_test.py
@@ -38,

[2/2] beam git commit: This closes #1996

2017-02-13 Thread dhalperi
This closes #1996


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8e0573ba
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8e0573ba
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8e0573ba

Branch: refs/heads/master
Commit: 8e0573ba53918b1c4403a6f48a8e983f12d53270
Parents: b67bd11 cab5e63
Author: Dan Halperin 
Authored: Mon Feb 13 17:39:22 2017 -0800
Committer: Dan Halperin 
Committed: Mon Feb 13 17:39:22 2017 -0800

--
 .../org/apache/beam/fn/v1/standard_coders.yaml  | 20 
 .../apache/beam/sdk/coders/CommonCoderTest.java | 12 
 .../apache_beam/coders/standard_coders_test.py  |  2 ++
 .../apache_beam/tests/data/standard_coders.yaml | 20 
 4 files changed, 54 insertions(+)
--




[GitHub] beam pull request #1996: [BEAM-1471]: Add cross-sdk tests for IterableCoder

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1996


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1473) Remove unused windmill proto files from python sdk

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864831#comment-15864831
 ] 

ASF GitHub Bot commented on BEAM-1473:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2001

[BEAM-1473] Remove unused windmill proto from python sdk

R: @charlesccychen PTAL

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-1473-remove-windmill-proto

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2001.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2001






> Remove unused windmill proto files from python sdk
> --
>
> Key: BEAM-1473
> URL: https://issues.apache.org/jira/browse/BEAM-1473
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Affects Versions: Not applicable
>Reporter: Sourabh Bajaj
>Assignee: Sourabh Bajaj
>Priority: Minor
> Fix For: Not applicable
>
>
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/internal/windmill_pb2.py
>  
> There are two unused windmill files in beam that should be cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2001: [BEAM-1473] Remove unused windmill proto from pytho...

2017-02-13 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2001

[BEAM-1473] Remove unused windmill proto from python sdk

R: @charlesccychen PTAL

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-1473-remove-windmill-proto

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2001.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2001






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-1473) Remove unused windmill proto files from python sdk

2017-02-13 Thread Sourabh Bajaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Bajaj updated BEAM-1473:

Description: 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/internal/windmill_pb2.py
 

There are two unused windmill files in beam that should be cleaned.

> Remove unused windmill proto files from python sdk
> --
>
> Key: BEAM-1473
> URL: https://issues.apache.org/jira/browse/BEAM-1473
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Affects Versions: Not applicable
>Reporter: Sourabh Bajaj
>Assignee: Sourabh Bajaj
>Priority: Minor
> Fix For: Not applicable
>
>
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/internal/windmill_pb2.py
>  
> There are two unused windmill files in beam that should be cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1473) Remove unused windmill proto files from python sdk

2017-02-13 Thread Sourabh Bajaj (JIRA)
Sourabh Bajaj created BEAM-1473:
---

 Summary: Remove unused windmill proto files from python sdk
 Key: BEAM-1473
 URL: https://issues.apache.org/jira/browse/BEAM-1473
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py
Affects Versions: Not applicable
Reporter: Sourabh Bajaj
Assignee: Sourabh Bajaj
Priority: Minor
 Fix For: Not applicable






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-1464) Make sure all tests use the TestPipeline class

2017-02-13 Thread Sourabh Bajaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Bajaj resolved BEAM-1464.
-
Resolution: Fixed

> Make sure all tests use the TestPipeline class
> --
>
> Key: BEAM-1464
> URL: https://issues.apache.org/jira/browse/BEAM-1464
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Affects Versions: Not applicable
>Reporter: Sourabh Bajaj
>Assignee: Sourabh Bajaj
>Priority: Minor
> Fix For: Not applicable
>
>
> Some tests still specify using the DirectRunner. They should all use the 
> TestRunner in those cases.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1472) Use cross-language serialization schema for triggers in Python SDK

2017-02-13 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1472:
-

 Summary: Use cross-language serialization schema for triggers in 
Python SDK
 Key: BEAM-1472
 URL: https://issues.apache.org/jira/browse/BEAM-1472
 Project: Beam
  Issue Type: New Feature
  Components: sdk-py
Reporter: Kenneth Knowles
Assignee: Vikas Kedigehalli


Basically, we need to do something like 
https://github.com/apache/beam/pull/1988/files for Python.

This is key for non-Python runners being able to understand trigger specs from 
Python.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1461) duplication with StartBundle and prepareForProcessing in DoFn

2017-02-13 Thread Thomas Groh (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864666#comment-15864666
 ] 

Thomas Groh commented on BEAM-1461:
---

Sure;

{{prepareForProcessing}} exists purely to support Aggregators - it sets up some 
finalization details within the base DoFn class. However, we plan on removing 
Aggregators, tracked in [BEAM-775]. Once that's done, we should remove the 
parts of the SDK that exist to support Aggregators, which includes 
{{prepareForProcessing}}. We should signal this now, and I believe making the 
method final and deprecated is the most effective way to signal that it's an 
implementation detail of the DoFn internals rather than the user-visible 
processing method.

Additionally, overriding {{prepareForProcessing}} could lead to a lack of 
precondition enforcement within DoFn, so it is not generally safe to override.

On an additional note, we probably would have missed the naming duplication in 
prepareForProcessing as well as the fact that it's actually an Aggregator 
method without this Jira, so thank you for posting it.

> duplication with StartBundle and prepareForProcessing in DoFn
> -
>
> Key: BEAM-1461
> URL: https://issues.apache.org/jira/browse/BEAM-1461
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Xu Mingmin
>Assignee: Davor Bonaci
>
> There're one annotation `StartBundle`, and one public function 
> `prepareForProcessing` in DoFn, which are called both before 
> `ProcessElement`. It's confused which one should be implemented in a subclass.
> The call sequence seems as:
> prepareForProcessing -> StartBundle -> processElement



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1467) Use well-known coder types for known window coders

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864648#comment-15864648
 ] 

ASF GitHub Bot commented on BEAM-1467:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1984


> Use well-known coder types for known window coders
> --
>
> Key: BEAM-1467
> URL: https://issues.apache.org/jira/browse/BEAM-1467
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-fn-api, beam-model-runner-api
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>Priority: Minor
>
> Known window types include:
> * GlobalWindow
> * IntervalWindow
> Standardizing the name and encodings of these windows will enable many more 
> pipelines to work across the Fn API with low overhead.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1984: [BEAM-1467] Add cross-SDK implementations and tests...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1984


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #1984

2017-02-13 Thread dhalperi
This closes #1984


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b67bd111
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b67bd111
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b67bd111

Branch: refs/heads/master
Commit: b67bd111e86e5e4f57fb8f1eddc790d7399209b2
Parents: a628ce3 ac7c471
Author: Dan Halperin 
Authored: Mon Feb 13 15:17:24 2017 -0800
Committer: Dan Halperin 
Committed: Mon Feb 13 15:17:24 2017 -0800

--
 .../org/apache/beam/fn/v1/standard_coders.yaml  | 10 +++
 .../transforms/windowing/IntervalWindow.java| 15 ++
 .../org/apache/beam/sdk/util/CoderUtils.java|  2 ++
 .../apache/beam/sdk/coders/CommonCoderTest.java | 31 
 sdks/python/apache_beam/coders/coder_impl.py| 31 +++-
 sdks/python/apache_beam/coders/coders.py| 15 ++
 .../apache_beam/coders/coders_test_common.py|  9 +-
 sdks/python/apache_beam/coders/slow_stream.py   |  6 
 .../apache_beam/coders/standard_coders_test.py  | 11 +--
 .../apache_beam/tests/data/standard_coders.yaml | 10 +++
 10 files changed, 130 insertions(+), 10 deletions(-)
--




[1/2] beam git commit: Add cross-SDK implementations and tests of IntervalWindowCoder

2017-02-13 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master a628ce353 -> b67bd111e


Add cross-SDK implementations and tests of IntervalWindowCoder


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ac7c4714
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ac7c4714
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ac7c4714

Branch: refs/heads/master
Commit: ac7c471473510e4f9a9281447a99ceb9552acd17
Parents: a628ce3
Author: Dan Halperin 
Authored: Fri Feb 10 11:56:00 2017 -0800
Committer: Dan Halperin 
Committed: Mon Feb 13 15:17:21 2017 -0800

--
 .../org/apache/beam/fn/v1/standard_coders.yaml  | 10 +++
 .../transforms/windowing/IntervalWindow.java| 15 ++
 .../org/apache/beam/sdk/util/CoderUtils.java|  2 ++
 .../apache/beam/sdk/coders/CommonCoderTest.java | 31 
 sdks/python/apache_beam/coders/coder_impl.py| 31 +++-
 sdks/python/apache_beam/coders/coders.py| 15 ++
 .../apache_beam/coders/coders_test_common.py|  9 +-
 sdks/python/apache_beam/coders/slow_stream.py   |  6 
 .../apache_beam/coders/standard_coders_test.py  | 11 +--
 .../apache_beam/tests/data/standard_coders.yaml | 10 +++
 10 files changed, 130 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ac7c4714/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
--
diff --git 
a/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
 
b/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
index afa92e9..948ac6b 100644
--- 
a/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
+++ 
b/sdks/common/fn-api/src/test/resources/org/apache/beam/fn/v1/standard_coders.yaml
@@ -96,3 +96,13 @@ nested: true
 examples:
   "\u0003abc\u0003def": {key: abc, value: def}
   "\u0004ab\0c\u0004de\0f": {key: "ab\0c", value: "de\0f"}
+
+---
+
+coder:
+  urn: "urn:beam:coders:intervalwindow:0.1"
+examples:
+  "\u0080\u\u0001\u0052\u009a\u00a4\u009b\u0068\u0080\u00dd\u00db\u0001" : 
{end: 1454293425000, span: 360}
+  
"\u0080\u\u0001\u0053\u0034\u00ec\u0074\u00e8\u0080\u0090\u00fb\u00d3\u0009"
 : {end: 1456881825000, span: 259200}
+  "\u007f\u00df\u003b\u0064\u005a\u001c\u00ad\u0076\u00ed\u0002" : {end: 
-9223372036854410, span: 365}
+  "\u0080\u0020\u00c4\u009b\u00a5\u00e3\u0053\u00f7\u" : {end: 
9223372036854775, span: 0}

http://git-wip-us.apache.org/repos/asf/beam/blob/ac7c4714/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/IntervalWindow.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/IntervalWindow.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/IntervalWindow.java
index fb0fc11..c0ad2c0 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/IntervalWindow.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/IntervalWindow.java
@@ -26,6 +26,7 @@ import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.CoderException;
 import org.apache.beam.sdk.coders.DurationCoder;
 import org.apache.beam.sdk.coders.InstantCoder;
+import org.apache.beam.sdk.util.CloudObject;
 import org.joda.time.Duration;
 import org.joda.time.Instant;
 import org.joda.time.ReadableDuration;
@@ -166,10 +167,9 @@ public class IntervalWindow extends BoundedWindow
   /**
* Encodes an {@link IntervalWindow} as a pair of its upper bound and 
duration.
*/
-  private static class IntervalWindowCoder extends AtomicCoder 
{
+  public static class IntervalWindowCoder extends AtomicCoder {
 
-private static final IntervalWindowCoder INSTANCE =
-new IntervalWindowCoder();
+private static final IntervalWindowCoder INSTANCE = new 
IntervalWindowCoder();
 
 private static final Coder instantCoder = InstantCoder.of();
 private static final Coder durationCoder = 
DurationCoder.of();
@@ -180,9 +180,7 @@ public class IntervalWindow extends BoundedWindow
 }
 
 @Override
-public void encode(IntervalWindow window,
-   OutputStream outStream,
-   Context context)
+public void encode(IntervalWindow window, OutputStream outStream, Context 
context)
 throws IOException, CoderException {
   instantCoder.encode(window.end, outStream, context.nested());
   durationCoder.encode(new Duration(window.start, window.end), outStream, 
context);
@@ -195,5 +193,10 @@ public class IntervalWindow extends BoundedWindow
   ReadableDuration duration = durationCoder.decode(i

[GitHub] beam pull request #1999: Upgrade google-api-services-dataflow

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1999


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Upgrade google-api-services-dataflow to v1b3-rev186-1.22.0

2017-02-13 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 2c0cffaf7 -> a628ce353


Upgrade google-api-services-dataflow to v1b3-rev186-1.22.0


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cad43544
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cad43544
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cad43544

Branch: refs/heads/master
Commit: cad4354401ac8e43b216adc900b01429f659a5ec
Parents: 2c0cffa
Author: Eric Roshan-Eisner 
Authored: Mon Feb 13 13:38:37 2017 -0800
Committer: Eric Roshan-Eisner 
Committed: Mon Feb 13 13:38:37 2017 -0800

--
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/cad43544/pom.xml
--
diff --git a/pom.xml b/pom.xml
index f4e458e..be75659 100644
--- a/pom.xml
+++ b/pom.xml
@@ -108,7 +108,7 @@
 v1-rev6-1.22.0
 0.1.0
 v2-rev8-1.22.0
-v1b3-rev43-1.22.0
+v1b3-rev186-1.22.0
 0.5.160222
 1.4.0
 1.3.0



[2/2] beam git commit: This closes #1999

2017-02-13 Thread dhalperi
This closes #1999


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a628ce35
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a628ce35
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a628ce35

Branch: refs/heads/master
Commit: a628ce35388c36dac6f96f8069542d9068d31035
Parents: 2c0cffa cad4354
Author: Dan Halperin 
Authored: Mon Feb 13 14:49:05 2017 -0800
Committer: Dan Halperin 
Committed: Mon Feb 13 14:49:05 2017 -0800

--
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Dataflow #2283

2017-02-13 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1457) Enable rat plugin and findbugs plugin in default build

2017-02-13 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864609#comment-15864609
 ] 

Davor Bonaci commented on BEAM-1457:


I'd keep this issue open for a little bit. I'd probably not make a move on it 
just yet, and see how things progress build-wise from this point on. Now, with 
Python in, build times are off quite a bit, and impact therefore is a bit 
muted. Let's sit on it and reevaluate shortly.

> Enable rat plugin and findbugs plugin in default build
> --
>
> Key: BEAM-1457
> URL: https://issues.apache.org/jira/browse/BEAM-1457
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Aviem Zur
>Assignee: Davor Bonaci
>
> Today, maven rat plugin and findbugs plugin only run when `release` profile 
> is specified.
> Since these plugins do not add a large amount of time compared to the normal 
> build, and their checks are required to pass to approve pull requests - let's 
> enable them by default.
> [Original dev list 
> discussion|https://lists.apache.org/thread.html/e1f80e54b44b4a39630d978abe79fb6a6cecf71d9821ee1881b47afb@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1461) duplication with StartBundle and prepareForProcessing in DoFn

2017-02-13 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864604#comment-15864604
 ] 

Davor Bonaci commented on BEAM-1461:


I think what [~tgroh] did make sense. [~tgroh], can you please expand?

> duplication with StartBundle and prepareForProcessing in DoFn
> -
>
> Key: BEAM-1461
> URL: https://issues.apache.org/jira/browse/BEAM-1461
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Xu Mingmin
>Assignee: Davor Bonaci
>
> There're one annotation `StartBundle`, and one public function 
> `prepareForProcessing` in DoFn, which are called both before 
> `ProcessElement`. It's confused which one should be implemented in a subclass.
> The call sequence seems as:
> prepareForProcessing -> StartBundle -> processElement



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-115) Beam Runner API

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864586#comment-15864586
 ] 

ASF GitHub Bot commented on BEAM-115:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2000

[BEAM-115] Unify Fn API and Runner API FunctionSpec

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Do not review yet - utilizing Jenkins

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam fn-api-functionspec

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2000.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2000


commit 90ed46256915d726b537454b1239cb70767b2e2d
Author: Kenneth Knowles 
Date:   2017-02-13T16:38:40Z

Unify Fn API and Runner API FunctionSpec




> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical 
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly 
> represented, so it cannot be analyzed except incrementally, and cannot 
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for 
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration 
> (batch vs streaming) prior to graph construction, since the runner will be 
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from 
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner 
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is 
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus, 
> and building an API that more simply and directly supports the technical 
> vision.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2000: [BEAM-115] Unify Fn API and Runner API FunctionSpec

2017-02-13 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2000

[BEAM-115] Unify Fn API and Runner API FunctionSpec

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Do not review yet - utilizing Jenkins

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam fn-api-functionspec

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2000.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2000


commit 90ed46256915d726b537454b1239cb70767b2e2d
Author: Kenneth Knowles 
Date:   2017-02-13T16:38:40Z

Unify Fn API and Runner API FunctionSpec




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #1999: Upgrade google-api-services-dataflow

2017-02-13 Thread edre
GitHub user edre opened a pull request:

https://github.com/apache/beam/pull/1999

Upgrade google-api-services-dataflow

to v1b3-rev186-1.22.0

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/edre/beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1999.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1999


commit cad4354401ac8e43b216adc900b01429f659a5ec
Author: Eric Roshan-Eisner 
Date:   2017-02-13T21:38:37Z

Upgrade google-api-services-dataflow to v1b3-rev186-1.22.0




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam-site pull request #152: Add blog post "Stateful Processing with Apache ...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/152


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/5] beam-site git commit: Regenerate website

2017-02-13 Thread davor
http://git-wip-us.apache.org/repos/asf/beam-site/blob/2dd05932/content/feed.xml
--
diff --git a/content/feed.xml b/content/feed.xml
index f94ee48..10bccbd 100644
--- a/content/feed.xml
+++ b/content/feed.xml
@@ -9,6 +9,580 @@
 Jekyll v3.2.0
 
   
+Stateful processing with Apache Beam
+

Beam lets you process unbounded, out-of-order, global-scale data with portable +high-level pipelines. Stateful processing is a new feature of the Beam model +that expands the capabilities of Beam, unlocking new use cases and new +efficiencies. In this post, I will guide you through stateful processing in +Beam: how it works, how it fits in with the other features of the Beam model, +what you might use it for, and what it looks like in code.

+ + + +
+

Warning: new features ahead!: This is a very new aspect of the Beam +model. Runners are still adding support. You can try it out today on multiple +runners, but do check the runner capability +matrix for +the current status in each runner.

+
+ +

First, a quick recap: In Beam, a big data processing pipeline is a directed, +acyclic graph of parallel operations called PTransforms processing data +from PCollections I’ll expand on that by walking through this illustration:

+ +

A Beam 
Pipeline - PTransforms are boxes - PCollections are arrows

+ +

The boxes are PTransforms and the edges represent the data in PCollections +flowing from one PTransform to the next. A PCollection may be bounded (which +means it is finite and you know it) or unbounded (which means you don’t know if +it is finite or not - basically, it is like an incoming stream of data that may +or may not ever terminate). The cylinders are the data sources and sinks at the +edges of your pipeline, such as bounded collections of log files or unbounded +data streaming over a Kafka topic. This blog post isn’t about sources or sinks, +but about what happens in between - your data processing.

+ +

There are two main building blocks for processing your data in Beam: ParDo, +for performing an operation in parallel across all elements, and GroupByKey +(and the closely related CombinePerKey that I will talk about quite soon) +for aggregating elements to which you have assigned the same key. In the +picture below (featured in many of our presentations) the color indicates the +key of the element. Thus the GroupByKey/CombinePerKey transform gathers all the +green squares to produce a single output element.

+ +

ParDo and GroupByKey/CombinePerKey:  Elementwise versus 
aggregating computations

+ +

But not all use cases are easily expressed as pipelines of simple ParDo/Map and +GroupByKey/CombinePerKey transforms. The topic of this blog post is a new +extension to the Beam programming model: per-element operation augmented with +mutable state.

+ +

Stateful ParDo - sequential per-key processing with persistent 
state

+ +

In the illustration above, ParDo now has a bit of durable, consistent state on +the side, which can be read and written during the processing of each element. +The state is partitioned by key, so it is drawn as having disjoint sections for +each color. It is also partitioned per window, but I thought plaid +A 
plaid storage cylinder +would be a bit much :-). I’ll talk about +why state is partitioned this way a bit later, via my first example.

+ +

For the rest of this post, I will describe this new feature of Beam in detail - +how it works at a high level, how it differs from existing features, how to +make sure it is still massively scalable. After that introduction at the model +level, I’ll walk through a simple example of how you use it in the Beam Java +SDK.

+ +


[1/5] beam-site git commit: Add klk as a blog author

2017-02-13 Thread davor
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 4282f6bc5 -> 00e375003


Add klk as a blog author


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/31387225
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/31387225
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/31387225

Branch: refs/heads/asf-site
Commit: 3138722512cfe60ea4dd231389e7736eb8d420f3
Parents: 4282f6b
Author: Kenneth Knowles 
Authored: Thu Feb 9 19:37:56 2017 -0800
Committer: Davor Bonaci 
Committed: Mon Feb 13 13:29:08 2017 -0800

--
 src/_data/authors.yml | 4 
 1 file changed, 4 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/31387225/src/_data/authors.yml
--
diff --git a/src/_data/authors.yml b/src/_data/authors.yml
index 147dd7b..e4aa332 100644
--- a/src/_data/authors.yml
+++ b/src/_data/authors.yml
@@ -36,3 +36,7 @@ thw:
 name: Thomas Weise
 email: t...@apache.org
 twitter: thweise
+klk:
+name: Kenneth Knowles
+email: k...@apache.org
+twitter: KennKnowles



[4/5] beam-site git commit: Regenerate website

2017-02-13 Thread davor
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/2dd05932
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/2dd05932
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/2dd05932

Branch: refs/heads/asf-site
Commit: 2dd059328e13788c7e58a6be5e2098ea5af82336
Parents: 4c43991
Author: Davor Bonaci 
Authored: Mon Feb 13 13:31:19 2017 -0800
Committer: Davor Bonaci 
Committed: Mon Feb 13 13:31:19 2017 -0800

--
 .../blog/2017/02/13/stateful-processing.html| 751 +++
 content/blog/index.html |  21 +
 content/feed.xml| 593 ++-
 .../blog/stateful-processing/assign-indices.png | Bin 0 -> 29308 bytes
 .../blog/stateful-processing/combinefn.png  | Bin 0 -> 18138 bytes
 .../stateful-processing/combiner-lifting.png| Bin 0 -> 37974 bytes
 .../blog/stateful-processing/pardo-and-gbk.png  | Bin 0 -> 27227 bytes
 .../blog/stateful-processing/pipeline.png   | Bin 0 -> 14308 bytes
 .../images/blog/stateful-processing/plaid.png   | Bin 0 -> 46216 bytes
 .../blog/stateful-processing/stateful-dofn.png  | Bin 0 -> 9 bytes
 .../blog/stateful-processing/stateful-pardo.png | Bin 0 -> 18035 bytes
 content/index.html  |   4 +-
 12 files changed, 1348 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/2dd05932/content/blog/2017/02/13/stateful-processing.html
--
diff --git a/content/blog/2017/02/13/stateful-processing.html 
b/content/blog/2017/02/13/stateful-processing.html
new file mode 100644
index 000..dcf077a
--- /dev/null
+++ b/content/blog/2017/02/13/stateful-processing.html
@@ -0,0 +1,751 @@
+
+
+
+  
+  
+  
+  
+
+  Stateful processing with Apache Beam
+  
+
+  
+  
+  https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js";>
+  
+  
+  https://beam.apache.org/blog/2017/02/13/stateful-processing.html"; 
data-proofer-ignore>
+  https://beam.apache.org/feed.xml";>
+  
+
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+ga('create', 'UA-73650088-1', 'auto');
+ga('send', 'pageview');
+
+  
+  
+
+
+
+  
+
+
+  
+
+  
+
+  
+  
+Toggle navigation
+
+
+
+  
+
+
+  
+
+ Get Started 
+ 
+ Beam 
Overview
+Quickstart - Java
+Quickstart - Python
+ 
+ Example Walkthroughs
+ WordCount
+ Mobile Gaming
+  
+  Resources
+  Downloads
+  Support
+ 
+   
+
+ Documentation 
+ 
+ Using the 
Documentation
+ 
+ Beam Concepts
+ Programming Guide
+ Additional 
Resources
+ 
+  Pipeline Fundamentals
+  Design Your 
Pipeline
+  Create Your 
Pipeline
+  Test 
Your Pipeline
+  
+ SDKs
+ Java 
SDK
+ Java SDK API Reference 
+
+Python SDK
+ 
+ Runners
+ Capability Matrix
+ Direct 
Runner
+ Apache 
Apex Runner
+ Apache 
Flink Runner
+ Apache 
Spark Runner
+ Cloud 
Dataflow Runner
+ 
+   
+
+ Contribute 
+ 
+ Get Started 
Contributing
+
+Guides
+ Contribution Guide
+Testing Guide
+Release Guide
+PTransform Style 
Guide
+
+Technical References
+Design Principles
+ Ongoing 
Projects
+Source 
Repository  
+
+ Promotion
+Presentation 
Materials
+Logos and Design
+
+Maturity Model
+Team
+ 
+   
+
+Blog
+  
+  
+
+  https://www.apache.org/foundation/press/kit/feather_small.png"; alt="Apache 
Logo" style="height:24px;

[5/5] beam-site git commit: This closes #152

2017-02-13 Thread davor
This closes #152


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/00e37500
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/00e37500
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/00e37500

Branch: refs/heads/asf-site
Commit: 00e375003c3c33e1e9234ea506a6c746967f5988
Parents: 4282f6b 2dd0593
Author: Davor Bonaci 
Authored: Mon Feb 13 13:31:20 2017 -0800
Committer: Davor Bonaci 
Committed: Mon Feb 13 13:31:20 2017 -0800

--
 .../blog/2017/02/13/stateful-processing.html| 751 +++
 content/blog/index.html |  21 +
 content/feed.xml| 593 ++-
 .../blog/stateful-processing/assign-indices.png | Bin 0 -> 29308 bytes
 .../blog/stateful-processing/combinefn.png  | Bin 0 -> 18138 bytes
 .../stateful-processing/combiner-lifting.png| Bin 0 -> 37974 bytes
 .../blog/stateful-processing/pardo-and-gbk.png  | Bin 0 -> 27227 bytes
 .../blog/stateful-processing/pipeline.png   | Bin 0 -> 14308 bytes
 .../images/blog/stateful-processing/plaid.png   | Bin 0 -> 46216 bytes
 .../blog/stateful-processing/stateful-dofn.png  | Bin 0 -> 9 bytes
 .../blog/stateful-processing/stateful-pardo.png | Bin 0 -> 18035 bytes
 content/index.html  |   4 +-
 src/_data/authors.yml   |   4 +
 src/_posts/2017-02-13-stateful-processing.md| 550 ++
 .../blog/stateful-processing/assign-indices.png | Bin 0 -> 29308 bytes
 .../blog/stateful-processing/combinefn.png  | Bin 0 -> 18138 bytes
 .../stateful-processing/combiner-lifting.png| Bin 0 -> 37974 bytes
 .../blog/stateful-processing/pardo-and-gbk.png  | Bin 0 -> 27227 bytes
 .../blog/stateful-processing/pipeline.png   | Bin 0 -> 14308 bytes
 src/images/blog/stateful-processing/plaid.png   | Bin 0 -> 46216 bytes
 .../blog/stateful-processing/stateful-dofn.png  | Bin 0 -> 9 bytes
 .../blog/stateful-processing/stateful-pardo.png | Bin 0 -> 18035 bytes
 22 files changed, 1902 insertions(+), 21 deletions(-)
--




[2/5] beam-site git commit: Add blog post "Stateful Processing with Apache Beam"

2017-02-13 Thread davor
Add blog post "Stateful Processing with Apache Beam"


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/4c439913
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/4c439913
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/4c439913

Branch: refs/heads/asf-site
Commit: 4c4399132b1987884c03edaf2570614c00f651cb
Parents: 3138722
Author: Kenneth Knowles 
Authored: Thu Feb 9 19:38:13 2017 -0800
Committer: Davor Bonaci 
Committed: Mon Feb 13 13:29:11 2017 -0800

--
 src/_posts/2017-02-13-stateful-processing.md| 550 +++
 .../blog/stateful-processing/assign-indices.png | Bin 0 -> 29308 bytes
 .../blog/stateful-processing/combinefn.png  | Bin 0 -> 18138 bytes
 .../stateful-processing/combiner-lifting.png| Bin 0 -> 37974 bytes
 .../blog/stateful-processing/pardo-and-gbk.png  | Bin 0 -> 27227 bytes
 .../blog/stateful-processing/pipeline.png   | Bin 0 -> 14308 bytes
 src/images/blog/stateful-processing/plaid.png   | Bin 0 -> 46216 bytes
 .../blog/stateful-processing/stateful-dofn.png  | Bin 0 -> 9 bytes
 .../blog/stateful-processing/stateful-pardo.png | Bin 0 -> 18035 bytes
 9 files changed, 550 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/4c439913/src/_posts/2017-02-13-stateful-processing.md
--
diff --git a/src/_posts/2017-02-13-stateful-processing.md 
b/src/_posts/2017-02-13-stateful-processing.md
new file mode 100644
index 000..28aee12
--- /dev/null
+++ b/src/_posts/2017-02-13-stateful-processing.md
@@ -0,0 +1,550 @@
+---
+layout: post
+title:  "Stateful processing with Apache Beam"
+date:   2017-02-13 00:00:01 -0800
+excerpt_separator: 
+categories: blog
+authors:
+  - klk
+---
+
+Beam lets you process unbounded, out-of-order, global-scale data with portable
+high-level pipelines. Stateful processing is a new feature of the Beam model
+that expands the capabilities of Beam, unlocking new use cases and new
+efficiencies. In this post, I will guide you through stateful processing in
+Beam: how it works, how it fits in with the other features of the Beam model,
+what you might use it for, and what it looks like in code.
+
+
+
+> **Warning: new features ahead!**: This is a very new aspect of the Beam
+> model. Runners are still adding support. You can try it out today on multiple
+> runners, but do check the [runner capability
+> matrix]({{ site.baseurl }}/documentation/runners/capability-matrix/) for
+> the current status in each runner.
+
+First, a quick recap: In Beam, a big data processing _pipeline_ is a directed,
+acyclic graph of parallel operations called _`PTransforms`_ processing data
+from _`PCollections`_ I'll expand on that by walking through this illustration:
+
+
+
+The boxes are `PTransforms` and the edges represent the data in `PCollections`
+flowing from one `PTransform` to the next. A `PCollection` may be _bounded_ 
(which
+means it is finite and you know it) or _unbounded_ (which means you don't know 
if
+it is finite or not - basically, it is like an incoming stream of data that may
+or may not ever terminate). The cylinders are the data sources and sinks at the
+edges of your pipeline, such as bounded collections of log files or unbounded
+data streaming over a Kafka topic. This blog post isn't about sources or sinks,
+but about what happens in between - your data processing.
+
+There are two main building blocks for processing your data in Beam: _`ParDo`_,
+for performing an operation in parallel across all elements, and _`GroupByKey`_
+(and the closely related `CombinePerKey` that I will talk about quite soon)
+for aggregating elements to which you have assigned the same key. In the
+picture below (featured in many of our presentations) the color indicates the
+key of the element. Thus the `GroupByKey`/`CombinePerKey` transform gathers 
all the
+green squares to produce a single output element.
+
+
+
+But not all use cases are easily expressed as pipelines of simple 
`ParDo`/`Map` and
+`GroupByKey`/`CombinePerKey` transforms. The topic of this blog post is a new
+extension to the Beam programming model: **per-element operation augmented with
+mutable state**.
+
+
+
+In the illustration above, ParDo now has a bit of durable, consistent state on
+the side, which can be read and written during the processing of each element.
+The state is partitioned by key, so it is drawn as having disjoint sections for
+each color. It is also partitioned per window, but I thought plaid 
+ 
+would be a bit much  :-). I'll talk about
+why state is partitioned this way a bit later, via my first example.
+
+For the rest of this post, I will describe this new feature of Beam in detail -
+how it works at a high level, how it differs from existing features, 

[jira] [Commented] (BEAM-1346) Drop Late Data in ReduceFnRunner

2017-02-13 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864437#comment-15864437
 ] 

Kenneth Knowles commented on BEAM-1346:
---

I agree with your paranoia. This is related to BEAM-696. I think 
{{PushbackSideInputRunner}} is technically OK because it pushes all the 
complexity to the runner. The runner can provide a 
{{ReadyCheckingSideInputReader}} that understands the merging, and the runner 
decides when to wake up the processing and feed the pushed-back elements, so it 
can alter its timers, etc, according to merging. And if this is all too 
complex, maybe the runner can [disable any troublesome 
optimizations|https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java#L811]
 for now.

> Drop Late Data in ReduceFnRunner
> 
>
> Key: BEAM-1346
> URL: https://issues.apache.org/jira/browse/BEAM-1346
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 0.5.0
>Reporter: Aljoscha Krettek
>
> I think these two commits recently broke late-data dropping for the Flink 
> Runner (and maybe for other runners as well):
> - https://github.com/apache/beam/commit/2b26ec8
> - https://github.com/apache/beam/commit/8989473
> It boils down to the {{LateDataDroppingDoFnRunner}} not being used anymore  
> because {{DoFnRunners.lateDataDroppingRunner()}} is not called anymore when a 
> {{DoFn}} is a {{ReduceFnExecutor}} (because that interface was removed).
> Maybe we should think about dropping late data in another place, my 
> suggestion is {{ReduceFnRunner}} but that's open for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1280) Remove label-first variants of PTransform constructors (use >> instead)

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864436#comment-15864436
 ] 

ASF GitHub Bot commented on BEAM-1280:
--

Github user sb2nov closed the pull request at:

https://github.com/apache/beam/pull/1997


> Remove label-first variants of PTransform constructors (use >> instead)
> ---
>
> Key: BEAM-1280
> URL: https://issues.apache.org/jira/browse/BEAM-1280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>  Labels: sdk-consistency
>
> (Related in Java SDK: https://issues.apache.org/jira/browse/BEAM-370)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1997: [BEAM-1280] Remove passing label from PTransform in...

2017-02-13 Thread sb2nov
Github user sb2nov closed the pull request at:

https://github.com/apache/beam/pull/1997


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/3] beam-site git commit: Regenerate website

2017-02-13 Thread altay
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/aa9b7fea
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/aa9b7fea
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/aa9b7fea

Branch: refs/heads/asf-site
Commit: aa9b7fea05f5e1177ff40f7b3977cdfb9ec0dd19
Parents: 8bc6392
Author: Ahmet Altay 
Authored: Mon Feb 13 12:11:35 2017 -0800
Committer: Ahmet Altay 
Committed: Mon Feb 13 12:11:35 2017 -0800

--
 content/documentation/programming-guide/index.html | 9 -
 content/get-started/wordcount-example/index.html   | 4 ++--
 2 files changed, 6 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/aa9b7fea/content/documentation/programming-guide/index.html
--
diff --git a/content/documentation/programming-guide/index.html 
b/content/documentation/programming-guide/index.html
index f02fd40..0aa0575 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -515,7 +515,7 @@
 
 Inside your DoFn subclass, you’ll write a method 
annotated with @ProcessElement where you 
provide the actual processing logic. You don’t need to manually extract the 
elements from the input collection; the Beam SDKs handle that for you. Your 
@ProcessElement method should accept an 
object of type ProcessContext. The ProcessContext object gives you access to an 
input element and a method for emitting an output element:
 
-Inside your DoFn 
subclass, you’ll write a method process where you provide the actual 
processing logic. You don’t need to manually extract the elements from the 
input collection; the Beam SDKs handle that for you. Your process method should accept an object of type 
context. The context object gives you access to an input 
element and output is emitted by using yield or return statement inside process method.
+Inside your DoFn 
subclass, you’ll write a method process where you provide the actual 
processing logic. You don’t need to manually extract the elements from the 
input collection; the Beam SDKs handle that for you. Your process method should accept an object of type 
element. This is the input element and 
output is emitted by using yield or 
return statement inside process method.
 
 static class ComputeWordLengthFn extends DoFn {
   @ProcessElement
@@ -610,11 +610,11 @@
 
 Using GroupByKey
 
-GroupByKey is a Beam transform for 
processing collections of key/value pairs. It’s a parallel reduction 
operation, analagous to the Shuffle phase of a Map/Shuffle/Reduce-style 
algorithm. The input to GroupByKey is a 
collection of key/value pairs that represents a multimap, where the 
collection contains multiple pairs that have the same key, but different 
values. Given such a collection, you use GroupByKey to collect all of the values 
associated with each unique key.
+GroupByKey is a Beam transform for 
processing collections of key/value pairs. It’s a parallel reduction 
operation, analogous to the Shuffle phase of a Map/Shuffle/Reduce-style 
algorithm. The input to GroupByKey is a 
collection of key/value pairs that represents a multimap, where the 
collection contains multiple pairs that have the same key, but different 
values. Given such a collection, you use GroupByKey to collect all of the values 
associated with each unique key.
 
 GroupByKey is a good way to 
aggregate data that has something in common. For example, if you have a 
collection that stores records of customer orders, you might want to group 
together all the orders from the same postal code (wherein the “key” of the 
key/value pair is the postal code field, and the “value” is the remainder 
of the record).
 
-Let’s examine the mechanics of GroupByKey with a simple xample case, where 
our data set consists of words from a text file and the line number on which 
they appear. We want to group together all the line numbers (values) that share 
the same word (key), letting us see all the places in the text where a 
particular word appears.
+Let’s examine the mechanics of GroupByKey with a simple example case, where 
our data set consists of words from a text file and the line number on which 
they appear. We want to group together all the line numbers (values) that share 
the same word (key), letting us see all the places in the text where a 
particular word appears.
 
 Our input is a PCollection of 
key/value pairs where each word is a key, and the value is a line number in the 
file where the word appears. Here’s a list of the key/value pairs in the 
input collection:
 
@@ -1046,7 +1046,7 @@ tree, [2]
 
 
 # We can also pass side inputs to a ParDo transform, which 
will get pa

[GitHub] beam-site pull request #150: Update ParDo documentation for Python

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/150


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/3] beam-site git commit: Update ParDo documentation for Python

2017-02-13 Thread altay
Repository: beam-site
Updated Branches:
  refs/heads/asf-site a1e2a39f6 -> 4282f6bc5


Update ParDo documentation for Python


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/8bc63920
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/8bc63920
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/8bc63920

Branch: refs/heads/asf-site
Commit: 8bc639206de6054a15da9601d56a7bf82c4f076f
Parents: a1e2a39
Author: Hadar Hod 
Authored: Thu Feb 9 10:37:07 2017 -0800
Committer: Ahmet Altay 
Committed: Mon Feb 13 12:11:01 2017 -0800

--
 src/documentation/programming-guide.md | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/8bc63920/src/documentation/programming-guide.md
--
diff --git a/src/documentation/programming-guide.md 
b/src/documentation/programming-guide.md
index 587c9f9..641ad2d 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -338,7 +338,7 @@ static class ComputeWordLengthFn extends DoFn { ... }
 Inside your `DoFn` subclass, you'll write a method annotated with 
`@ProcessElement` where you provide the actual processing logic. You don't need 
to manually extract the elements from the input collection; the Beam SDKs 
handle that for you. Your `@ProcessElement` method should accept an object of 
type `ProcessContext`. The `ProcessContext` object gives you access to an input 
element and a method for emitting an output element:
 
 {:.language-py}
-Inside your `DoFn` subclass, you'll write a method `process` where you provide 
the actual processing logic. You don't need to manually extract the elements 
from the input collection; the Beam SDKs handle that for you. Your `process` 
method should accept an object of type `context`. The `context` object gives 
you access to an input element and output is emitted by using `yield` or 
`return` statement inside `process` method.
+Inside your `DoFn` subclass, you'll write a method `process` where you provide 
the actual processing logic. You don't need to manually extract the elements 
from the input collection; the Beam SDKs handle that for you. Your `process` 
method should accept an object of type `element`. This is the input element and 
output is emitted by using `yield` or `return` statement inside `process` 
method.
 
 ```java
 static class ComputeWordLengthFn extends DoFn {
@@ -429,11 +429,11 @@ words = ...
 
  Using GroupByKey
 
-`GroupByKey` is a Beam transform for processing collections of key/value 
pairs. It's a parallel reduction operation, analagous to the Shuffle phase of a 
Map/Shuffle/Reduce-style algorithm. The input to `GroupByKey` is a collection 
of key/value pairs that represents a *multimap*, where the collection contains 
multiple pairs that have the same key, but different values. Given such a 
collection, you use `GroupByKey` to collect all of the values associated with 
each unique key.
+`GroupByKey` is a Beam transform for processing collections of key/value 
pairs. It's a parallel reduction operation, analogous to the Shuffle phase of a 
Map/Shuffle/Reduce-style algorithm. The input to `GroupByKey` is a collection 
of key/value pairs that represents a *multimap*, where the collection contains 
multiple pairs that have the same key, but different values. Given such a 
collection, you use `GroupByKey` to collect all of the values associated with 
each unique key.
 
 `GroupByKey` is a good way to aggregate data that has something in common. For 
example, if you have a collection that stores records of customer orders, you 
might want to group together all the orders from the same postal code (wherein 
the "key" of the key/value pair is the postal code field, and the "value" is 
the remainder of the record).
 
-Let's examine the mechanics of `GroupByKey` with a simple xample case, where 
our data set consists of words from a text file and the line number on which 
they appear. We want to group together all the line numbers (values) that share 
the same word (key), letting us see all the places in the text where a 
particular word appears.
+Let's examine the mechanics of `GroupByKey` with a simple example case, where 
our data set consists of words from a text file and the line number on which 
they appear. We want to group together all the line numbers (values) that share 
the same word (key), letting us see all the places in the text where a 
particular word appears.
 
 Our input is a `PCollection` of key/value pairs where each word is a key, and 
the value is a line number in the file where the word appears. Here's a list of 
the key/value pairs in the input collection:
 
@@ -790,12 +790,11 @@ Side inputs are useful if your 

[3/3] beam-site git commit: This closes #150

2017-02-13 Thread altay
This closes #150


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/4282f6bc
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/4282f6bc
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/4282f6bc

Branch: refs/heads/asf-site
Commit: 4282f6bc54a0ccc6bb7a4bac65547e429d78538e
Parents: a1e2a39 aa9b7fe
Author: Ahmet Altay 
Authored: Mon Feb 13 12:11:36 2017 -0800
Committer: Ahmet Altay 
Committed: Mon Feb 13 12:11:36 2017 -0800

--
 content/documentation/programming-guide/index.html | 9 -
 content/get-started/wordcount-example/index.html   | 4 ++--
 src/documentation/programming-guide.md | 9 -
 3 files changed, 10 insertions(+), 12 deletions(-)
--




[jira] [Commented] (BEAM-646) Get runners out of the apply()

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864228#comment-15864228
 ] 

ASF GitHub Bot commented on BEAM-646:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/1998

[BEAM-646] Add Pipeline#replaceTransform

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
This is the base method for Pipeline Surgery. It takes a
PTransformMatcher and a PTransformOverrideFactory, and replaces all
matching PTransforms with the result of the Override Factory.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam replace_transform

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1998.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1998


commit 963a6b31f5d670ddebac7d5347697948c8463607
Author: Thomas Groh 
Date:   2017-02-09T19:40:50Z

Add Pipeline#replaceTransform

This is the base method for Pipeline Surgery. It takes a
PTransformMatcher and a PTransformOverrideFactory, and replaces all
matching PTransforms with the result of the Override Factory.




> Get runners out of the apply()
> --
>
> Key: BEAM-646
> URL: https://issues.apache.org/jira/browse/BEAM-646
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> Right now, the runner intercepts calls to apply() and replaces transforms as 
> we go. This means that there is no "original" user graph. For portability and 
> misc architectural benefits, we would like to build the original graph first, 
> and have the runner override later.
> Some runners already work in this manner, but we could integrate it more 
> smoothly, with more validation, via some handy APIs on e.g. the Pipeline 
> object.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1998: [BEAM-646] Add Pipeline#replaceTransform

2017-02-13 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/1998

[BEAM-646] Add Pipeline#replaceTransform

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
This is the base method for Pipeline Surgery. It takes a
PTransformMatcher and a PTransformOverrideFactory, and replaces all
matching PTransforms with the result of the Override Factory.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam replace_transform

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1998.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1998


commit 963a6b31f5d670ddebac7d5347697948c8463607
Author: Thomas Groh 
Date:   2017-02-09T19:40:50Z

Add Pipeline#replaceTransform

This is the base method for Pipeline Surgery. It takes a
PTransformMatcher and a PTransformOverrideFactory, and replaces all
matching PTransforms with the result of the Override Factory.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #1995: [BEAM-646] Add ReplaceOutputs to PTransformOverride...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1995


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-646) Get runners out of the apply()

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864213#comment-15864213
 ] 

ASF GitHub Bot commented on BEAM-646:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1995


> Get runners out of the apply()
> --
>
> Key: BEAM-646
> URL: https://issues.apache.org/jira/browse/BEAM-646
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> Right now, the runner intercepts calls to apply() and replaces transforms as 
> we go. This means that there is no "original" user graph. For portability and 
> misc architectural benefits, we would like to build the original graph first, 
> and have the runner override later.
> Some runners already work in this manner, but we could integrate it more 
> smoothly, with more validation, via some handy APIs on e.g. the Pipeline 
> object.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: Add ReplaceOutputs to PTransformOverrideFactory

2017-02-13 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 30cb93ced -> 2c0cffaf7


Add ReplaceOutputs to PTransformOverrideFactory

This maps the outputs produced by applying a Replacement PTransform to
the outputs produced by the original PTransform.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/86f00db6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/86f00db6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/86f00db6

Branch: refs/heads/master
Commit: 86f00db6612e6055c4cc3899f77f196ee682ecf2
Parents: 30cb93c
Author: Thomas Groh 
Authored: Thu Feb 9 11:11:23 2017 -0800
Committer: Thomas Groh 
Committed: Mon Feb 13 10:58:58 2017 -0800

--
 .../direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java  | 9 +
 .../runners/direct/DirectGroupByKeyOverrideFactory.java | 9 +
 .../beam/runners/direct/ParDoMultiOverrideFactory.java  | 9 +
 .../runners/direct/ParDoSingleViaMultiOverrideFactory.java  | 9 +
 .../beam/runners/direct/TestStreamEvaluatorFactory.java | 9 +
 .../apache/beam/runners/direct/ViewEvaluatorFactory.java| 9 +
 .../beam/runners/direct/WriteWithShardingFactory.java   | 9 +
 .../apache/beam/sdk/runners/PTransformOverrideFactory.java  | 8 
 8 files changed, 71 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/86f00db6/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java
index caf61db..8de7b93 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java
@@ -19,13 +19,16 @@ package org.apache.beam.runners.direct;
 
 import com.google.common.collect.Iterables;
 import java.util.List;
+import java.util.Map;
 import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.ReplacementOutputs;
 import org.apache.beam.runners.core.SplittableParDo.GBKIntoKeyedWorkItems;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.runners.PTransformOverrideFactory;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.values.KV;
 import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PValue;
 import org.apache.beam.sdk.values.TaggedPValue;
 
 /**
@@ -47,4 +50,10 @@ class DirectGBKIntoKeyedWorkItemsOverrideFactory
   List inputs, Pipeline p) {
 return (PCollection>) 
Iterables.getOnlyElement(inputs).getValue();
   }
+
+  @Override
+  public Map mapOutputs(
+  List outputs, PCollection> 
newOutput) {
+return ReplacementOutputs.singleton(outputs, newOutput);
+  }
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/86f00db6/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGroupByKeyOverrideFactory.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGroupByKeyOverrideFactory.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGroupByKeyOverrideFactory.java
index 8a5413b..eedee31 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGroupByKeyOverrideFactory.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGroupByKeyOverrideFactory.java
@@ -19,12 +19,15 @@ package org.apache.beam.runners.direct;
 
 import com.google.common.collect.Iterables;
 import java.util.List;
+import java.util.Map;
+import org.apache.beam.runners.core.ReplacementOutputs;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.runners.PTransformOverrideFactory;
 import org.apache.beam.sdk.transforms.GroupByKey;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.values.KV;
 import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PValue;
 import org.apache.beam.sdk.values.TaggedPValue;
 
 /** A {@link PTransformOverrideFactory} for {@link GroupByKey} PTransforms. */
@@ -42,4 +45,10 @@ final class DirectGroupByKeyOverrideFactory
   List inputs, Pipeline p) {
 return (PCollection>) Iterables.getOnlyElement(inputs).getValue();
   }
+
+  @Override
+  public Map mapOutputs(
+  List outputs, PCollection>> newOutput) {
+return ReplacementOutputs.singleton(outputs, newOutput);
+  }
 }

http://git-w

[2/2] beam git commit: This closes #1995

2017-02-13 Thread tgroh
This closes #1995


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2c0cffaf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2c0cffaf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2c0cffaf

Branch: refs/heads/master
Commit: 2c0cffaf7e55e2ca8b49368b360823b5735c5b10
Parents: 30cb93c 86f00db
Author: Thomas Groh 
Authored: Mon Feb 13 10:58:59 2017 -0800
Committer: Thomas Groh 
Committed: Mon Feb 13 10:58:59 2017 -0800

--
 .../direct/DirectGBKIntoKeyedWorkItemsOverrideFactory.java  | 9 +
 .../runners/direct/DirectGroupByKeyOverrideFactory.java | 9 +
 .../beam/runners/direct/ParDoMultiOverrideFactory.java  | 9 +
 .../runners/direct/ParDoSingleViaMultiOverrideFactory.java  | 9 +
 .../beam/runners/direct/TestStreamEvaluatorFactory.java | 9 +
 .../apache/beam/runners/direct/ViewEvaluatorFactory.java| 9 +
 .../beam/runners/direct/WriteWithShardingFactory.java   | 9 +
 .../apache/beam/sdk/runners/PTransformOverrideFactory.java  | 8 
 8 files changed, 71 insertions(+)
--




[jira] [Commented] (BEAM-1280) Remove label-first variants of PTransform constructors (use >> instead)

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864212#comment-15864212
 ] 

ASF GitHub Bot commented on BEAM-1280:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/1997

[BEAM-1280] Remove passing label from Combiners

R: @aaltay PTAL

After this only assert_that and AsSingleton etc for SideInputs take the 
label as the argument. Rest should be gone.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam 
BEAM-1280-remove-label-from-combiners

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1997.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1997


commit 5abfbf9300bada36adc7c55b0c2fe21a66215b67
Author: Sourabh Bajaj 
Date:   2017-02-13T16:48:18Z

Remove passing label from Combiners




> Remove label-first variants of PTransform constructors (use >> instead)
> ---
>
> Key: BEAM-1280
> URL: https://issues.apache.org/jira/browse/BEAM-1280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>  Labels: sdk-consistency
>
> (Related in Java SDK: https://issues.apache.org/jira/browse/BEAM-370)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1997: [BEAM-1280] Remove passing label from Combiners

2017-02-13 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/1997

[BEAM-1280] Remove passing label from Combiners

R: @aaltay PTAL

After this only assert_that and AsSingleton etc for SideInputs take the 
label as the argument. Rest should be gone.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam 
BEAM-1280-remove-label-from-combiners

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1997.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1997


commit 5abfbf9300bada36adc7c55b0c2fe21a66215b67
Author: Sourabh Bajaj 
Date:   2017-02-13T16:48:18Z

Remove passing label from Combiners




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1471) Make IterableCoder binary compatible across SDKs

2017-02-13 Thread Vikas Kedigehalli (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864188#comment-15864188
 ] 

Vikas Kedigehalli commented on BEAM-1471:
-

Verified that when the iterable length is known, the IterableCoder is 
compatible across Java and Python SDKs. 
https://github.com/apache/beam/pull/1996 tests confirm that. 

When the iterable length is unknown, they are not compatible. Python SDK errors 
out right now. This still needs to be fixed. 

> Make IterableCoder binary compatible across SDKs
> 
>
> Key: BEAM-1471
> URL: https://issues.apache.org/jira/browse/BEAM-1471
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Affects Versions: 0.5.0
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>
> Ensure IterableCoder across SDKs binary compatible and add tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-1471) Make IterableCoder binary compatible across SDKs

2017-02-13 Thread Vikas Kedigehalli (JIRA)
Vikas Kedigehalli created BEAM-1471:
---

 Summary: Make IterableCoder binary compatible across SDKs
 Key: BEAM-1471
 URL: https://issues.apache.org/jira/browse/BEAM-1471
 Project: Beam
  Issue Type: Improvement
  Components: beam-model
Affects Versions: 0.5.0
Reporter: Vikas Kedigehalli
Assignee: Vikas Kedigehalli


Ensure IterableCoder across SDKs binary compatible and add tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1996: Add cross-sdk tests for IterableCoder

2017-02-13 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/1996

Add cross-sdk tests for IterableCoder

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam common_iterable_coder

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1996.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1996


commit 965160c0c0bada4500d51ca44c6c8e5e1c0476bf
Author: Vikas Kedigehalli 
Date:   2017-02-13T18:23:28Z

Add cross-sdk tests for IterableCoder




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (BEAM-1433) Remove coder from TextIO

2017-02-13 Thread Aviem Zur (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aviem Zur resolved BEAM-1433.
-
   Resolution: Done
Fix Version/s: 0.6.0

> Remove coder from TextIO
> 
>
> Key: BEAM-1433
> URL: https://issues.apache.org/jira/browse/BEAM-1433
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Aviem Zur
>Assignee: Aviem Zur
> Fix For: 0.6.0
>
>
> Remove coder usage in TextIO.
> TextIO should only deal with Strings.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #1995: [BEAM-646] Add ReplaceOutputs to PTransformOverride...

2017-02-13 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/1995

[BEAM-646] Add ReplaceOutputs to PTransformOverrideFactory

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
This maps the outputs produced by applying a Replacement PTransform to
the outputs produced by the original PTransform.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam replacement_output

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1995.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1995


commit f46459d3179c7d3aaa675bff2f88e15fca1ae883
Author: Thomas Groh 
Date:   2017-02-09T19:11:23Z

Add ReplaceOutputs to PTransformOverrideFactory

This maps the outputs produced by applying a Replacement PTransform to
the outputs produced by the original PTransform.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-646) Get runners out of the apply()

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864094#comment-15864094
 ] 

ASF GitHub Bot commented on BEAM-646:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/1995

[BEAM-646] Add ReplaceOutputs to PTransformOverrideFactory

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
This maps the outputs produced by applying a Replacement PTransform to
the outputs produced by the original PTransform.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam replacement_output

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/1995.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1995


commit f46459d3179c7d3aaa675bff2f88e15fca1ae883
Author: Thomas Groh 
Date:   2017-02-09T19:11:23Z

Add ReplaceOutputs to PTransformOverrideFactory

This maps the outputs produced by applying a Replacement PTransform to
the outputs produced by the original PTransform.




> Get runners out of the apply()
> --
>
> Key: BEAM-646
> URL: https://issues.apache.org/jira/browse/BEAM-646
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> Right now, the runner intercepts calls to apply() and replaces transforms as 
> we go. This means that there is no "original" user graph. For portability and 
> misc architectural benefits, we would like to build the original graph first, 
> and have the runner override later.
> Some runners already work in this manner, but we could integrate it more 
> smoothly, with more validation, via some handy APIs on e.g. the Pipeline 
> object.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-298) Make TestPipeline implement the TestRule interface

2017-02-13 Thread Neville Li (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864087#comment-15864087
 ] 

Neville Li commented on BEAM-298:
-

Turns out I referenced {{TestPipeline}} in our `src/main` for some reason. I'll 
work around it locally.

> Make TestPipeline implement the TestRule interface
> --
>
> Key: BEAM-298
> URL: https://issues.apache.org/jira/browse/BEAM-298
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Stas Levin
>Priority: Minor
> Fix For: 0.5.0
>
>
> https://github.com/junit-team/junit4/wiki/Rules
> A JUnit Rule allows a test to use a field annotated with @Rule to wrap 
> executing tests. Doing so allows the TestPipeline to, at the time the test 
> completes, assert that all applied transforms have been executed. This 
> ensures that all unit tests that utilize a TestPipeline rule either are 
> declared to explicitly not expect to execute or have executed the pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1461) duplication with StartBundle and prepareForProcessing in DoFn

2017-02-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864077#comment-15864077
 ] 

ASF GitHub Bot commented on BEAM-1461:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1993


> duplication with StartBundle and prepareForProcessing in DoFn
> -
>
> Key: BEAM-1461
> URL: https://issues.apache.org/jira/browse/BEAM-1461
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Xu Mingmin
>Assignee: Davor Bonaci
>
> There're one annotation `StartBundle`, and one public function 
> `prepareForProcessing` in DoFn, which are called both before 
> `ProcessElement`. It's confused which one should be implemented in a subclass.
> The call sequence seems as:
> prepareForProcessing -> StartBundle -> processElement



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: Make PrepareForProcessing Final, Deprecated

2017-02-13 Thread tgroh
Make PrepareForProcessing Final, Deprecated

This is only used by aggregators; Users should use @Setup or
@StartBundle instead.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/82690271
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/82690271
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/82690271

Branch: refs/heads/master
Commit: 826902717376257d4c22718ba4981b95b9db7db2
Parents: 982ea7a
Author: Thomas Groh 
Authored: Mon Feb 13 09:03:32 2017 -0800
Committer: Thomas Groh 
Committed: Mon Feb 13 09:50:12 2017 -0800

--
 .../core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/82690271/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
index a161919..126e967 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
@@ -808,7 +808,8 @@ public abstract class DoFn implements 
Serializable, HasDisplayD
* Finalize the {@link DoFn} construction to prepare for processing.
* This method should be called by runners before any processing methods.
*/
-  public void prepareForProcessing() {
+  @Deprecated
+  public final void prepareForProcessing() {
 aggregatorsAreFinal = true;
   }
 



[GitHub] beam pull request #1993: [BEAM-1461] Make PrepareForProcessing Final, Deprec...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1993


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: This closes #1993

2017-02-13 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 982ea7af7 -> 30cb93ced


This closes #1993


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/30cb93ce
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/30cb93ce
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/30cb93ce

Branch: refs/heads/master
Commit: 30cb93ced35494db79c81c49e07a298f560c3905
Parents: 982ea7a 8269027
Author: Thomas Groh 
Authored: Mon Feb 13 09:50:12 2017 -0800
Committer: Thomas Groh 
Committed: Mon Feb 13 09:50:12 2017 -0800

--
 .../core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--




[GitHub] beam pull request #1992: [BEAM-1469] Increase the bounds on the test to redu...

2017-02-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/1992


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


  1   2   >