[jira] [Assigned] (BEAM-2175) Support state API in Spark batch mode.

2017-05-05 Thread Aviem Zur (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aviem Zur reassigned BEAM-2175:
---

Assignee: Jingsong Lee  (was: Amit Sela)

> Support state API in Spark batch mode.
> --
>
> Key: BEAM-2175
> URL: https://issues.apache.org/jira/browse/BEAM-2175
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Aviem Zur
>Assignee: Jingsong Lee
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2176) Support state API in Spark streaming mode.

2017-05-05 Thread Aviem Zur (JIRA)
Aviem Zur created BEAM-2176:
---

 Summary: Support state API in Spark streaming mode.
 Key: BEAM-2176
 URL: https://issues.apache.org/jira/browse/BEAM-2176
 Project: Beam
  Issue Type: Sub-task
  Components: runner-spark
Reporter: Aviem Zur
Assignee: Amit Sela






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2175) Support state API in Spark batch mode.

2017-05-05 Thread Aviem Zur (JIRA)
Aviem Zur created BEAM-2175:
---

 Summary: Support state API in Spark batch mode.
 Key: BEAM-2175
 URL: https://issues.apache.org/jira/browse/BEAM-2175
 Project: Beam
  Issue Type: Sub-task
  Components: runner-spark
Reporter: Aviem Zur
Assignee: Amit Sela






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-2176) Support state API in Spark streaming mode.

2017-05-05 Thread Aviem Zur (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aviem Zur reassigned BEAM-2176:
---

Assignee: Aviem Zur  (was: Amit Sela)

> Support state API in Spark streaming mode.
> --
>
> Key: BEAM-2176
> URL: https://issues.apache.org/jira/browse/BEAM-2176
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-1035) Support for new State API in SparkRunner

2017-05-05 Thread Aviem Zur (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aviem Zur reassigned BEAM-1035:
---

Assignee: (was: Aviem Zur)

> Support for new State API in SparkRunner
> 
>
> Key: BEAM-1035
> URL: https://issues.apache.org/jira/browse/BEAM-1035
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-spark
>Reporter: Kenneth Knowles
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2681

2017-05-05 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2682

2017-05-05 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-2161) add support String functions

2017-05-05 Thread James Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Xu reassigned BEAM-2161:
--

Assignee: James Xu

> add support String functions
> 
>
> Key: BEAM-2161
> URL: https://issues.apache.org/jira/browse/BEAM-2161
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: James Xu
>  Labels: bundle, split
>
> All functions are listed as below:
> {code}
> string || string
> CHAR_LENGTH(string)
> CHARACTER_LENGTH(string)
> UPPER(string)
> LOWER(string)
> POSITION(string1 IN string2)
> POSITION(string1 IN string2 FROM integer)
> TRIM( { BOTH | LEADING | TRAILING } string1 FROM string2)
> OVERLAY(string1 PLACING string2 FROM integer [ FOR integer2 ])
> SUBSTRING(string FROM integer)
> SUBSTRING(string FROM integer FOR integer)
> INITCAP(string)
> {code}
> see 
> https://calcite.apache.org/docs/reference.html#character-string-operators-and-functions
>  for more information.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2161) add support String functions

2017-05-05 Thread James Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998016#comment-15998016
 ] 

James Xu commented on BEAM-2161:


I will take this one.

> add support String functions
> 
>
> Key: BEAM-2161
> URL: https://issues.apache.org/jira/browse/BEAM-2161
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: James Xu
>  Labels: bundle, split
>
> All functions are listed as below:
> {code}
> string || string
> CHAR_LENGTH(string)
> CHARACTER_LENGTH(string)
> UPPER(string)
> LOWER(string)
> POSITION(string1 IN string2)
> POSITION(string1 IN string2 FROM integer)
> TRIM( { BOTH | LEADING | TRAILING } string1 FROM string2)
> OVERLAY(string1 PLACING string2 FROM integer [ FOR integer2 ])
> SUBSTRING(string FROM integer)
> SUBSTRING(string FROM integer FOR integer)
> INITCAP(string)
> {code}
> see 
> https://calcite.apache.org/docs/reference.html#character-string-operators-and-functions
>  for more information.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #2421

2017-05-05 Thread aljoscha
This closes #2421


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3bffe0e0
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3bffe0e0
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3bffe0e0

Branch: refs/heads/master
Commit: 3bffe0e0014bdd6ae73dc2e8ecfc2b61d066120c
Parents: 7903e59 fc4534c
Author: Aljoscha Krettek 
Authored: Fri May 5 13:05:01 2017 +0200
Committer: Aljoscha Krettek 
Committed: Fri May 5 13:05:01 2017 +0200

--
 .../streaming/SplittableDoFnOperator.java   | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)
--




[1/2] beam git commit: [BEAM-1862] SplittableDoFnOperator should close the ScheduledExecutorService

2017-05-05 Thread aljoscha
Repository: beam
Updated Branches:
  refs/heads/master 7903e59c4 -> 3bffe0e00


[BEAM-1862] SplittableDoFnOperator should close the ScheduledExecutorService


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fc4534cd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fc4534cd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fc4534cd

Branch: refs/heads/master
Commit: fc4534cd6e5366a5f12cefebcd52ac1fe7cdde41
Parents: 7903e59
Author: JingsongLi 
Authored: Tue Apr 4 18:28:15 2017 +0800
Committer: Aljoscha Krettek 
Committed: Fri May 5 12:01:49 2017 +0200

--
 .../streaming/SplittableDoFnOperator.java   | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/fc4534cd/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/SplittableDoFnOperator.java
--
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/SplittableDoFnOperator.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/SplittableDoFnOperator.java
index 7d54cfa..968fc0a 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/SplittableDoFnOperator.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/SplittableDoFnOperator.java
@@ -24,6 +24,8 @@ import java.util.Collections;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
 import org.apache.beam.runners.core.ElementAndRestriction;
 import org.apache.beam.runners.core.KeyedWorkItem;
 import org.apache.beam.runners.core.KeyedWorkItems;
@@ -57,6 +59,8 @@ public class SplittableDoFnOperator<
 extends DoFnOperator<
 KeyedWorkItem>, 
FnOutputT, OutputT> {
 
+  private transient ScheduledExecutorService executorService;
+
   public SplittableDoFnOperator(
   DoFn>, 
FnOutputT> doFn,
   String stepName,
@@ -108,6 +112,8 @@ public class SplittableDoFnOperator<
   }
 };
 
+executorService = 
Executors.newSingleThreadScheduledExecutor(Executors.defaultThreadFactory());
+
 ((SplittableParDo.ProcessFn) 
doFn).setStateInternalsFactory(stateInternalsFactory);
 ((SplittableParDo.ProcessFn) 
doFn).setTimerInternalsFactory(timerInternalsFactory);
 ((SplittableParDo.ProcessFn) doFn).setProcessElementInvoker(
@@ -137,7 +143,7 @@ public class SplittableDoFnOperator<
   }
 },
 sideInputReader,
-
Executors.newSingleThreadScheduledExecutor(Executors.defaultThreadFactory()),
+executorService,
 1,
 Duration.standardSeconds(10)));
   }
@@ -149,4 +155,24 @@ public class SplittableDoFnOperator<
 (String) stateInternals.getKey(),
 Collections.singletonList(timer.getNamespace();
   }
+
+  @Override
+  public void close() throws Exception {
+super.close();
+
+executorService.shutdown();
+
+long shutdownTimeout = Duration.standardSeconds(10).getMillis();
+try {
+  if (!executorService.awaitTermination(shutdownTimeout, 
TimeUnit.MILLISECONDS)) {
+LOG.debug("The scheduled executor service did not properly terminate. 
Shutting "
++ "it down now.");
+executorService.shutdownNow();
+  }
+} catch (InterruptedException e) {
+  LOG.debug("Could not properly await the termination of the scheduled 
executor service.", e);
+  executorService.shutdownNow();
+}
+  }
+
 }



[GitHub] beam pull request #2421: [BEAM-1862] SplittableDoFnOperator should close the...

2017-05-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2421


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1862) SplittableDoFnOperator should close the ScheduledExecutorService

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998157#comment-15998157
 ] 

ASF GitHub Bot commented on BEAM-1862:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2421


> SplittableDoFnOperator should close the ScheduledExecutorService
> 
>
> Key: BEAM-1862
> URL: https://issues.apache.org/jira/browse/BEAM-1862
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>
> {{SplittableDoFnOperator}} new a {{ScheduledExecutorService}} to 
> {{OutputAndTimeBoundedSplittableProcessElementInvoker}}, but not shutdown it.
> We should shutdown it in {{close()}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (BEAM-1862) SplittableDoFnOperator should close the ScheduledExecutorService

2017-05-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed BEAM-1862.
--
   Resolution: Fixed
Fix Version/s: First stable release

> SplittableDoFnOperator should close the ScheduledExecutorService
> 
>
> Key: BEAM-1862
> URL: https://issues.apache.org/jira/browse/BEAM-1862
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
> Fix For: First stable release
>
>
> {{SplittableDoFnOperator}} new a {{ScheduledExecutorService}} to 
> {{OutputAndTimeBoundedSplittableProcessElementInvoker}}, but not shutdown it.
> We should shutdown it in {{close()}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2683

2017-05-05 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-991) DatastoreIO Write should flush early for large batches

2017-05-05 Thread Florian Scharinger (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998181#comment-15998181
 ] 

Florian Scharinger commented on BEAM-991:
-

Just wanted to say that we are also experiencing this issue, which blocks us 
from writing data into Datastore in a compact structure. Would be great to get 
this into the next release. 

> DatastoreIO Write should flush early for large batches
> --
>
> Key: BEAM-991
> URL: https://issues.apache.org/jira/browse/BEAM-991
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>
> If entities are large (avg size > 20KB) then the a single batched write (500 
> entities) would exceed the Datastore size limit of a single request (10MB) 
> from https://cloud.google.com/datastore/docs/concepts/limits.
> First reported in: 
> http://stackoverflow.com/questions/40156400/why-does-dataflow-erratically-fail-in-datastore-access



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-991) DatastoreIO Write should flush early for large batches

2017-05-05 Thread Joshua Fox (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998188#comment-15998188
 ] 

Joshua Fox commented on BEAM-991:
-

@Florian Scharinger The strange thing is how few people have reported this 
issue -- just you and me. This issue prevents Datastore from writing entities 
with average size more than approx. 10 kB! It's hard to see how anyone can work 
that way!

> DatastoreIO Write should flush early for large batches
> --
>
> Key: BEAM-991
> URL: https://issues.apache.org/jira/browse/BEAM-991
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>
> If entities are large (avg size > 20KB) then the a single batched write (500 
> entities) would exceed the Datastore size limit of a single request (10MB) 
> from https://cloud.google.com/datastore/docs/concepts/limits.
> First reported in: 
> http://stackoverflow.com/questions/40156400/why-does-dataflow-erratically-fail-in-datastore-access



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #1955

2017-05-05 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Dataflow #375

2017-05-05 Thread Apache Jenkins Server
See 


Changes:

[aljoscha.krettek] [BEAM-1862] SplittableDoFnOperator should close the

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam5 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* +refs/pull/*:refs/remotes/origin/pr/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3bffe0e0014bdd6ae73dc2e8ecfc2b61d066120c (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3bffe0e0014bdd6ae73dc2e8ecfc2b61d066120c
 > git rev-list 7903e59c4e07822f8def85ee6a7f8ef3ccb1ca7a # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson2417919858563827519.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson7895987214291620795.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson2766035856415306403.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.11 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson5946974480346757738.sh
+ python PerfKitBenchmarker/pkb.py --project=apache-beam-testing 
--dpb_log_level=INFO --maven_binary=/home/jenkins/tools/maven/latest/bin/mvn 
--bigquery_table=beam_performance.pkb_results --official=true 
--benchmarks=dpb_wordcount_benchmark 
--dpb_dataflow_staging_location=gs://temp-storage-for-perf-tests/staging 
--dpb_wordcount_input=dataflow-samples/shakespeare/kinglear.txt 
--config_override=dpb_wordcount_benchmark.dpb_service.service_type=dataflow
WARNING:root:File resource loader root perfkitbenchmarker/data/ycsb is not a 
directory.
2017-05-05 12:00:18,732 d6d41712 MainThread INFO Verbose logging to: 
/tmp/perfkitbenchmarker/runs/d6d41712/pkb.log
2017-05-05 12:00:18,732 d6d41712 MainThread INFO PerfKitBenchmarker 
version: v1.11.0-49-g10dd95b
2017-05-05 12:00:18,733 d6d41712 MainThread INFO Flag values:
--maven_binary=/home/jenkins/tools/maven/latest/bin/mvn
--project=

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2684

2017-05-05 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #1956

2017-05-05 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-806) Maven Release Plugin Does Not Set Archetype Versions

2017-05-05 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reassigned BEAM-806:
-

Assignee: Elek, Marton  (was: Davor Bonaci)

> Maven Release Plugin Does Not Set Archetype Versions
> 
>
> Key: BEAM-806
> URL: https://issues.apache.org/jira/browse/BEAM-806
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating, 0.3.0-incubating
>Reporter: Aljoscha Krettek
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: First stable release
>
>
> When running {{mvn release:prepare}} as described in the new release guide 
> this does not update the version of the poms in the archetypes. To be clear, 
> the version of the archetype pom is updated, the pom in 
> {{archetype-resources}} (the pom of the project generated by the archetype) 
> is not updated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-806) Maven Release Plugin Does Not Set Archetype Versions

2017-05-05 Thread Elek, Marton (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998284#comment-15998284
 ] 

Elek, Marton commented on BEAM-806:
---

Sorry, I couldn't reopen the issue, but it's not yet fully resolved. The 
starter archetype contains a reference project which contains hardcoded version 
at 
sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml

It also should be replaced during the build. 

PR is coming...


> Maven Release Plugin Does Not Set Archetype Versions
> 
>
> Key: BEAM-806
> URL: https://issues.apache.org/jira/browse/BEAM-806
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating, 0.3.0-incubating
>Reporter: Aljoscha Krettek
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: First stable release
>
>
> When running {{mvn release:prepare}} as described in the new release guide 
> this does not update the version of the poms in the archetypes. To be clear, 
> the version of the archetype pom is updated, the pom in 
> {{archetype-resources}} (the pom of the project generated by the archetype) 
> is not updated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2911: [BEAM-806] Use filtering for the version of the ref...

2017-05-05 Thread elek
GitHub user elek opened a pull request:

https://github.com/apache/beam/pull/2911

[BEAM-806] Use filtering for the version of the reference test project

With BEAM-2093 the archetypes almos could be released with maven commands 
without manual version adjustment.

The missing part is in the starter archetype where an integration test 
compares a predefined project with the generated one. The prefefined reference 
project also should be filtered to always use the current version.


sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml

To reproduce the problem (without this patch) do `mvn versions:set 
-DnewVersion=0.8.0-SNAPSHOT`, and try to build the project.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/elek/beam BEAM-806

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2911.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2911


commit 0aa753b5846afd8873432b02a00b6126bd4b1f01
Author: Elek, Márton 
Date:   2017-05-05T12:46:48Z

[BEAM-806] Use filtering for the version of the reference test project




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-806) Maven Release Plugin Does Not Set Archetype Versions

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998287#comment-15998287
 ] 

ASF GitHub Bot commented on BEAM-806:
-

GitHub user elek opened a pull request:

https://github.com/apache/beam/pull/2911

[BEAM-806] Use filtering for the version of the reference test project

With BEAM-2093 the archetypes almos could be released with maven commands 
without manual version adjustment.

The missing part is in the starter archetype where an integration test 
compares a predefined project with the generated one. The prefefined reference 
project also should be filtered to always use the current version.


sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml

To reproduce the problem (without this patch) do `mvn versions:set 
-DnewVersion=0.8.0-SNAPSHOT`, and try to build the project.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/elek/beam BEAM-806

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2911.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2911


commit 0aa753b5846afd8873432b02a00b6126bd4b1f01
Author: Elek, Márton 
Date:   2017-05-05T12:46:48Z

[BEAM-806] Use filtering for the version of the reference test project




> Maven Release Plugin Does Not Set Archetype Versions
> 
>
> Key: BEAM-806
> URL: https://issues.apache.org/jira/browse/BEAM-806
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating, 0.3.0-incubating
>Reporter: Aljoscha Krettek
>Assignee: Elek, Marton
>Priority: Blocker
> Fix For: First stable release
>
>
> When running {{mvn release:prepare}} as described in the new release guide 
> this does not update the version of the poms in the archetypes. To be clear, 
> the version of the archetype pom is updated, the pom in 
> {{archetype-resources}} (the pom of the project generated by the archetype) 
> is not updated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-160) Port 'NexMark Queries' to Beam for use as integration test

2017-05-05 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot reassigned BEAM-160:
-

Assignee: Etienne Chauchot  (was: Ismaël Mejía)

> Port 'NexMark Queries' to Beam for use as integration test
> --
>
> Key: BEAM-160
> URL: https://issues.apache.org/jira/browse/BEAM-160
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Mark Shields
>Assignee: Etienne Chauchot
>
> A while back we implemented the 'queries' from
>   http://datalab.cs.pdx.edu/niagara/NEXMark/
> as Gooogle Dataflow pipelines. We found them useful
> for uncovering performance problems with the sdk, our runners,
> and our service. Many of those problems only manifested under
> high load, multi-day runs, or with high 'backlog' on the incoming
> pub/sub subscriptions.
> We thus think they would be useful for other runners.
> Disclaimer: Though the original 'queries' were proposed as a way to
> benchmark 'continuous SQL' implementations, we have so far only
> used them for internal A/B and regression testing and have not validated
> them as representative of customer workloads. We would thus discourage their 
> use for competitive benchmarks without more work.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2177) Support file scheme

2017-05-05 Thread JIRA
Jean-Baptiste Onofré created BEAM-2177:
--

 Summary: Support file scheme
 Key: BEAM-2177
 URL: https://issues.apache.org/jira/browse/BEAM-2177
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Reporter: Jean-Baptiste Onofré
Assignee: Jean-Baptiste Onofré


Now, we support "new" filesystems using schema. For instance, it's possible to 
do:

{code}
.apply(TextIO.write().to("hdfs://foo"))
{code}

Of course, if:

{code}
.apply(TextIO.write().to("/path/to/foo"))
{code}

works, users may be tempted to use:

{code}
.apply(TextIO.write().to("file://path/to/foo"))
{code}

which actually doesn't work today.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2178) Update hadoop-file-system README.md

2017-05-05 Thread JIRA
Jean-Baptiste Onofré created BEAM-2178:
--

 Summary: Update hadoop-file-system README.md
 Key: BEAM-2178
 URL: https://issues.apache.org/jira/browse/BEAM-2178
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-extensions
Reporter: Jean-Baptiste Onofré
Assignee: Jean-Baptiste Onofré


The {{README.md}} in {{hadoop-file-system}} is still the one from {{HdfsIO}} 
which is no more valid.

With my tests on HDFS, I'm preparing a PR to update this {{README.md}} 
explaining how to use Hadoop file system in Beam, and the configuration part 
(using command line options or {{HADOOP_CONF_DIR}}).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2179) Archetype generate-sources.sh should delete old files

2017-05-05 Thread JIRA
Jean-Baptiste Onofré created BEAM-2179:
--

 Summary: Archetype generate-sources.sh should delete old files
 Key: BEAM-2179
 URL: https://issues.apache.org/jira/browse/BEAM-2179
 Project: Beam
  Issue Type: Bug
  Components: project-management
Reporter: Jean-Baptiste Onofré
Assignee: Jean-Baptiste Onofré


Examples archetypes use {{generate-sources.sh}} to copy the sources from 
examples modules into the archetype resources. However, it's a simple copy and 
doesn't remove previously copied files (it's just an overwrite).
It means that if we remove a file in examples module, it's not updated in the 
archetypes (the file is still present).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2180) Upgrade Apex dependency to 3.6.0

2017-05-05 Thread Thomas Weise (JIRA)
Thomas Weise created BEAM-2180:
--

 Summary: Upgrade Apex dependency to 3.6.0
 Key: BEAM-2180
 URL: https://issues.apache.org/jira/browse/BEAM-2180
 Project: Beam
  Issue Type: Task
  Components: runner-apex
Reporter: Thomas Weise
Assignee: Thomas Weise
Priority: Minor
 Fix For: First stable release






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-593) Support unblocking run() in FlinkRunner and cancel() and waitUntilFinish() in FlinkRunnerResult

2017-05-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek reassigned BEAM-593:
-

Assignee: Aljoscha Krettek

> Support unblocking run() in FlinkRunner and cancel() and waitUntilFinish() in 
> FlinkRunnerResult
> ---
>
> Key: BEAM-593
> URL: https://issues.apache.org/jira/browse/BEAM-593
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Pei He
>Assignee: Aljoscha Krettek
>
> We introduced both functions to PipelineResult.
> Currently, both of them throw UnsupportedOperationException in Flink runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2912: [BEAM-2180] update Apex version to 3.6.0

2017-05-05 Thread tweise
GitHub user tweise opened a pull request:

https://github.com/apache/beam/pull/2912

[BEAM-2180] update Apex version to 3.6.0

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/beam apex-3.6.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2912


commit 0066ca8d9bca3a5e2106548f2736351913c7f681
Author: Thomas Weise 
Date:   2017-05-02T15:21:45Z

[BEAM-2180] update Apex version to 3.6.0




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2180) Upgrade Apex dependency to 3.6.0

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998379#comment-15998379
 ] 

ASF GitHub Bot commented on BEAM-2180:
--

GitHub user tweise opened a pull request:

https://github.com/apache/beam/pull/2912

[BEAM-2180] update Apex version to 3.6.0

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tweise/beam apex-3.6.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2912


commit 0066ca8d9bca3a5e2106548f2736351913c7f681
Author: Thomas Weise 
Date:   2017-05-02T15:21:45Z

[BEAM-2180] update Apex version to 3.6.0




> Upgrade Apex dependency to 3.6.0
> 
>
> Key: BEAM-2180
> URL: https://issues.apache.org/jira/browse/BEAM-2180
> Project: Beam
>  Issue Type: Task
>  Components: runner-apex
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
> Fix For: First stable release
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-454) Validate Pubsub Topic exists when reading

2017-05-05 Thread Borisa Zivkovic (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998422#comment-15998422
 ] 

Borisa Zivkovic commented on BEAM-454:
--

I can take this if you want. Few questions:

- would it make sense to check for existence of topic when writing data to it? 
Today if you try to write to topic that does not exist it will only be 
discovered once data starts flowing - not during pipeline construction time

- are there any e2e tests that use directly pubsub today?

- today if you try to read from topic that does not exist an exception would be 
thrown - I guess you only want to make exception more readable and user 
friendly, saying something like: unable to read from topic X because it does 
not exist, right?

> Validate Pubsub Topic exists when reading
> -
>
> Key: BEAM-454
> URL: https://issues.apache.org/jira/browse/BEAM-454
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Frances Perry
>Priority: Minor
>  Labels: newbie, starter
>
> When reading from Pubsub, we should validate the pubsub topic exists at graph 
> construction time (similar to the way we validate a BQ dataset and table 
> exist).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2181) Upgrade Bigtable dependency to 0.9.6.2

2017-05-05 Thread Solomon Duskis (JIRA)
Solomon Duskis created BEAM-2181:


 Summary: Upgrade Bigtable dependency to 0.9.6.2
 Key: BEAM-2181
 URL: https://issues.apache.org/jira/browse/BEAM-2181
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-gcp
Reporter: Solomon Duskis
Assignee: Daniel Halperin


Cloud Bigtable 0.9.6.2 has some fixes relating to:

1) Using dependencies for GCP protobuf objects rather than including generated 
artifacts directly in bigtable-protos
2) BulkMutation bug fixes
3) Auth token management
4) Using fewer grpc experimental features.

All are important in the context of beam, so the beam dependency should be 
upgraded.

One snag came up.  BigtableSession.isAlpnProviderEnabled() was removed in order 
to reduce the number of grpc experimental features.  
BigtableServiceImpl.tableExists() can no longer depend on 
isAlpnProviderEnabled().



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-2172) ProcessBundleHandlerTest.testCreatingAndProcessingDoFn fails at HEAD

2017-05-05 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh resolved BEAM-2172.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> ProcessBundleHandlerTest.testCreatingAndProcessingDoFn fails at HEAD
> 
>
> Key: BEAM-2172
> URL: https://issues.apache.org/jira/browse/BEAM-2172
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model-fn-api
>Reporter: Eugene Kirpichov
>Assignee: Thomas Groh
> Fix For: Not applicable
>
>
> java.lang.AssertionError: 
> Expected: iterable containing [ pane=PaneInfo.NO_FIRING}>]
>  but: item 0: was  timestamp=294247-01-09T04:00:54.775Z, pane=PaneInfo.NO_FIRING}>
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.junit.Assert.assertThat(Assert.java:956)
>   at org.junit.Assert.assertThat(Assert.java:923)
>   at 
> org.apache.beam.fn.harness.control.ProcessBundleHandlerTest.testCreatingAndProcessingDoFn(ProcessBundleHandlerTest.java:436)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-1742) UnboundedSource CheckpointMark should have more precise documentation

2017-05-05 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh resolved BEAM-1742.
---
   Resolution: Fixed
Fix Version/s: First stable release

> UnboundedSource CheckpointMark should have more precise documentation
> -
>
> Key: BEAM-1742
> URL: https://issues.apache.org/jira/browse/BEAM-1742
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
> Fix For: First stable release
>
>
> It should describe when a Checkpoint can committed, and should have more 
> precise definitions of when a runner is permitted to finalize it.
> Generally these definitions are more forgiving than what some checkpoints 
> assume, so existing Unbounded Sources should be audited to ensure they behave 
> correctly in this context.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2913: [BEAM-2179] Archetype generate-sources.sh cleanup t...

2017-05-05 Thread jbonofre
GitHub user jbonofre opened a pull request:

https://github.com/apache/beam/pull/2913

[BEAM-2179] Archetype generate-sources.sh cleanup the existing sources 
before rsync

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/beam 
BEAM-2179-ARCHETYPES-GENERATE-SOURCES

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2913.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2913






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-831) ParDo Chaining

2017-05-05 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998507#comment-15998507
 ] 

Thomas Weise commented on BEAM-831:
---

There are changes needed on top of the original PR, I will submit a new PR. 
Also I see that one of the unit tests does not terminate as expected when the 
chaining is enabled, that needs to be looked into also.

> ParDo Chaining
> --
>
> Key: BEAM-831
> URL: https://issues.apache.org/jira/browse/BEAM-831
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex
>Reporter: Thomas Weise
>Assignee: Chinmay Kolhatkar
> Fix For: First stable release
>
>
> Current state of Apex runner creates a plan that will place each operator in 
> a separate container (which would be processes when running on a YARN 
> cluster). Often the ParDo operators can be collocated in same thread or 
> container. Use Apex affinity/stream locality attributes for more efficient 
> execution plan.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2179) Archetype generate-sources.sh should delete old files

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998506#comment-15998506
 ] 

ASF GitHub Bot commented on BEAM-2179:
--

GitHub user jbonofre opened a pull request:

https://github.com/apache/beam/pull/2913

[BEAM-2179] Archetype generate-sources.sh cleanup the existing sources 
before rsync

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`.
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/beam 
BEAM-2179-ARCHETYPES-GENERATE-SOURCES

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2913.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2913






> Archetype generate-sources.sh should delete old files
> -
>
> Key: BEAM-2179
> URL: https://issues.apache.org/jira/browse/BEAM-2179
> Project: Beam
>  Issue Type: Bug
>  Components: project-management
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Examples archetypes use {{generate-sources.sh}} to copy the sources from 
> examples modules into the archetype resources. However, it's a simple copy 
> and doesn't remove previously copied files (it's just an overwrite).
> It means that if we remove a file in examples module, it's not updated in the 
> archetypes (the file is still present).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2914: [BEAM-2021] Convert Coder into an Abstract Static C...

2017-05-05 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2914

[BEAM-2021] Convert Coder into an Abstract Static Class

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam coder_abstract_class

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2914.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2914


commit 518335c0245a00f3740440aba074f66cf6f448d5
Author: Thomas Groh 
Date:   2017-05-05T16:20:10Z

Convert Coder into an Abstract Static Class




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2021) Fix Java's Coder class hierarchy

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998520#comment-15998520
 ] 

ASF GitHub Bot commented on BEAM-2021:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2914

[BEAM-2021] Convert Coder into an Abstract Static Class

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam coder_abstract_class

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2914.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2914


commit 518335c0245a00f3740440aba074f66cf6f448d5
Author: Thomas Groh 
Date:   2017-05-05T16:20:10Z

Convert Coder into an Abstract Static Class




> Fix Java's Coder class hierarchy
> 
>
> Key: BEAM-2021
> URL: https://issues.apache.org/jira/browse/BEAM-2021
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Affects Versions: First stable release
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This is thoroughly out of hand. In the runner API world, there are two paths:
> 1. URN plus component coders plus custom payload (in the form of component 
> coders alongside an SdkFunctionSpec)
> 2. Custom coder (a single URN) and payload is serialized Java. I think this 
> never has component coders.
> The other base classes have now been shown to be extraneous: they favor 
> saving ~3 lines of boilerplate for rarely written code at the cost of 
> readability. Instead they should just be dropped.
> The custom payload is an Any proto in the runner API. But tying the Coder 
> interface to proto would be unfortunate from a design perspective and cannot 
> be done anyhow due to dependency hell.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2915: [BEAM-593] Add non-blocking pipeline execution on F...

2017-05-05 Thread aljoscha
GitHub user aljoscha opened a pull request:

https://github.com/apache/beam/pull/2915

[BEAM-593] Add non-blocking pipeline execution on Flink Runner 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/beam jira-593-async-execute

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2915.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2915


commit f0cb644c684b6cbe744ba6f827700895e7dee4ee
Author: Aljoscha Krettek 
Date:   2017-05-05T12:13:01Z

Add FlinkPipelineExecutor with subclasses for batch and streaming

This replaces the old FlinkPipelineExecutionEnvironment which was
responsible for both batch and stream execution, which made the code
more complicated.

commit 3de09e8b060b1d17e5789b09ac0e056c53fbc7b0
Author: Aljoscha Krettek 
Date:   2017-05-05T16:36:29Z

[BEAM-593] Add non-blocking pipeline execution on Flink Runner

This directly uses the lower level interfaces to submit Flink jobs and
to query their state and accumulators.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-593) Support unblocking run() in FlinkRunner and cancel() and waitUntilFinish() in FlinkRunnerResult

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998540#comment-15998540
 ] 

ASF GitHub Bot commented on BEAM-593:
-

GitHub user aljoscha opened a pull request:

https://github.com/apache/beam/pull/2915

[BEAM-593] Add non-blocking pipeline execution on Flink Runner 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/beam jira-593-async-execute

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2915.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2915


commit f0cb644c684b6cbe744ba6f827700895e7dee4ee
Author: Aljoscha Krettek 
Date:   2017-05-05T12:13:01Z

Add FlinkPipelineExecutor with subclasses for batch and streaming

This replaces the old FlinkPipelineExecutionEnvironment which was
responsible for both batch and stream execution, which made the code
more complicated.

commit 3de09e8b060b1d17e5789b09ac0e056c53fbc7b0
Author: Aljoscha Krettek 
Date:   2017-05-05T16:36:29Z

[BEAM-593] Add non-blocking pipeline execution on Flink Runner

This directly uses the lower level interfaces to submit Flink jobs and
to query their state and accumulators.




> Support unblocking run() in FlinkRunner and cancel() and waitUntilFinish() in 
> FlinkRunnerResult
> ---
>
> Key: BEAM-593
> URL: https://issues.apache.org/jira/browse/BEAM-593
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Pei He
>Assignee: Aljoscha Krettek
>
> We introduced both functions to PipelineResult.
> Currently, both of them throw UnsupportedOperationException in Flink runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2182) AvroCoder is quite overconfident

2017-05-05 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2182:
-

 Summary: AvroCoder is quite overconfident
 Key: BEAM-2182
 URL: https://issues.apache.org/jira/browse/BEAM-2182
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Kenneth Knowles


AvoCoder claims it can create a schema for:

{code}
public static class X {
  public Thread t = Thread.currentThread();
}
{code}

The issue goes all the way to Avro, we think. To be investigated. Meanwhile, we 
need AvroCoder to say no to such things.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1582) ResumeFromCheckpointStreamingTest flakes with what appears as a second firing.

2017-05-05 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998557#comment-15998557
 ] 

Davor Bonaci commented on BEAM-1582:


+1 -- this makes total sense to me.

> ResumeFromCheckpointStreamingTest flakes with what appears as a second firing.
> --
>
> Key: BEAM-1582
> URL: https://issues.apache.org/jira/browse/BEAM-1582
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Amit Sela
>Priority: Minor
>  Labels: flake
>
> See: 
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_MavenInstall/org.apache.beam$beam-runners-spark/2788/testReport/junit/org.apache.beam.runners.spark.translation.streaming/ResumeFromCheckpointStreamingTest/testWithResume/
> After some digging in it appears that a second firing occurs (though only one 
> is expected) but it doesn't come from a stale state (state is empty before it 
> fires).
> Might be a retry happening for some reason, which is OK in terms of 
> fault-tolerance guarantees (at-least-once), but not so much in terms of flaky 
> tests. 
> I'm looking into this hoping to fix this ASAP.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (BEAM-1004) Autogenerate example archetypes as part of build process

2017-05-05 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-1004.
-
   Resolution: Fixed
 Assignee: Kenneth Knowles
Fix Version/s: Not applicable

> Autogenerate example archetypes as part of build process
> 
>
> Key: BEAM-1004
> URL: https://issues.apache.org/jira/browse/BEAM-1004
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: Not applicable
>
>
> Previously, the maven archetypes were manually curated. Recently, the 
> generation of the content for the example archetype was automated, and 
> another Java 8 example archetype created. The generated content is currently 
> checked into source control, but should be instead generated as part of the 
> build process.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/3] beam git commit: [BEAM-2016] Delete HdfsFileSource & Sink

2017-05-05 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 3bffe0e00 -> 610bda168


http://git-wip-us.apache.org/repos/asf/beam/blob/7512a73c/sdks/java/io/hadoop-file-system/src/test/java/org/apache/beam/sdk/io/hdfs/HDFSFileSinkTest.java
--
diff --git 
a/sdks/java/io/hadoop-file-system/src/test/java/org/apache/beam/sdk/io/hdfs/HDFSFileSinkTest.java
 
b/sdks/java/io/hadoop-file-system/src/test/java/org/apache/beam/sdk/io/hdfs/HDFSFileSinkTest.java
deleted file mode 100644
index 9fa6606..000
--- 
a/sdks/java/io/hadoop-file-system/src/test/java/org/apache/beam/sdk/io/hdfs/HDFSFileSinkTest.java
+++ /dev/null
@@ -1,172 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io.hdfs;
-
-import static org.junit.Assert.assertEquals;
-
-import com.google.common.base.MoreObjects;
-import java.io.File;
-import java.nio.charset.Charset;
-import java.nio.file.Files;
-import java.util.Collections;
-import java.util.List;
-import java.util.Objects;
-import java.util.UUID;
-import org.apache.avro.file.DataFileReader;
-import org.apache.avro.file.FileReader;
-import org.apache.avro.generic.GenericData;
-import org.apache.avro.generic.GenericDatumReader;
-import org.apache.avro.mapred.AvroKey;
-import org.apache.beam.sdk.coders.AvroCoder;
-import org.apache.beam.sdk.coders.DefaultCoder;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.options.PipelineOptionsFactory;
-import org.apache.beam.sdk.transforms.SerializableFunction;
-import org.apache.beam.sdk.values.KV;
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.io.NullWritable;
-import org.apache.hadoop.io.SequenceFile;
-import org.apache.hadoop.io.Text;
-import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
-
-/**
- * Tests for HDFSFileSinkTest.
- */
-public class HDFSFileSinkTest {
-
-  @Rule
-  public TemporaryFolder tmpFolder = new TemporaryFolder();
-
-  private final String part0 = "part-r-0";
-  private final String foobar = "foobar";
-
-  private  void doWrite(Sink sink,
-   PipelineOptions options,
-   Iterable toWrite) throws Exception {
-Sink.WriteOperation writeOperation =
-(Sink.WriteOperation) sink.createWriteOperation();
-Sink.Writer writer = writeOperation.createWriter(options);
-writer.openUnwindowed(UUID.randomUUID().toString(),  -1, -1);
-for (T t: toWrite) {
-  writer.write(t);
-}
-String writeResult = writer.close();
-writeOperation.finalize(Collections.singletonList(writeResult), options);
-  }
-
-  @Test
-  public void testWriteSingleRecord() throws Exception {
-PipelineOptions options = PipelineOptionsFactory.create();
-File file = tmpFolder.newFolder();
-
-HDFSFileSink sink =
-HDFSFileSink.to(
-file.toString(),
-SequenceFileOutputFormat.class,
-NullWritable.class,
-Text.class,
-new SerializableFunction>() {
-  @Override
-  public KV apply(String input) {
-return KV.of(NullWritable.get(), new Text(input));
-  }
-});
-
-doWrite(sink, options, Collections.singletonList(foobar));
-
-SequenceFile.Reader.Option opts =
-SequenceFile.Reader.file(new Path(file.toString(), part0));
-SequenceFile.Reader reader = new SequenceFile.Reader(new Configuration(), 
opts);
-assertEquals(NullWritable.class.getName(), reader.getKeyClassName());
-assertEquals(Text.class.getName(), reader.getValueClassName());
-NullWritable k = NullWritable.get();
-Text v = new Text();
-assertEquals(true, reader.next(k, v));
-assertEquals(NullWritable.get(), k);
-assertEquals(new Text(foobar), v);
-  }
-
-  @Test
-  public void testToText() throws Exception {
-PipelineOptions options = PipelineOptionsFactory.create();
-File file = tmpFolder.newFolder();
-
-HDFSFileSink sink = 
HDFSFileSink.toText(file.toString());
-
-doWrite(sink, options, Collec

[3/3] beam git commit: This closes #2908

2017-05-05 Thread dhalperi
This closes #2908


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/610bda16
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/610bda16
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/610bda16

Branch: refs/heads/master
Commit: 610bda1682d132a2fad958743ab173cbfab085cf
Parents: 3bffe0e 7512a73
Author: Dan Halperin 
Authored: Fri May 5 09:57:15 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 5 09:57:15 2017 -0700

--
 sdks/java/io/hadoop-file-system/README.md   |  43 --
 sdks/java/io/hadoop-file-system/pom.xml |  24 -
 .../apache/beam/sdk/io/hdfs/HDFSFileSink.java   | 478 --
 .../apache/beam/sdk/io/hdfs/HDFSFileSource.java | 625 ---
 .../java/org/apache/beam/sdk/io/hdfs/Sink.java  | 195 --
 .../org/apache/beam/sdk/io/hdfs/UGIHelper.java  |  38 --
 .../java/org/apache/beam/sdk/io/hdfs/Write.java | 588 -
 .../apache/beam/sdk/io/hdfs/package-info.java   |   3 +-
 .../beam/sdk/io/hdfs/HDFSFileSinkTest.java  | 172 -
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 231 ---
 10 files changed, 2 insertions(+), 2395 deletions(-)
--




[2/3] beam git commit: [BEAM-2016] Delete HdfsFileSource & Sink

2017-05-05 Thread dhalperi
[BEAM-2016] Delete HdfsFileSource & Sink


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7512a73c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7512a73c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7512a73c

Branch: refs/heads/master
Commit: 7512a73cf8aa2a527c89ecb054e92207411ed241
Parents: 3bffe0e
Author: Dan Halperin 
Authored: Thu May 4 18:33:16 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 5 08:48:04 2017 -0700

--
 sdks/java/io/hadoop-file-system/README.md   |  43 --
 sdks/java/io/hadoop-file-system/pom.xml |  24 -
 .../apache/beam/sdk/io/hdfs/HDFSFileSink.java   | 478 --
 .../apache/beam/sdk/io/hdfs/HDFSFileSource.java | 625 ---
 .../java/org/apache/beam/sdk/io/hdfs/Sink.java  | 195 --
 .../org/apache/beam/sdk/io/hdfs/UGIHelper.java  |  38 --
 .../java/org/apache/beam/sdk/io/hdfs/Write.java | 588 -
 .../apache/beam/sdk/io/hdfs/package-info.java   |   3 +-
 .../beam/sdk/io/hdfs/HDFSFileSinkTest.java  | 172 -
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 231 ---
 10 files changed, 2 insertions(+), 2395 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7512a73c/sdks/java/io/hadoop-file-system/README.md
--
diff --git a/sdks/java/io/hadoop-file-system/README.md 
b/sdks/java/io/hadoop-file-system/README.md
deleted file mode 100644
index 3a734f2..000
--- a/sdks/java/io/hadoop-file-system/README.md
+++ /dev/null
@@ -1,43 +0,0 @@
-
-
-# HDFS IO
-
-This library provides HDFS sources and sinks to make it possible to read and
-write Apache Hadoop file formats from Apache Beam pipelines.
-
-Currently, only the read path is implemented. A `HDFSFileSource` allows any
-Hadoop `FileInputFormat` to be read as a `PCollection`.
-
-A `HDFSFileSource` can be read from using the
-`org.apache.beam.sdk.io.Read` transform. For example:
-
-```java
-HDFSFileSource source = HDFSFileSource.from(path, MyInputFormat.class,
-  MyKey.class, MyValue.class);
-PCollection> records = pipeline.apply(Read.from(mySource));
-```
-
-Alternatively, the `readFrom` method is a convenience method that returns a 
read
-transform. For example:
-
-```java
-PCollection> records = 
pipeline.apply(HDFSFileSource.readFrom(path,
-  MyInputFormat.class, MyKey.class, MyValue.class));
-```

http://git-wip-us.apache.org/repos/asf/beam/blob/7512a73c/sdks/java/io/hadoop-file-system/pom.xml
--
diff --git a/sdks/java/io/hadoop-file-system/pom.xml 
b/sdks/java/io/hadoop-file-system/pom.xml
index 562277e..3b392c2 100644
--- a/sdks/java/io/hadoop-file-system/pom.xml
+++ b/sdks/java/io/hadoop-file-system/pom.xml
@@ -82,11 +82,6 @@
 
 
 
-  org.apache.beam
-  beam-sdks-java-io-hadoop-common
-
-
-
   com.fasterxml.jackson.core
   jackson-core
 
@@ -124,25 +119,6 @@
 
 
 
-  org.apache.avro
-  avro
-
-
-
-  org.apache.avro
-  avro-mapred
-  ${avro.version}
-  hadoop2
-  
-
-
-  org.mortbay.jetty
-  servlet-api
-
-  
-
-
-
   org.apache.hadoop
   hadoop-client
   provided

http://git-wip-us.apache.org/repos/asf/beam/blob/7512a73c/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
--
diff --git 
a/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
 
b/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
deleted file mode 100644
index aee73c4..000
--- 
a/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
+++ /dev/null
@@ -1,478 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io.hdfs;
-
-import static com.google.common.base.Preconditions.checkNot

[GitHub] beam pull request #2908: [BEAM-2016] Delete HdfsFileSource & Sink

2017-05-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2908


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2016) Delete HDFSFileSource/Sink

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998569#comment-15998569
 ] 

ASF GitHub Bot commented on BEAM-2016:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2908


> Delete HDFSFileSource/Sink
> --
>
> Key: BEAM-2016
> URL: https://issues.apache.org/jira/browse/BEAM-2016
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Reporter: Eugene Kirpichov
>Assignee: Jean-Baptiste Onofré
> Fix For: First stable release
>
>
> After https://issues.apache.org/jira/browse/BEAM-2005, delete 
> https://github.com/apache/beam/tree/master/sdks/java/io/hdfs since it'll be 
> redundant with the ability to read HDFS via other file-based IOs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2685

2017-05-05 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2853: [BEAM-1035][BEAM-1115] Support for new State and Ti...

2017-05-05 Thread JingsongLi
Github user JingsongLi closed the pull request at:

https://github.com/apache/beam/pull/2853


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1035) Support for new State API in SparkRunner

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998581#comment-15998581
 ] 

ASF GitHub Bot commented on BEAM-1035:
--

Github user JingsongLi closed the pull request at:

https://github.com/apache/beam/pull/2853


> Support for new State API in SparkRunner
> 
>
> Key: BEAM-1035
> URL: https://issues.apache.org/jira/browse/BEAM-1035
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-spark
>Reporter: Kenneth Knowles
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[3/3] beam git commit: This closes #2902: Activate WindowedWordCountIT on Apex runner

2017-05-05 Thread kenn
This closes #2902: Activate WindowedWordCountIT on Apex runner

  No parallelism for Apex WindowedWordCountIT
  Activate WindowedWordCountIT on Apex runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/625c996c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/625c996c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/625c996c

Branch: refs/heads/master
Commit: 625c996c47df3c125d57c68576d866e40ffcff2b
Parents: 610bda1 c1811a4
Author: Kenneth Knowles 
Authored: Fri May 5 10:10:19 2017 -0700
Committer: Kenneth Knowles 
Committed: Fri May 5 10:10:19 2017 -0700

--
 examples/java/pom.xml | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--




[2/3] beam git commit: No parallelism for Apex WindowedWordCountIT

2017-05-05 Thread kenn
No parallelism for Apex WindowedWordCountIT


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c1811a4a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c1811a4a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c1811a4a

Branch: refs/heads/master
Commit: c1811a4adbfb290226d183eac01bbe9bdbce2462
Parents: 513ef11
Author: Kenneth Knowles 
Authored: Thu May 4 21:34:54 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu May 4 21:34:54 2017 -0700

--
 examples/java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c1811a4a/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index 65625df..fb26135 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -305,7 +305,7 @@
 WordCountIT.java
 WindowedWordCountIT.java
   
-  all
+  none
   4
   
 



[1/3] beam git commit: Activate WindowedWordCountIT on Apex runner

2017-05-05 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 610bda168 -> 625c996c4


Activate WindowedWordCountIT on Apex runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/513ef114
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/513ef114
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/513ef114

Branch: refs/heads/master
Commit: 513ef1145aa6c563625ab4154296d5aa4fc6c17c
Parents: 8af5c28
Author: Kenneth Knowles 
Authored: Thu May 4 15:01:24 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu May 4 15:01:24 2017 -0700

--
 examples/java/pom.xml | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/513ef114/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index d673da2..65625df 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -303,6 +303,7 @@
 
   
 WordCountIT.java
+WindowedWordCountIT.java
   
   all
   4



[GitHub] beam pull request #2902: [BEAM-966] Activate WindowedWordCountIT on Apex run...

2017-05-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2902


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-966) Run WindowedWordCount Integration Test in Apex runner

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998588#comment-15998588
 ] 

ASF GitHub Bot commented on BEAM-966:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2902


> Run WindowedWordCount Integration Test in Apex runner
> -
>
> Key: BEAM-966
> URL: https://issues.apache.org/jira/browse/BEAM-966
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: Not applicable
>
>
> The purpose of running WindowedWordCountIT in Apex is to have a streaming 
> test pipeline running in Jenkins pre-commit using TestApexRunner.
> More discussion happened here:
> https://github.com/apache/incubator-beam/pull/1045#issuecomment-251531770



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-966) Run WindowedWordCount Integration Test in Apex runner

2017-05-05 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-966.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Run WindowedWordCount Integration Test in Apex runner
> -
>
> Key: BEAM-966
> URL: https://issues.apache.org/jira/browse/BEAM-966
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: Not applicable
>
>
> The purpose of running WindowedWordCountIT in Apex is to have a streaming 
> test pipeline running in Jenkins pre-commit using TestApexRunner.
> More discussion happened here:
> https://github.com/apache/incubator-beam/pull/1045#issuecomment-251531770



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2183) Maven-archetypes should depend on all Beam modules that their sources compile against

2017-05-05 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-2183:
-

 Summary: Maven-archetypes should depend on all Beam modules that 
their sources compile against
 Key: BEAM-2183
 URL: https://issues.apache.org/jira/browse/BEAM-2183
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Daniel Halperin
Assignee: Daniel Halperin
 Fix For: First stable release


The archetypes in {{sdks/java/maven-archetypes/*}} compile and run tests for 
source code that depends on various Beam modules. The dependencies are 
reflected in the *inner* pom.xml inside 
{{src/main/resources/archetype-resources/pom.xml}}.

The outer module needs to have the same Beam dependencies to force the Maven 
Reactor build order to process those modules first. Otherwise, a simple run of 
{{mvn install}} (e.g.,) , even without parallelism, may run the 
{{maven-archetypes}} step before it install the dependent modules. This means 
that the code may run against a previous nightly snapshot or other 
non-up-to-date-copy of those modules.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2916: [BEAM-2183] Archetypes: fix build order

2017-05-05 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2916

[BEAM-2183] Archetypes: fix build order

R: @davorbonaci 
CC: @jbonofre 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam maven-archetype-build-order

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2916.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2916


commit e8584253a9036f10e98666cb281e4bfa3ef362f5
Author: Dan Halperin 
Date:   2017-05-05T16:45:50Z

Archetypes: fix build order




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2183) Maven-archetypes should depend on all Beam modules that their sources compile against

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998597#comment-15998597
 ] 

ASF GitHub Bot commented on BEAM-2183:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2916

[BEAM-2183] Archetypes: fix build order

R: @davorbonaci 
CC: @jbonofre 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam maven-archetype-build-order

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2916.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2916


commit e8584253a9036f10e98666cb281e4bfa3ef362f5
Author: Dan Halperin 
Date:   2017-05-05T16:45:50Z

Archetypes: fix build order




> Maven-archetypes should depend on all Beam modules that their sources compile 
> against
> -
>
> Key: BEAM-2183
> URL: https://issues.apache.org/jira/browse/BEAM-2183
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> The archetypes in {{sdks/java/maven-archetypes/*}} compile and run tests for 
> source code that depends on various Beam modules. The dependencies are 
> reflected in the *inner* pom.xml inside 
> {{src/main/resources/archetype-resources/pom.xml}}.
> The outer module needs to have the same Beam dependencies to force the Maven 
> Reactor build order to process those modules first. Otherwise, a simple run 
> of {{mvn install}} (e.g.,) , even without parallelism, may run the 
> {{maven-archetypes}} step before it install the dependent modules. This means 
> that the code may run against a previous nightly snapshot or other 
> non-up-to-date-copy of those modules.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2184) Align OutputTimeFn API between Java, Python (and Runner API)

2017-05-05 Thread Robert Bradshaw (JIRA)
Robert Bradshaw created BEAM-2184:
-

 Summary: Align OutputTimeFn API between Java, Python (and Runner 
API)
 Key: BEAM-2184
 URL: https://issues.apache.org/jira/browse/BEAM-2184
 Project: Beam
  Issue Type: Bug
  Components: beam-model-runner-api, sdk-py
Reporter: Robert Bradshaw
Assignee: Kenneth Knowles
 Fix For: First stable release


See also https://issues.apache.org/jira/browse/BEAM-1327



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2917: [BEAM-2175] [BEAM-1115] Support for new State and T...

2017-05-05 Thread JingsongLi
GitHub user JingsongLi opened a pull request:

https://github.com/apache/beam/pull/2917

[BEAM-2175] [BEAM-1115] Support for new State and Timer API in Spark batch 
mode

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JingsongLi/beam BEAM-2175

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2917.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2917


commit 4d26c4134f9e8ec0ce71ed24511e2b244e3a110d
Author: JingsongLi 
Date:   2017-05-03T05:15:26Z

[BEAM-2175] [BEAM-1115] Support for new State and Timer API in Spark batch 
mode




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1327) Replace OutputTimeFn with enum

2017-05-05 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998603#comment-15998603
 ] 

Robert Bradshaw commented on BEAM-1327:
---

I had re-opened this to make sure the APIs between SDKs are aligned. Instead, 
I've created https://issues.apache.org/jira/browse/BEAM-2184 which may just 
involve changing the name in Python. 

> Replace OutputTimeFn with enum
> --
>
> Key: BEAM-1327
> URL: https://issues.apache.org/jira/browse/BEAM-1327
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The class {{OutputTimeFn}} is overkill for a Fn API crossing. There are only 
> three sensible values known: MIN, MAX, EOW. The interface is right for 
> implementing these, but the full class is left over from the days when there 
> was little cost to shipping new kinds of fns. An enum is concise.
> This can be done "mostly" backwards compatibly with legacy adapters in place, 
> but might be less confusing without them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2175) Support state API in Spark batch mode.

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998604#comment-15998604
 ] 

ASF GitHub Bot commented on BEAM-2175:
--

GitHub user JingsongLi opened a pull request:

https://github.com/apache/beam/pull/2917

[BEAM-2175] [BEAM-1115] Support for new State and Timer API in Spark batch 
mode

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JingsongLi/beam BEAM-2175

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2917.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2917


commit 4d26c4134f9e8ec0ce71ed24511e2b244e3a110d
Author: JingsongLi 
Date:   2017-05-03T05:15:26Z

[BEAM-2175] [BEAM-1115] Support for new State and Timer API in Spark batch 
mode




> Support state API in Spark batch mode.
> --
>
> Key: BEAM-2175
> URL: https://issues.apache.org/jira/browse/BEAM-2175
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Aviem Zur
>Assignee: Jingsong Lee
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2918: [BEAM-59] Fully delete IOChannelUtils

2017-05-05 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2918

[BEAM-59] Fully delete IOChannelUtils

With a new Dataflow worker that no longer depends on it.

Reminder for those watching: Dataflow worker breaks will become a thing of 
the past after
first stable API! (For the stable parts of the codebase at least.)

R: @tgroh 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam delete-iocu

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2918


commit e1cf5c69ea614f9284e6a5b601eb9fec489ee65e
Author: Dan Halperin 
Date:   2017-05-05T17:20:38Z

[BEAM-59] Fully delete IOChannelUtils

With a new Dataflow worker that no longer depends on it.

Reminder for those watching: Dataflow worker breaks will become a thing of 
the past after
first stable API! (For the stable parts of the codebase at least.)




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998608#comment-15998608
 ] 

ASF GitHub Bot commented on BEAM-59:


GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2918

[BEAM-59] Fully delete IOChannelUtils

With a new Dataflow worker that no longer depends on it.

Reminder for those watching: Dataflow worker breaks will become a thing of 
the past after
first stable API! (For the stable parts of the codebase at least.)

R: @tgroh 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam delete-iocu

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2918


commit e1cf5c69ea614f9284e6a5b601eb9fec489ee65e
Author: Dan Halperin 
Date:   2017-05-05T17:20:38Z

[BEAM-59] Fully delete IOChannelUtils

With a new Dataflow worker that no longer depends on it.

Reminder for those watching: Dataflow worker breaks will become a thing of 
the past after
first stable API! (For the stable parts of the codebase at least.)




> Switch from IOChannelFactory to FileSystems
> ---
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2919: [BEAM-2161] support string operators

2017-05-05 Thread xumingming
GitHub user xumingming opened a pull request:

https://github.com/apache/beam/pull/2919

[BEAM-2161] support string operators

https://issues.apache.org/jira/browse/BEAM-2161 @XuMingmin 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xumingming/beam 
BEAM-2161-support-string-operators

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2919.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2919


commit bc0506b28aaa41da0795facb78ab6dc936d87d3d
Author: James Xu 
Date:   2017-05-05T17:20:54Z

implement string operators




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2161) add support String functions

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998612#comment-15998612
 ] 

ASF GitHub Bot commented on BEAM-2161:
--

GitHub user xumingming opened a pull request:

https://github.com/apache/beam/pull/2919

[BEAM-2161] support string operators

https://issues.apache.org/jira/browse/BEAM-2161 @XuMingmin 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xumingming/beam 
BEAM-2161-support-string-operators

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2919.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2919


commit bc0506b28aaa41da0795facb78ab6dc936d87d3d
Author: James Xu 
Date:   2017-05-05T17:20:54Z

implement string operators




> add support String functions
> 
>
> Key: BEAM-2161
> URL: https://issues.apache.org/jira/browse/BEAM-2161
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: James Xu
>  Labels: bundle, split
>
> All functions are listed as below:
> {code}
> string || string
> CHAR_LENGTH(string)
> CHARACTER_LENGTH(string)
> UPPER(string)
> LOWER(string)
> POSITION(string1 IN string2)
> POSITION(string1 IN string2 FROM integer)
> TRIM( { BOTH | LEADING | TRAILING } string1 FROM string2)
> OVERLAY(string1 PLACING string2 FROM integer [ FOR integer2 ])
> SUBSTRING(string FROM integer)
> SUBSTRING(string FROM integer FOR integer)
> INITCAP(string)
> {code}
> see 
> https://calcite.apache.org/docs/reference.html#character-string-operators-and-functions
>  for more information.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2686

2017-05-05 Thread Apache Jenkins Server
See 




[4/4] beam git commit: This closes #2764

2017-05-05 Thread tgroh
This closes #2764


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6d859002
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6d859002
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6d859002

Branch: refs/heads/master
Commit: 6d8590028d8578ecd4772af9ddc5fc960d292da2
Parents: 625c996 2ac2a34
Author: Thomas Groh 
Authored: Fri May 5 10:27:59 2017 -0700
Committer: Thomas Groh 
Committed: Fri May 5 10:27:59 2017 -0700

--
 .../beam/examples/WindowedWordCountIT.java  |  1 -
 .../game/utils/WriteWindowedToBigQuery.java |  1 -
 .../apache/beam/runners/apex/ApexRunner.java|  2 +-
 .../apex/translation/TransformTranslator.java   |  1 -
 .../apex/translation/TranslationContext.java|  2 +-
 .../DeduplicatedFlattenFactory.java |  2 +-
 .../EmptyFlattenAsCreateFactory.java|  2 +-
 .../core/construction/PTransformMatchers.java   |  2 +-
 .../construction/PTransformReplacements.java|  2 +-
 .../runners/core/construction/PTransforms.java  |  2 +-
 .../core/construction/PrimitiveCreate.java  |  2 +-
 .../core/construction/SdkComponents.java|  2 +-
 .../UnsupportedOverrideFactory.java |  2 +-
 .../EmptyFlattenAsCreateFactoryTest.java|  2 +-
 .../construction/PTransformMatchersTest.java|  2 +-
 .../PTransformReplacementsTest.java |  2 +-
 .../core/construction/PTransformsTest.java  |  2 +-
 .../core/construction/SdkComponentsTest.java|  2 +-
 .../SingleInputOutputOverrideFactoryTest.java   |  2 +-
 .../direct/BoundedReadEvaluatorFactory.java |  2 +-
 .../beam/runners/direct/CommittedResult.java|  2 +-
 .../beam/runners/direct/CompletionCallback.java |  2 +-
 ...ectGBKIntoKeyedWorkItemsOverrideFactory.java |  2 +-
 .../apache/beam/runners/direct/DirectGraph.java |  2 +-
 .../beam/runners/direct/DirectGraphVisitor.java |  2 +-
 .../direct/DirectGroupByKeyOverrideFactory.java |  2 +-
 .../beam/runners/direct/EmptyInputProvider.java |  2 +-
 .../beam/runners/direct/EvaluationContext.java  |  2 +-
 .../direct/ExecutorServiceParallelExecutor.java |  2 +-
 .../runners/direct/FlattenEvaluatorFactory.java |  2 +-
 .../GroupAlsoByWindowEvaluatorFactory.java  |  2 +-
 .../direct/GroupByKeyOnlyEvaluatorFactory.java  |  2 +-
 .../direct/ImmutabilityEnforcementFactory.java  |  2 +-
 .../beam/runners/direct/ModelEnforcement.java   |  2 +-
 .../runners/direct/ModelEnforcementFactory.java |  2 +-
 .../beam/runners/direct/ParDoEvaluator.java |  2 +-
 .../runners/direct/ParDoEvaluatorFactory.java   |  2 +-
 .../direct/ParDoMultiOverrideFactory.java   |  2 +-
 .../direct/PassthroughTransformEvaluator.java   |  2 +-
 .../beam/runners/direct/PipelineExecutor.java   |  2 +-
 .../beam/runners/direct/RootInputProvider.java  |  2 +-
 .../runners/direct/RootProviderRegistry.java|  2 +-
 ...littableProcessElementsEvaluatorFactory.java |  2 +-
 .../direct/StatefulParDoEvaluatorFactory.java   |  2 +-
 .../apache/beam/runners/direct/StepAndKey.java  |  2 +-
 .../runners/direct/StepTransformResult.java |  2 +-
 .../direct/TestStreamEvaluatorFactory.java  |  2 +-
 .../direct/TransformEvaluatorFactory.java   |  2 +-
 .../direct/TransformEvaluatorRegistry.java  |  2 +-
 .../beam/runners/direct/TransformExecutor.java  |  2 +-
 .../beam/runners/direct/TransformResult.java|  2 +-
 .../direct/UnboundedReadEvaluatorFactory.java   |  2 +-
 .../runners/direct/ViewEvaluatorFactory.java|  2 +-
 .../runners/direct/ViewOverrideFactory.java |  2 +-
 .../direct/WatermarkCallbackExecutor.java   |  2 +-
 .../beam/runners/direct/WatermarkManager.java   |  2 +-
 .../runners/direct/WindowEvaluatorFactory.java  |  2 +-
 .../direct/WriteWithShardingFactory.java|  2 +-
 .../direct/BoundedReadEvaluatorFactoryTest.java |  2 +-
 .../runners/direct/CommittedResultTest.java |  2 +-
 .../runners/direct/DirectGraphVisitorTest.java  |  2 +-
 .../beam/runners/direct/DirectGraphs.java   |  2 +-
 .../DirectGroupByKeyOverrideFactoryTest.java|  2 +-
 .../runners/direct/EvaluationContextTest.java   |  2 +-
 .../direct/FlattenEvaluatorFactoryTest.java |  2 +-
 .../ImmutabilityEnforcementFactoryTest.java |  2 +-
 .../beam/runners/direct/ParDoEvaluatorTest.java |  2 +-
 .../StatefulParDoEvaluatorFactoryTest.java  |  2 +-
 .../runners/direct/StepTransformResultTest.java |  2 +-
 .../direct/TestStreamEvaluatorFactoryTest.java  |  2 +-
 .../runners/direct/TransformExecutorTest.java   |  2 +-
 .../UnboundedReadEvaluatorFactoryTest.java  |  2 +-
 .../direct/ViewEvaluatorFactoryTest.java|  2 +-
 .../runners/direct/ViewOverrideFactoryTest.java |  2 +-
 .../direct/WatermarkCallbackExecutorTest.java   |  2 +-
 .../runners/direct/WatermarkManagerTest.java|  2 +-
 .../direct/WriteWithShardingFactoryTest.java|  2 +-
 .../flink/FlinkBatchTranslationContext.

[2/4] beam git commit: Move AppliedPTransform into the Runners package

2017-05-05 Thread tgroh
Move AppliedPTransform into the Runners package

This is a read-only view of the TransformHierarchy.Node, and is intended
for use by Runner authors.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2ac2a34b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2ac2a34b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2ac2a34b

Branch: refs/heads/master
Commit: 2ac2a34b90b8d03ef71b01a75c14a9b16fa482b9
Parents: c2f4751
Author: Thomas Groh 
Authored: Thu May 4 14:07:36 2017 -0700
Committer: Thomas Groh 
Committed: Fri May 5 10:27:58 2017 -0700

--
 .../apache/beam/runners/apex/ApexRunner.java|  2 +-
 .../apex/translation/TranslationContext.java|  2 +-
 .../DeduplicatedFlattenFactory.java |  2 +-
 .../EmptyFlattenAsCreateFactory.java|  2 +-
 .../core/construction/PTransformMatchers.java   |  2 +-
 .../construction/PTransformReplacements.java|  2 +-
 .../runners/core/construction/PTransforms.java  |  2 +-
 .../core/construction/PrimitiveCreate.java  |  2 +-
 .../core/construction/SdkComponents.java|  2 +-
 .../UnsupportedOverrideFactory.java |  2 +-
 .../EmptyFlattenAsCreateFactoryTest.java|  2 +-
 .../construction/PTransformMatchersTest.java|  2 +-
 .../PTransformReplacementsTest.java |  2 +-
 .../core/construction/PTransformsTest.java  |  2 +-
 .../core/construction/SdkComponentsTest.java|  2 +-
 .../SingleInputOutputOverrideFactoryTest.java   |  2 +-
 .../direct/BoundedReadEvaluatorFactory.java |  2 +-
 .../beam/runners/direct/CommittedResult.java|  2 +-
 .../beam/runners/direct/CompletionCallback.java |  2 +-
 ...ectGBKIntoKeyedWorkItemsOverrideFactory.java |  2 +-
 .../apache/beam/runners/direct/DirectGraph.java |  2 +-
 .../beam/runners/direct/DirectGraphVisitor.java |  2 +-
 .../direct/DirectGroupByKeyOverrideFactory.java |  2 +-
 .../beam/runners/direct/EmptyInputProvider.java |  2 +-
 .../beam/runners/direct/EvaluationContext.java  |  2 +-
 .../direct/ExecutorServiceParallelExecutor.java |  2 +-
 .../runners/direct/FlattenEvaluatorFactory.java |  2 +-
 .../GroupAlsoByWindowEvaluatorFactory.java  |  2 +-
 .../direct/GroupByKeyOnlyEvaluatorFactory.java  |  2 +-
 .../direct/ImmutabilityEnforcementFactory.java  |  2 +-
 .../beam/runners/direct/ModelEnforcement.java   |  2 +-
 .../runners/direct/ModelEnforcementFactory.java |  2 +-
 .../beam/runners/direct/ParDoEvaluator.java |  2 +-
 .../runners/direct/ParDoEvaluatorFactory.java   |  2 +-
 .../direct/ParDoMultiOverrideFactory.java   |  2 +-
 .../direct/PassthroughTransformEvaluator.java   |  2 +-
 .../beam/runners/direct/PipelineExecutor.java   |  2 +-
 .../beam/runners/direct/RootInputProvider.java  |  2 +-
 .../runners/direct/RootProviderRegistry.java|  2 +-
 ...littableProcessElementsEvaluatorFactory.java |  2 +-
 .../direct/StatefulParDoEvaluatorFactory.java   |  2 +-
 .../apache/beam/runners/direct/StepAndKey.java  |  2 +-
 .../runners/direct/StepTransformResult.java |  2 +-
 .../direct/TestStreamEvaluatorFactory.java  |  2 +-
 .../direct/TransformEvaluatorFactory.java   |  2 +-
 .../direct/TransformEvaluatorRegistry.java  |  2 +-
 .../beam/runners/direct/TransformExecutor.java  |  2 +-
 .../beam/runners/direct/TransformResult.java|  2 +-
 .../direct/UnboundedReadEvaluatorFactory.java   |  2 +-
 .../runners/direct/ViewEvaluatorFactory.java|  2 +-
 .../runners/direct/ViewOverrideFactory.java |  2 +-
 .../direct/WatermarkCallbackExecutor.java   |  2 +-
 .../beam/runners/direct/WatermarkManager.java   |  2 +-
 .../runners/direct/WindowEvaluatorFactory.java  |  2 +-
 .../direct/WriteWithShardingFactory.java|  2 +-
 .../direct/BoundedReadEvaluatorFactoryTest.java |  2 +-
 .../runners/direct/CommittedResultTest.java |  2 +-
 .../runners/direct/DirectGraphVisitorTest.java  |  2 +-
 .../beam/runners/direct/DirectGraphs.java   |  2 +-
 .../DirectGroupByKeyOverrideFactoryTest.java|  2 +-
 .../runners/direct/EvaluationContextTest.java   |  2 +-
 .../direct/FlattenEvaluatorFactoryTest.java |  2 +-
 .../ImmutabilityEnforcementFactoryTest.java |  2 +-
 .../beam/runners/direct/ParDoEvaluatorTest.java |  2 +-
 .../StatefulParDoEvaluatorFactoryTest.java  |  2 +-
 .../runners/direct/StepTransformResultTest.java |  2 +-
 .../direct/TestStreamEvaluatorFactoryTest.java  |  2 +-
 .../runners/direct/TransformExecutorTest.java   |  2 +-
 .../UnboundedReadEvaluatorFactoryTest.java  |  2 +-
 .../direct/ViewEvaluatorFactoryTest.java|  2 +-
 .../runners/direct/ViewOverrideFactoryTest.java |  2 +-
 .../direct/WatermarkCallbackExecutorTest.java   |  2 +-
 .../runners/direct/WatermarkManagerTest.java|  2 +-
 .../direct/WriteWithShardingFactoryTest.java|  2 +-
 .../flink/FlinkBatchTranslationContext.java |  2 +-
 .../flink/FlinkStreamingPip

[3/4] beam git commit: Run Optimize Imports on the Repository

2017-05-05 Thread tgroh
Run Optimize Imports on the Repository


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c2f47515
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c2f47515
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c2f47515

Branch: refs/heads/master
Commit: c2f47515693ef8046b3a6bb8a892af7b5a8df59b
Parents: 625c996
Author: Thomas Groh 
Authored: Thu May 4 13:46:24 2017 -0700
Committer: Thomas Groh 
Committed: Fri May 5 10:27:58 2017 -0700

--
 .../src/test/java/org/apache/beam/examples/WindowedWordCountIT.java | 1 -
 .../beam/examples/complete/game/utils/WriteWindowedToBigQuery.java  | 1 -
 .../apache/beam/runners/apex/translation/TransformTranslator.java   | 1 -
 .../org/apache/beam/runners/dataflow/util/DataflowTemplateJob.java  | 1 -
 .../main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java   | 1 -
 5 files changed, 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c2f47515/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
--
diff --git 
a/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java 
b/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
index b5eddb5..f7e35c0 100644
--- 
a/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
+++ 
b/examples/java/src/test/java/org/apache/beam/examples/WindowedWordCountIT.java
@@ -30,7 +30,6 @@ import java.util.List;
 import java.util.SortedMap;
 import java.util.TreeMap;
 import java.util.concurrent.ThreadLocalRandom;
-
 import org.apache.beam.examples.common.ExampleUtils;
 import org.apache.beam.examples.common.WriteOneFilePerWindow.PerWindowFiles;
 import org.apache.beam.sdk.PipelineResult;

http://git-wip-us.apache.org/repos/asf/beam/blob/c2f47515/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
--
diff --git 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
index deb9db2..db3319c 100644
--- 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
+++ 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
@@ -19,7 +19,6 @@ package org.apache.beam.examples.complete.game.utils;
 
 import com.google.api.services.bigquery.model.TableRow;
 import java.util.Map;
-
 import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
 import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
 import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;

http://git-wip-us.apache.org/repos/asf/beam/blob/c2f47515/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TransformTranslator.java
--
diff --git 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TransformTranslator.java
 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TransformTranslator.java
index 49ff49b..c924b2e 100644
--- 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TransformTranslator.java
+++ 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TransformTranslator.java
@@ -18,7 +18,6 @@
 
 package org.apache.beam.runners.apex.translation;
 
-
 import java.io.Serializable;
 import org.apache.beam.sdk.transforms.PTransform;
 

http://git-wip-us.apache.org/repos/asf/beam/blob/c2f47515/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DataflowTemplateJob.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DataflowTemplateJob.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DataflowTemplateJob.java
index 2937184..b464462 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DataflowTemplateJob.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DataflowTemplateJob.java
@@ -19,7 +19,6 @@ import com.google.api.client.util.Sleeper;
 import com.google.common.annotations.VisibleForTesting;
 import javax.annotation.Nullable;
 import org.apache.beam.runners.dataflow.DataflowPipelineJob;
-import org.apache.beam.sdk.PipelineResult.State;
 import org.joda.time.Duration;
 
 /**

http://git-wip-us.apache.org/repos/asf/beam/blob/c2f47515/sdks/java/io/google-c

[1/4] beam git commit: Move AppliedPTransform into the Runners package

2017-05-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 625c996c4 -> 6d8590028


http://git-wip-us.apache.org/repos/asf/beam/blob/2ac2a34b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ViewOverrideFactoryTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ViewOverrideFactoryTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ViewOverrideFactoryTest.java
index eda00a7..024e15c 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ViewOverrideFactoryTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ViewOverrideFactoryTest.java
@@ -30,11 +30,11 @@ import java.util.Set;
 import java.util.concurrent.atomic.AtomicBoolean;
 import org.apache.beam.runners.direct.ViewOverrideFactory.WriteView;
 import org.apache.beam.sdk.Pipeline.PipelineVisitor;
+import org.apache.beam.sdk.runners.AppliedPTransform;
 import 
org.apache.beam.sdk.runners.PTransformOverrideFactory.PTransformReplacement;
 import org.apache.beam.sdk.runners.TransformHierarchy.Node;
 import org.apache.beam.sdk.testing.PAssert;
 import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.AppliedPTransform;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.ParDo;

http://git-wip-us.apache.org/repos/asf/beam/blob/2ac2a34b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkCallbackExecutorTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkCallbackExecutorTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkCallbackExecutorTest.java
index f4a53da..b667346 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkCallbackExecutorTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkCallbackExecutorTest.java
@@ -23,8 +23,8 @@ import static org.junit.Assert.assertThat;
 import java.util.concurrent.CountDownLatch;
 import java.util.concurrent.Executors;
 import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.runners.AppliedPTransform;
 import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.AppliedPTransform;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.Sum;
 import org.apache.beam.sdk.transforms.windowing.BoundedWindow;

http://git-wip-us.apache.org/repos/asf/beam/blob/2ac2a34b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkManagerTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkManagerTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkManagerTest.java
index e1e6ab5..9528ac9 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkManagerTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WatermarkManagerTest.java
@@ -43,9 +43,9 @@ import 
org.apache.beam.runners.direct.WatermarkManager.TransformWatermarks;
 import org.apache.beam.sdk.coders.ByteArrayCoder;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.coders.VarLongCoder;
+import org.apache.beam.sdk.runners.AppliedPTransform;
 import org.apache.beam.sdk.state.TimeDomain;
 import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.AppliedPTransform;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.Filter;

http://git-wip-us.apache.org/repos/asf/beam/blob/2ac2a34b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WriteWithShardingFactoryTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WriteWithShardingFactoryTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WriteWithShardingFactoryTest.java
index a2b0c5c..5a2a328 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WriteWithShardingFactoryTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/WriteWithShardingFactoryTest.java
@@ -48,8 +48,8 @@ import org.apache.beam.sdk.io.WriteFiles;
 import org.apache.beam.sdk.io.fs.MatchResult.Metadata;
 import org.apache.beam.sdk.io.fs.ResourceId;
 import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider;
+import org.apache.beam.sdk.runners.AppliedPTransform;
 import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.AppliedPTransform;
 import org.apache.beam.sdk.

[GitHub] beam pull request #2764: Move AppliedPTransform into the Runners Module

2017-05-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2764


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Convert Coder into an Abstract Static Class

2017-05-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 6d8590028 -> 69846f500


Convert Coder into an Abstract Static Class


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/72be5c71
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/72be5c71
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/72be5c71

Branch: refs/heads/master
Commit: 72be5c71561bf552c25e2de2b0d21aa374b17ec0
Parents: 6d85900
Author: Thomas Groh 
Authored: Fri May 5 09:20:10 2017 -0700
Committer: Thomas Groh 
Committed: Fri May 5 10:34:56 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  2 +-
 .../java/org/apache/beam/sdk/coders/Coder.java  | 36 ++--
 .../apache/beam/sdk/coders/CoderFactories.java  | 35 ++-
 .../apache/beam/sdk/coders/StructuredCoder.java |  2 +-
 .../DoFnSignaturesSplittableDoFnTest.java   |  4 +--
 5 files changed, 40 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/72be5c71/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index e21b6de..c7142fe 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170504-2
+
beam-master-20170505-wd-2914
 
1
 
6
   

http://git-wip-us.apache.org/repos/asf/beam/blob/72be5c71/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
index c923719..169e448 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
@@ -57,10 +57,10 @@ import org.apache.beam.sdk.values.TypeDescriptor;
  *
  * @param  the type of the values being transcoded
  */
-public interface Coder extends Serializable {
+public abstract class Coder implements Serializable {
   /** The context in which encoding or decoding is being done. */
   @Deprecated
-  class Context {
+  public static class Context {
 /**
  * The outer context: the value being encoded or decoded takes
  * up the remainder of the record/stream contents.
@@ -118,7 +118,7 @@ public interface Coder extends Serializable {
* for some reason
* @throws CoderException if the value could not be encoded for some reason
*/
-  void encode(T value, OutputStream outStream)
+  public abstract void encode(T value, OutputStream outStream)
   throws CoderException, IOException;
 
   /**
@@ -130,7 +130,7 @@ public interface Coder extends Serializable {
* @throws CoderException if the value could not be encoded for some reason
*/
   @Deprecated
-  void encodeOuter(T value, OutputStream outStream)
+  public abstract void encodeOuter(T value, OutputStream outStream)
   throws CoderException, IOException;
 
   /**
@@ -142,7 +142,7 @@ public interface Coder extends Serializable {
* @throws CoderException if the value could not be encoded for some reason
*/
   @Deprecated
-  void encode(T value, OutputStream outStream, Context context)
+  public abstract void encode(T value, OutputStream outStream, Context context)
   throws CoderException, IOException;
 
   /**
@@ -153,7 +153,7 @@ public interface Coder extends Serializable {
* for some reason
* @throws CoderException if the value could not be decoded for some reason
*/
-  T decode(InputStream inStream) throws CoderException, IOException;
+  public abstract T decode(InputStream inStream) throws CoderException, 
IOException;
 
   /**
* Decodes a value of type {@code T} from the given input stream in
@@ -164,7 +164,7 @@ public interface Coder extends Serializable {
* @throws CoderException if the value could not be decoded for some reason
*/
   @Deprecated
-  T decodeOuter(InputStream inStream) throws CoderException, IOException;
+  public abstract T decodeOuter(InputStream inStream) throws CoderException, 
IOException;
 
   /**
* Decodes a value of type {@code T} from the given input stream in
@@ -175,7 +175,7 @@ public interface Coder extends Serializable {
* @throws CoderException if the value could not be decoded for some reason
*/
   @Deprecated
-  T decode(InputStream inStream, Context context)
+  public abstract T decode(InputStream inStream, Context context)
   throws CoderException, IOException;
 
   /**
@@ -184,7 +184,7 @@ public interface Coder extends Serializable {
* returns {@code null} if this cannot be done or this is no

[2/2] beam git commit: This closes #2914

2017-05-05 Thread tgroh
This closes #2914


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/69846f50
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/69846f50
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/69846f50

Branch: refs/heads/master
Commit: 69846f5006a509d91bb0c5b316a288711e17f716
Parents: 6d85900 72be5c7
Author: Thomas Groh 
Authored: Fri May 5 10:34:57 2017 -0700
Committer: Thomas Groh 
Committed: Fri May 5 10:34:57 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  2 +-
 .../java/org/apache/beam/sdk/coders/Coder.java  | 36 ++--
 .../apache/beam/sdk/coders/CoderFactories.java  | 35 ++-
 .../apache/beam/sdk/coders/StructuredCoder.java |  2 +-
 .../DoFnSignaturesSplittableDoFnTest.java   |  4 +--
 5 files changed, 40 insertions(+), 39 deletions(-)
--




[GitHub] beam pull request #2914: [BEAM-2021] Convert Coder into an Abstract Static C...

2017-05-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2914


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2021) Fix Java's Coder class hierarchy

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998632#comment-15998632
 ] 

ASF GitHub Bot commented on BEAM-2021:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2914


> Fix Java's Coder class hierarchy
> 
>
> Key: BEAM-2021
> URL: https://issues.apache.org/jira/browse/BEAM-2021
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Affects Versions: First stable release
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This is thoroughly out of hand. In the runner API world, there are two paths:
> 1. URN plus component coders plus custom payload (in the form of component 
> coders alongside an SdkFunctionSpec)
> 2. Custom coder (a single URN) and payload is serialized Java. I think this 
> never has component coders.
> The other base classes have now been shown to be extraneous: they favor 
> saving ~3 lines of boilerplate for rarely written code at the cost of 
> readability. Instead they should just be dropped.
> The custom payload is an Any proto in the runner API. But tying the Coder 
> interface to proto would be unfortunate from a design perspective and cannot 
> be done anyhow due to dependency hell.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2920: [BEAM-1340] Move PipelineRunner to user-facing pack...

2017-05-05 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2920

[BEAM-1340] Move PipelineRunner to user-facing package; make sdk.runners 
package internal

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam PipelineRunner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2920.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2920


commit 85e856a0175565293723b32ea8fd4d519fe94778
Author: Kenneth Knowles 
Date:   2017-05-05T17:33:11Z

Exclude sdk.runners from javadoc

commit be00d3f253f291c7c85920b447bade71d657ca56
Author: Kenneth Knowles 
Date:   2017-05-05T17:36:40Z

Move PipelineRunner to toplevel sdk package (automated refactor)

This allows excluding the runner-author-only sdk.runners package from the
public javadoc.

commit f7e79d9ad1ae059a45ac993f694a204773763837
Author: Kenneth Knowles 
Date:   2017-05-05T17:39:26Z

Javadoc that the sdk.runners package is internal




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1340) Remove or make private public bits of the SDK that shouldn't be public

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998644#comment-15998644
 ] 

ASF GitHub Bot commented on BEAM-1340:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2920

[BEAM-1340] Move PipelineRunner to user-facing package; make sdk.runners 
package internal

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam PipelineRunner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2920.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2920


commit 85e856a0175565293723b32ea8fd4d519fe94778
Author: Kenneth Knowles 
Date:   2017-05-05T17:33:11Z

Exclude sdk.runners from javadoc

commit be00d3f253f291c7c85920b447bade71d657ca56
Author: Kenneth Knowles 
Date:   2017-05-05T17:36:40Z

Move PipelineRunner to toplevel sdk package (automated refactor)

This allows excluding the runner-author-only sdk.runners package from the
public javadoc.

commit f7e79d9ad1ae059a45ac993f694a204773763837
Author: Kenneth Knowles 
Date:   2017-05-05T17:39:26Z

Javadoc that the sdk.runners package is internal




> Remove or make private public bits of the SDK that shouldn't be public
> --
>
> Key: BEAM-1340
> URL: https://issues.apache.org/jira/browse/BEAM-1340
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions
>Reporter: Kenneth Knowles
>Priority: Blocker
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This JIRA is for the many small changes that do not merit their own JIRA 
> towards getting the SDK's API surface right. For example, removal of 
> `DoFn.InputProvider` and `DoFn.OutputReceiver`.
> While the above is not quite backwards incompatible, succeeding at this task 
> surely will be.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2184) Align OutputTimeFn API between Java, Python (and Runner API)

2017-05-05 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998649#comment-15998649
 ] 

Kenneth Knowles commented on BEAM-2184:
---

Let the bikeshed commence! I don't like {{OutputTimeFn}} _or_ {{OutputTime}} 
_or_ {{TimestampCombiner}} even though I am responsible for all three :-)

> Align OutputTimeFn API between Java, Python (and Runner API)
> 
>
> Key: BEAM-2184
> URL: https://issues.apache.org/jira/browse/BEAM-2184
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model-runner-api, sdk-py
>Reporter: Robert Bradshaw
>Assignee: Kenneth Knowles
> Fix For: First stable release
>
>
> See also https://issues.apache.org/jira/browse/BEAM-1327



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1283) DoFn finishBundle should be required to specify the window for output

2017-05-05 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998653#comment-15998653
 ] 

Robert Bradshaw commented on BEAM-1283:
---

I think we should get in the habit of having a single owner across all SDKs to 
promote consistency. There may be cases were implementation is delegated across 
subtasks/tickets, but model changes should not be SDK-specific. 

> DoFn finishBundle should be required to specify the window for output
> -
>
> Key: BEAM-1283
> URL: https://issues.apache.org/jira/browse/BEAM-1283
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, sdk-java-core, sdk-py
>Reporter: Kenneth Knowles
>Assignee: Sourabh Bajaj
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The spec is here in Javadoc: 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L128
> "If invoked from {{@StartBundle}} or {{@FinishBundle}}, this will attempt to 
> use the {{WindowFn}} of the input {{PCollection}} to determine what windows 
> the element should be in, throwing an exception if the {{WindowFn}} attempts 
> to access any information about the input element. The output element will 
> have a timestamp of negative infinity."
> This is a collection of caveats that make this method not always technically 
> wrong, but quite a mess. Ideas that reasonable folks have suggested lately:
>  - The {{WindowFn}} cannot actually be applied because {{WindowFn}} is 
> allowed to see the element type. The spec just avoids this by limiting which 
> {{WindowFn}} can be used.
>  - There is no natural output timestamp, so it should always be provided. The 
> spec avoids this by specifying an arbitrary and fairly useless timestamp.
>  - If it is a merging {{WindowFn}} like sessions that has already been merged 
> then you'll just have a bogus proto window regardless of explicit timestamp 
> or not.
> The use cases for these methods are best addressed by state plus window 
> expiry callback, so we should revisit this spec and probably just wipe it.
> There are some rare case where you might need to output from {{FinishBundle}} 
> in a way that is not _actually_ sensitive to bundling (perhaps modulo some 
> downstream notion of equivalence) in which case you had better know what 
> window you are outputting to. Often it should be the global window.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1283) DoFn finishBundle should be required to specify the window for output

2017-05-05 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998671#comment-15998671
 ] 

Kenneth Knowles commented on BEAM-1283:
---

Both sounds good to me.

> DoFn finishBundle should be required to specify the window for output
> -
>
> Key: BEAM-1283
> URL: https://issues.apache.org/jira/browse/BEAM-1283
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, sdk-java-core, sdk-py
>Reporter: Kenneth Knowles
>Assignee: Sourabh Bajaj
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The spec is here in Javadoc: 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L128
> "If invoked from {{@StartBundle}} or {{@FinishBundle}}, this will attempt to 
> use the {{WindowFn}} of the input {{PCollection}} to determine what windows 
> the element should be in, throwing an exception if the {{WindowFn}} attempts 
> to access any information about the input element. The output element will 
> have a timestamp of negative infinity."
> This is a collection of caveats that make this method not always technically 
> wrong, but quite a mess. Ideas that reasonable folks have suggested lately:
>  - The {{WindowFn}} cannot actually be applied because {{WindowFn}} is 
> allowed to see the element type. The spec just avoids this by limiting which 
> {{WindowFn}} can be used.
>  - There is no natural output timestamp, so it should always be provided. The 
> spec avoids this by specifying an arbitrary and fairly useless timestamp.
>  - If it is a merging {{WindowFn}} like sessions that has already been merged 
> then you'll just have a bogus proto window regardless of explicit timestamp 
> or not.
> The use cases for these methods are best addressed by state plus window 
> expiry callback, so we should revisit this spec and probably just wipe it.
> There are some rare case where you might need to output from {{FinishBundle}} 
> in a way that is not _actually_ sensitive to bundling (perhaps modulo some 
> downstream notion of equivalence) in which case you had better know what 
> window you are outputting to. Often it should be the global window.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2687

2017-05-05 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2921: [BEAM-2166] Remove context from encoded byte size

2017-05-05 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2921

[BEAM-2166] Remove context from encoded byte size

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Based on top of #2829 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam 
remove_context_from_encoded_byte_size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2921.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2921


commit 202a92f5880d51dfa93db6cb90d63da1c070f808
Author: Thomas Groh 
Date:   2017-05-05T17:08:32Z

Re-add AtomicCoder

This is a moderately useful base class for coders which take no
configuration.

commit 815e5765bdad35f72d441d7483f28124243e0a0c
Author: Thomas Groh 
Date:   2017-05-05T17:10:40Z

Add default implementations of Coder methods to Coder

Remove from StructuredCoder. These are sensible defaults implemented in
terms of other Coder methods.

commit a94e403a1cdabf5f7c628dee6857d7333b2fc7c5
Author: Thomas Groh 
Date:   2017-05-05T17:13:05Z

Reparent many Coders to Atomic or StructuredCoder

These coders do not take configuration, or take configuration only in
terms of other Coders, and are appropriate to reparent.

commit ffd58820eaed9117549e63920552000abb9e672d
Author: Thomas Groh 
Date:   2017-05-05T17:13:41Z

Make CustomCoder extend Coder directly

CustomCoder in general consists of configuration beyond component
coders, and should not extend, for example, equality methods of
StrucutredCoder.

commit ae6b84d546a9b4b2bb7bcfd761634b5c55e7cf09
Author: Thomas Groh 
Date:   2017-05-05T18:06:45Z

fixup! Add default implementations of Coder methods to Coder

commit ed35b56902d4f5785d4adc1e8861ca8378a730cf
Author: Thomas Groh 
Date:   2017-05-05T18:12:46Z

Remove Context from getEncodedElementByteSize




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2166) Remove Coder.Context from the public API

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998719#comment-15998719
 ] 

ASF GitHub Bot commented on BEAM-2166:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2921

[BEAM-2166] Remove context from encoded byte size

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Based on top of #2829 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam 
remove_context_from_encoded_byte_size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2921.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2921


commit 202a92f5880d51dfa93db6cb90d63da1c070f808
Author: Thomas Groh 
Date:   2017-05-05T17:08:32Z

Re-add AtomicCoder

This is a moderately useful base class for coders which take no
configuration.

commit 815e5765bdad35f72d441d7483f28124243e0a0c
Author: Thomas Groh 
Date:   2017-05-05T17:10:40Z

Add default implementations of Coder methods to Coder

Remove from StructuredCoder. These are sensible defaults implemented in
terms of other Coder methods.

commit a94e403a1cdabf5f7c628dee6857d7333b2fc7c5
Author: Thomas Groh 
Date:   2017-05-05T17:13:05Z

Reparent many Coders to Atomic or StructuredCoder

These coders do not take configuration, or take configuration only in
terms of other Coders, and are appropriate to reparent.

commit ffd58820eaed9117549e63920552000abb9e672d
Author: Thomas Groh 
Date:   2017-05-05T17:13:41Z

Make CustomCoder extend Coder directly

CustomCoder in general consists of configuration beyond component
coders, and should not extend, for example, equality methods of
StrucutredCoder.

commit ae6b84d546a9b4b2bb7bcfd761634b5c55e7cf09
Author: Thomas Groh 
Date:   2017-05-05T18:06:45Z

fixup! Add default implementations of Coder methods to Coder

commit ed35b56902d4f5785d4adc1e8861ca8378a730cf
Author: Thomas Groh 
Date:   2017-05-05T18:12:46Z

Remove Context from getEncodedElementByteSize




> Remove Coder.Context from the public API
> 
>
> Key: BEAM-2166
> URL: https://issues.apache.org/jira/browse/BEAM-2166
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py
>Affects Versions: First stable release
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
> Fix For: First stable release
>
>
> Justification: 
> * Contexts add confusion and complexity to the public API (e.g. 
> https://issues.apache.org/jira/browse/BEAM-1448)
> * Leaf (user-written) coders are nearly always nested.
> * Coders are being removed from sources, which was the initial need for 
> context.
> * It is unclear how much value, if any, this provides for internal 
> performance.
> * There may be performance (as well as simplification) gains in removing this 
> for the Fn API.
> Fully removing this distinction from the internals can be defered  until the 
> last bullet points are more completely investigated. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2922: Use uuid for temp directory

2017-05-05 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/2922

Use uuid for temp directory

R: @sb2nov 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam temps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2922.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2922


commit c7d97c6877f46e9fc5080e7cd1d813ce624d877f
Author: Ahmet Altay 
Date:   2017-05-05T18:43:20Z

Use uuid for temp directory




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-59] Fully delete IOChannelUtils

2017-05-05 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 69846f500 -> 0d2a0aeee


[BEAM-59] Fully delete IOChannelUtils


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/60f86db6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/60f86db6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/60f86db6

Branch: refs/heads/master
Commit: 60f86db6ef211aedd7c7343842b68390ccf52d93
Parents: 69846f5
Author: Dan Halperin 
Authored: Fri May 5 10:20:38 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 5 11:52:59 2017 -0700

--
 .../apache/beam/sdk/util/IOChannelUtils.java| 27 
 1 file changed, 27 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/60f86db6/sdks/java/core/src/main/java/org/apache/beam/sdk/util/IOChannelUtils.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/IOChannelUtils.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/IOChannelUtils.java
deleted file mode 100644
index b658983..000
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/IOChannelUtils.java
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.util;
-
-import org.apache.beam.sdk.options.PipelineOptions;
-
-/** Do not use, being removed. */
-@Deprecated
-public class IOChannelUtils {
-  @Deprecated
-  public static void registerIOFactoriesAllowOverride(PipelineOptions options) 
{}
-}



[2/2] beam git commit: This closes #2918

2017-05-05 Thread dhalperi
This closes #2918


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0d2a0aee
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0d2a0aee
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0d2a0aee

Branch: refs/heads/master
Commit: 0d2a0aeeef92bdbc23e4061ec066da47e3c7de38
Parents: 69846f5 60f86db
Author: Dan Halperin 
Authored: Fri May 5 11:53:11 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 5 11:53:11 2017 -0700

--
 .../apache/beam/sdk/util/IOChannelUtils.java| 27 
 1 file changed, 27 deletions(-)
--




[GitHub] beam pull request #2918: [BEAM-59] Fully delete IOChannelUtils

2017-05-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2918


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-05 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-59.
---
Resolution: Fixed

> Switch from IOChannelFactory to FileSystems
> ---
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998782#comment-15998782
 ] 

ASF GitHub Bot commented on BEAM-59:


Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2918


> Switch from IOChannelFactory to FileSystems
> ---
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: beam_PerformanceTests_Dataflow #376

2017-05-05 Thread Apache Jenkins Server
See 


Changes:

[klk] Activate WindowedWordCountIT on Apex runner

[klk] No parallelism for Apex WindowedWordCountIT

[dhalperi] [BEAM-2016] Delete HdfsFileSource & Sink

[tgroh] Run Optimize Imports on the Repository

[tgroh] Move AppliedPTransform into the Runners package

[tgroh] Convert Coder into an Abstract Static Class

[dhalperi] [BEAM-59] Fully delete IOChannelUtils

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam8 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* +refs/pull/*:refs/remotes/origin/pr/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 0d2a0aeeef92bdbc23e4061ec066da47e3c7de38 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0d2a0aeeef92bdbc23e4061ec066da47e3c7de38
 > git rev-list 3bffe0e0014bdd6ae73dc2e8ecfc2b61d066120c # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson6165280624966686801.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson4580259411695657974.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson991341416332915944.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.11 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_Dataflow] $ /bin/bash -xe 
/tmp/hudson4900402012477053724.sh
+ python PerfKitBenchmarker/pkb.py --project=apache-beam-testing 
--dpb_log_level=INFO --maven_binary=/home/jenkins/tools/maven/latest/bin/mvn 
--bigquery_table=beam_performance.pkb_results --official=true 
--benchmarks=dpb_wordcount_benchmark 
--dpb_dataflow_staging_location=gs://temp-storage-for-perf-tests/staging 
--dpb_wordcount_input=dataflow-samples/shakespeare/kinglear.txt 
--config_override=dpb_wordcount_benchmark.dpb_service.service_type=dataflow
WARNING:root:File resource loader root perfkitbenchmarker/data/ycsb is not a 
directory.
2017-05-05 18:54:23,115 424c35d3 MainThread INFO Verbose log

[GitHub] beam pull request #2923: [BEAM-2141] disable beam_PerformanceTests_Dataflow

2017-05-05 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2923

[BEAM-2141] disable beam_PerformanceTests_Dataflow

Has not passed in Jenkins memory, at least multiple weeks.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2923.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2923


commit 9740c6ee2dca01c8de03383fbc62485ff6dd982d
Author: Daniel Halperin 
Date:   2017-05-05T19:10:16Z

[BEAM-2141] disable beam_PerformanceTests_Dataflow

Has not passed in Jenkins memory, at least multiple weeks.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2166) Remove Coder.Context from the public API

2017-05-05 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998802#comment-15998802
 ] 

Robert Bradshaw commented on BEAM-2166:
---

Not done. I have a follow-up CL that can only go in once dependencies (such as 
workers) migrate to this change.

> Remove Coder.Context from the public API
> 
>
> Key: BEAM-2166
> URL: https://issues.apache.org/jira/browse/BEAM-2166
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py
>Affects Versions: First stable release
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
> Fix For: First stable release
>
>
> Justification: 
> * Contexts add confusion and complexity to the public API (e.g. 
> https://issues.apache.org/jira/browse/BEAM-1448)
> * Leaf (user-written) coders are nearly always nested.
> * Coders are being removed from sources, which was the initial need for 
> context.
> * It is unclear how much value, if any, this provides for internal 
> performance.
> * There may be performance (as well as simplification) gains in removing this 
> for the Fn API.
> Fully removing this distinction from the internals can be defered  until the 
> last bullet points are more completely investigated. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #2923

2017-05-05 Thread dhalperi
This closes #2923


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a99d43c7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a99d43c7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a99d43c7

Branch: refs/heads/master
Commit: a99d43c75794f38769496ee5581013d73cfad7e3
Parents: 0d2a0ae 9740c6e
Author: Dan Halperin 
Authored: Fri May 5 12:10:49 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 5 12:10:49 2017 -0700

--
 .test-infra/jenkins/job_beam_PerformanceTests_Dataflow.groovy | 3 +++
 1 file changed, 3 insertions(+)
--




  1   2   3   >