Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #5203

2017-11-08 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #5202

2017-11-08 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #4101: [BEAM-3161] Cannot output with timestamp XXXX

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4101


---


[jira] [Commented] (BEAM-3161) Cannot output with timestamp XXXX

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245289#comment-16245289
 ] 

ASF GitHub Bot commented on BEAM-3161:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4101


> Cannot output with timestamp XXXX
> -
>
> Key: BEAM-3161
> URL: https://issues.apache.org/jira/browse/BEAM-3161
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
> Fix For: 2.3.0
>
>
> BeamSQL throws exception when running query
> {code}
> select siteId, count(*), TUMBLE_START(unix_timestamp_to_date(eventTimestamp), 
> INTERVAL '1' HOUR) from RHEOS_SOJEVENT_TOTAL GROUP BY siteId, 
> TUMBLE(unix_timestamp_to_date(eventTimestamp), INTERVAL '1' HOUR)
> {code}
> Exception details:
> {code}
> Caused by: java.lang.IllegalArgumentException: Cannot output with timestamp 
> 2017-11-08T21:37:46.000Z. Output timestamps must be no earlier than the 
> timestamp of the current input (2017-11-08T21:37:49.322Z) minus the allowed 
> skew (0 milliseconds). See the DoFn#getAllowedTimestampSkew() Javadoc for 
> details on changing the allowed skew.
>   at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:463)
>   at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:429)
>   at org.apache.beam.sdk.transforms.WithTimestamps$AddTimestampsDoFn.processElement(WithTimestamps.java:138)
> {code}
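
For readers hitting the same error: the Javadoc the exception points at is DoFn#getAllowedTimestampSkew(), which can be overridden on the DoFn that calls outputWithTimestamp(). Below is a minimal sketch of that approach, assuming Beam 2.x APIs; the element type MyEvent and its timestamp accessor are hypothetical, and the fix committed later in this thread instead sets the skew on the WithTimestamps transform.

{code}
import java.io.Serializable;
import org.apache.beam.sdk.transforms.DoFn;
import org.joda.time.Duration;
import org.joda.time.Instant;

/** Hypothetical element type, used only for this sketch. */
class MyEvent implements Serializable {
  long eventTimestampMillis;
  long getEventTimestampMillis() { return eventTimestampMillis; }
}

/** Re-stamps elements with their embedded event time, allowing arbitrary skew. */
class AssignEventTimeFn extends DoFn<MyEvent, MyEvent> {

  @ProcessElement
  public void processElement(ProcessContext c) {
    Instant eventTime = new Instant(c.element().getEventTimestampMillis());
    // Without a relaxed skew this call throws whenever eventTime is earlier
    // than the timestamp of the current input element.
    c.outputWithTimestamp(c.element(), eventTime);
  }

  @Override
  public Duration getAllowedTimestampSkew() {
    // Permit outputs arbitrarily far behind the current input timestamp.
    return Duration.millis(Long.MAX_VALUE);
  }
}
{code}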



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/2] beam git commit: This closes #4101

2017-11-08 Thread xumingming
This closes #4101


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4451d556
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4451d556
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4451d556

Branch: refs/heads/master
Commit: 4451d5566d6ffd73d2fd6f946a11ffd1eebd90c6
Parents: 867d816 4a5741f
Author: James Xu 
Authored: Thu Nov 9 13:43:09 2017 +0800
Committer: James Xu 
Committed: Thu Nov 9 13:43:09 2017 +0800

--
 .../beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--




[1/2] beam git commit: change `withAllowedTimestampSkew`

2017-11-08 Thread xumingming
Repository: beam
Updated Branches:
  refs/heads/master 867d81684 -> 4451d5566


change `withAllowedTimestampSkew`


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4a5741f2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4a5741f2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4a5741f2

Branch: refs/heads/master
Commit: 4a5741f2f1aabac098e41cfe453b52876a1bcdb0
Parents: 867d816
Author: mingmxu 
Authored: Wed Nov 8 13:44:18 2017 -0800
Committer: James Xu 
Committed: Thu Nov 9 13:42:04 2017 +0800

--
 .../beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/4a5741f2/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
--
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
index e49e79c..6ed30b6 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
@@ -80,7 +80,8 @@ public class BeamAggregationRel extends Aggregate implements 
BeamRelNode {
 
BeamSqlRelUtils.getBeamRelInput(input).buildBeamPipeline(inputPCollections, 
sqlEnv);
 if (windowFieldIdx != -1) {
   upstream = upstream.apply(stageName + "assignEventTimestamp", 
WithTimestamps
-  .of(new BeamAggregationTransforms.WindowTimestampFn(windowFieldIdx)))
+  .of(new BeamAggregationTransforms.WindowTimestampFn(windowFieldIdx))
+  .withAllowedTimestampSkew(new Duration(Long.MAX_VALUE)))
   .setCoder(upstream.getCoder());
 }
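
With the mail wrapping undone, the patched block in BeamAggregationRel reads as below; this is just the hunk above reassembled for readability (whitespace approximate), not new code:

{code}
if (windowFieldIdx != -1) {
  upstream = upstream.apply(stageName + "assignEventTimestamp", WithTimestamps
      .of(new BeamAggregationTransforms.WindowTimestampFn(windowFieldIdx))
      .withAllowedTimestampSkew(new Duration(Long.MAX_VALUE)))
      .setCoder(upstream.getCoder());
}
{code}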
 



Build failed in Jenkins: beam_PostCommit_Python_Verify #3511

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[ekirpichov] Adds logging at INFO for all creation, deletion and copying of 
files in

--
[...truncated 933.13 KB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/run

Build failed in Jenkins: beam_PerformanceTests_Python #539

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Fork Control Clients into java-fn-execution

[tgroh] Fork Data Service Libraries to java-fn-execution

[ekirpichov] Adds logging at INFO for all creation, deletion and copying of 
files in

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 867d816846e5b26aaac3d4f48221a43bbda7dcd1 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 867d816846e5b26aaac3d4f48221a43bbda7dcd1
Commit message: "This closes #4103: Adds logging at INFO for all creation, 
deletion and copying of files in WriteFiles"
 > git rev-list 29399fde81f6d12af018d1bab26c59441a899789 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4840265899090915459.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3852896900054872333.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3529437948148642623.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: xmltodict in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests-ntlm>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests>=2.9.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from p

[jira] [Created] (BEAM-3164) Capture stderr logs during gen proto

2017-11-08 Thread holdenk (JIRA)
holdenk created BEAM-3164:
-

 Summary: Capture stderr logs during gen proto
 Key: BEAM-3164
 URL: https://issues.apache.org/jira/browse/BEAM-3164
 Project: Beam
  Issue Type: Bug
  Components: build-system, sdk-py-core
Reporter: holdenk
Assignee: Davor Bonaci


Currently Python PRs are failing with gen-proto failures, but these are 
difficult to debug because we don't capture that information (see 
https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/727/console).
cc [~altay] [~robertwb]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3162) Add IBM Streams Runner to compatibility matrix

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245244#comment-16245244
 ] 

ASF GitHub Bot commented on BEAM-3162:
--

pgerv12 opened a new pull request #347: [BEAM-3162] Add IBM Streams Runner to 
compatibility matrix
URL: https://github.com/apache/beam-site/pull/347
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add IBM Streams Runner to compatibility matrix
> --
>
> Key: BEAM-3162
> URL: https://issues.apache.org/jira/browse/BEAM-3162
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Paul Gerver
>Assignee: Paul Gerver
>Priority: Minor
>  Labels: documentation
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The IBM Streams Runner was announced on 2017-11-02. This task involves adding 
> a Streams Runner column to the existing Beam capability matrix on the Beam 
> website.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4103: Adds logging at INFO for all creation, deletion and...

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4103


---


[1/2] beam git commit: Adds logging at INFO for all creation, deletion and copying of files in WriteFiles

2017-11-08 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 727253ee3 -> 867d81684


Adds logging at INFO for all creation, deletion and copying of files in 
WriteFiles


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/feab6043
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/feab6043
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/feab6043

Branch: refs/heads/master
Commit: feab6043f5ff3f78c974c4c3438e55b1b55a39f8
Parents: 727253e
Author: Eugene Kirpichov 
Authored: Wed Nov 8 16:13:25 2017 -0800
Committer: Eugene Kirpichov 
Committed: Wed Nov 8 20:54:43 2017 -0800

--
 .../org/apache/beam/sdk/io/FileBasedSink.java | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/feab6043/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
index d577fea..78ba071 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
@@ -694,6 +694,10 @@ public abstract class FileBasedSink
 for (Map.Entry srcDestPair : 
filenames.entrySet()) {
   srcFiles.add(srcDestPair.getKey());
   dstFiles.add(srcDestPair.getValue());
+  LOG.info(
+  "Will copy temporary file {} to final location {}",
+  srcDestPair.getKey(),
+  srcDestPair.getValue());
 }
 // During a failure case, files may have been deleted in an earlier 
step. Thus
 // we ignore missing files here.
@@ -734,6 +738,7 @@ public abstract class FileBasedSink
   
FileSystems.match(Collections.singletonList(tempDir.toString() + "*")));
   for (Metadata matchResult : singleMatch.metadata()) {
 matches.add(matchResult.resourceId());
+LOG.info("Will remove temporary file {}", 
matchResult.resourceId());
   }
 } catch (Exception e) {
   LOG.warn("Failed to match temporary files under: [{}].", tempDir);
@@ -921,7 +926,15 @@ public abstract class FileBasedSink
   getWriteOperation().getSink().writableByteChannelFactory;
   // The factory may force a MIME type or it may return null, indicating 
to use the sink's MIME.
   String channelMimeType = firstNonNull(factory.getMimeType(), mimeType);
-  LOG.debug("Opening {} for write with MIME type {}.", outputFile, 
channelMimeType);
+  LOG.info(
+  "Opening temporary file {} with MIME type {} "
+  + "to write destination {} shard {} window {} pane {}",
+  outputFile,
+  channelMimeType,
+  destination,
+  shard,
+  window,
+  paneInfo);
   WritableByteChannel tempChannel = FileSystems.create(outputFile, 
channelMimeType);
   try {
 channel = factory.create(tempChannel);
@@ -950,6 +963,7 @@ public abstract class FileBasedSink
 
 public final void cleanup() throws Exception {
   if (outputFile != null) {
+LOG.info("Deleting temporary file {}", outputFile);
 // outputFile may be null if open() was not called or failed.
 FileSystems.delete(
 Collections.singletonList(outputFile), 
StandardMoveOptions.IGNORE_MISSING_FILES);
@@ -991,7 +1005,7 @@ public abstract class FileBasedSink
 
   FileResult result =
   new FileResult<>(outputFile, shard, window, paneInfo, destination);
-  LOG.debug("Result for bundle {}: {}", this.id, outputFile);
+  LOG.info("Successfully wrote temporary file {}", outputFile);
   return result;
 }
 



[2/2] beam git commit: This closes #4103: Adds logging at INFO for all creation, deletion and copying of files in WriteFiles

2017-11-08 Thread jkff
This closes #4103: Adds logging at INFO for all creation, deletion and copying 
of files in WriteFiles


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/867d8168
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/867d8168
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/867d8168

Branch: refs/heads/master
Commit: 867d816846e5b26aaac3d4f48221a43bbda7dcd1
Parents: 727253e feab604
Author: Eugene Kirpichov 
Authored: Wed Nov 8 20:54:59 2017 -0800
Committer: Eugene Kirpichov 
Committed: Wed Nov 8 20:54:59 2017 -0800

--
 .../org/apache/beam/sdk/io/FileBasedSink.java | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)
--




Build failed in Jenkins: beam_PostCommit_Python_Verify #3510

2017-11-08 Thread Apache Jenkins Server
See 


--
[...truncated 934.28 KB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runn

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4313

2017-11-08 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #4107: Merge branch 'master' into jstorm-runner at commit ...

2017-11-08 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/beam/pull/4107

Merge branch 'master' into jstorm-runner at commit 727253e

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam jstorm-runner-merge

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4107.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4107


commit 3a04dbe46debc2c4aa188b210870a24dc1408bf1
Author: Holden Karau 
Date:   2017-10-10T19:31:16Z

pr/cl-feedback from c-y-koo

commit 9de10e2d80ce15c5881b26f6436d28cccf60e18b
Author: Holden Karau 
Date:   2017-10-12T06:46:28Z

Fix snippets

commit 9c2a6e771d30e3e120b941bd2d450a00d8b7f091
Author: Robert Bradshaw 
Date:   2017-10-12T22:54:09Z

Closes #3804

commit b63a3eb082be24c74e1cb5f56b60c4c3dd38f728
Author: Scott Wegner 
Date:   2017-10-11T21:14:22Z

Increment dataflow client version

commit 6055d280a4ae1d9018e4644d671748341162af4b
Author: Thomas Groh 
Date:   2017-10-12T23:31:08Z

This closes #3980

commit 2d147eb21f98f5668ff5e2cc497b8d3e5fae9c20
Author: Luke Cwik 
Date:   2017-10-12T16:49:06Z

[BEAM-3048] Remove RAND_RANGE in WindowedWordCount

commit f398e5ad4abff386083c0af87cb29cd87891bbe4
Author: Luke Cwik 
Date:   2017-10-13T00:16:29Z

[BEAM-3048] Remove RAND_RANGE in WindowedWordCount

This closes #3987

commit 3feef91761c6f5a44f535e4daf9c39a88320e229
Author: Ahmet Altay 
Date:   2017-10-13T02:17:28Z

Add an option for dataflow job labels.

commit a3a7807fec68a7954ba136dc00499649d372f5ca
Author: Ahmet Altay 
Date:   2017-10-13T19:52:20Z

This closes #3990

commit ec192d15d3e83d6fe2127619c8bbd69e83277918
Author: Robert Bradshaw 
Date:   2017-10-04T20:57:01Z

Align names with those produced by the dataflow runner harness.

These will be unused once the runner harness produces the correct
transform payloads.

commit d91ebd9f5fa3cf5c250f02096c27c21354dce859
Author: Robert Bradshaw 
Date:   2017-10-05T00:33:07Z

Fix from any -> bytes transition.

commit 3dc75599437bc57a1d20a81f137f13b4f25b3719
Author: Robert Bradshaw 
Date:   2017-10-13T21:41:04Z

Closes #3941

commit d226c7679b9d94a40553609f31ecbfba72559e8a
Author: Robert Bradshaw 
Date:   2017-10-09T23:46:19Z

Add an element batching transform.

commit 7f5753f1f7f5321f19de51a4d320c1af2d2ad44f
Author: Robert Bradshaw 
Date:   2017-10-14T00:13:41Z

Closes #3971

commit 4b908c2e693fe9ed44fcb6c67a2d82c7da355259
Author: Eugene Kirpichov 
Date:   2017-09-25T20:57:04Z

Introduces Contextful

commit e2ad925dc4d8bb33a264a21c48b8ceef63ac6eb3
Author: Eugene Kirpichov 
Date:   2017-10-03T00:36:48Z

Supports side inputs in MapElements and FlatMapElements

commit 014614b695bac0b636aae662977dd3a3fa3b8a1e
Author: Eugene Kirpichov 
Date:   2017-10-14T01:44:28Z

This closes #3921: [BEAM-3009] Introduces Contextful machinery and uses it 
to add side input support to Watch

commit 3ad84791d4d85896f46b7956b5bd8045cdc4a0e9
Author: Robert Bradshaw 
Date:   2017-10-03T00:20:38Z

Add progress metrics to Python SDK.

commit 245d77338be8677df7ea9cbd75b1b8ab8c4da831
Author: Robert Bradshaw 
Date:   2017-10-16T20:09:31Z

Closes #3939

commit 6b25948b4f56e4d45e1f9e03eb19c6077413a80e
Author: chamik...@google.com 
Date:   2017-10-16T07:50:03Z

Sets user agent in BigTableIO.Read.getBigTableService().

commit ec052bb44ce40ae5fda2fbbd94fbc2b97ac363f5
Author: chamik...@google.com 
Date:   2017-10-16T20:14:25Z

This closes #3996

commit e940456bd95da3c8b79eb4666ad09280dccaedcf
Author: Kenneth Knowles 
Date:   2017-10-16T22:13:26Z

Return null when timer not found instead of crashing

commit ec58a80ca0f913c85d5f17cba3535243cd010876
Author: Yunqing Zhou

Build failed in Jenkins: beam_PostCommit_Python_Verify #3509

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Fork Control Clients into java-fn-execution

[tgroh] Fork Data Service Libraries to java-fn-execution

--
[...truncated 935.74 KB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker

[jira] [Commented] (BEAM-3163) We should run and monitor PostCommit test suites for release branches.

2017-11-08 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245111#comment-16245111
 ] 

Valentyn Tymofieiev commented on BEAM-3163:
---

Related: https://issues.apache.org/jira/browse/BEAM-3120

> We should run and monitor PostCommit test suites for release branches.
> --
>
> Key: BEAM-3163
> URL: https://issues.apache.org/jira/browse/BEAM-3163
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Jason Kuster
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3163) We should run and monitor PostCommit test suites for release branches.

2017-11-08 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-3163:
-

 Summary: We should run and monitor PostCommit test suites for 
release branches.
 Key: BEAM-3163
 URL: https://issues.apache.org/jira/browse/BEAM-3163
 Project: Beam
  Issue Type: New Feature
  Components: testing
Reporter: Valentyn Tymofieiev
Assignee: Jason Kuster






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


svn commit: r23019 - in /dev/beam/2.2.0: apache-beam-2.2.0-python.zip apache-beam-2.2.0-python.zip.asc apache-beam-2.2.0-python.zip.md5 apache-beam-2.2.0-python.zip.sha1

2017-11-08 Thread reuvenlax
Author: reuvenlax
Date: Thu Nov  9 01:52:23 2017
New Revision: 23019

Log:
Fix python distribution

Modified:
dev/beam/2.2.0/apache-beam-2.2.0-python.zip
dev/beam/2.2.0/apache-beam-2.2.0-python.zip.asc
dev/beam/2.2.0/apache-beam-2.2.0-python.zip.md5
dev/beam/2.2.0/apache-beam-2.2.0-python.zip.sha1

Modified: dev/beam/2.2.0/apache-beam-2.2.0-python.zip
==
Binary files - no diff available.

Modified: dev/beam/2.2.0/apache-beam-2.2.0-python.zip.asc
==
--- dev/beam/2.2.0/apache-beam-2.2.0-python.zip.asc (original)
+++ dev/beam/2.2.0/apache-beam-2.2.0-python.zip.asc Thu Nov  9 01:52:23 2017
@@ -1,16 +1,16 @@
 -BEGIN PGP SIGNATURE-
 
-iQIcBAABAgAGBQJaAtcuAAoJEDvHLhu5i3cIyREP/i8NMUFs+ILLGEuu+gx/AZxI
-J5uu+vZZKDyRuJB5UGef7XaGxXk1n0zn03y1EHjWHHrVKrat3xJLTd8B7YDiSReb
-KvsZKXRPS72K51WnX/Y7V30wmh73vAM3fNvNf7spMAL/r3iKXdB4V7t8HrhFUrrI
-oOkbpQOhh5b9bHwWv26aufpQ2xG3UR8gJn8zC5m3Y1lQgpW4MEoACVsxbDX9tXKm
-Xwu9+oTbvejM+YtMZVTrsJpo3tyf+BTsBHdPRdpIEVkvMx7UXD4fckcBcGM63gTh
-HQKhCHUBJG7swkW62ga+Vx2l0GB9dIjb89GnrFt2UYBAAu68ucofZgYTp7AyQabg
-W5ReeZmuJ/gPzhl9Olt6ci8MfY9cYzbzrEzAV1al+BdrtwZtovXW2yRuS5qRK6FI
-DU9vul1lJ2jH3lv2CDVbygovbOc7HyVMz8cLQMP8CgVvOXFp1OFPDPa/ddtq/pEj
-5LWS8g6Y2sjguzFE6/8b3hMX3XUA1CBCivwNvUWOAXFUXt5wnlPQY+QEDR3FpgVX
-ovdl3O+TJq1xlaQtRmXMLkBkKqHgDTCUGhDARQ65o0azcw6WbAwipSsDlAdqAoK7
-kwinwwn1NBIe1usm+KlPSjI1sROsAFC5tK7Spo3FauwDT1NvWaZs8S4CkbfFcNR0
-ufy6LtgbZC33GBvkTaEW
-=opZ9
+iQIcBAABAgAGBQJaA7SKAAoJEDvHLhu5i3cITlkQALjRPICVWhlG3CDo7nQN2xOM
+vht1MHOrqHHI4+xr8Xo4TGxJ+lN+8X0eZq+qB1W3tUg9yu0K8PW++3DQ8mfKhJPM
+QaKTNiMDZZBVuBrwl/yjDLc4Dm8BIa9LggencEwhNwscjPF2ueM18DRoCkZNIz/p
+Gdm4mnBTnkkpLlv1X1utQPODBcVkqLYt7otcoc2T9bPTVCZyW4xuzkksPFFBc5HM
+S+BwLVydrHSaKRljRMZI0Cs4ev2kT3ksy3W7gP++4Tw5JSAOYlA8kQUtBGKCCw6l
+9PdJ8aKNFzc7fEZlWI9tysMWEu30RDFDGcn8nC09K6QlE9VDrGV6PDbjNl5J29py
+m74UIWcpz3qAfiHPtZ3SwuwJNDP7E2dTftmw2+v6aU9VNjBjdCdD41K8M6mhHKQf
+P8f6E6LcG5M78oBJi+f+9xyH89D+VWCUcgYJShhfOCsYoKTNDNgne2Kve7UWTiSM
+yrVqKwHk4YbazC6MylXci79I5nm90qv17TUBA+2jKSI9e0WVlFvZAv0b4wBm7EEl
+/Z0tYCeeanKjCa3OhzaMBsIBpr2fqGCBBwuzZVHxm2isDpk0noSuOzT6laBd31iZ
+tashYspVwEJh3eOT9BcPNwDM6lKNG5UUhmEdZ0NvvrF/Ki7LBe7PwGeAUgr5qHtY
+b7Hb5sFEenrxT5ic4c6r
+=KUH6
 -END PGP SIGNATURE-

Modified: dev/beam/2.2.0/apache-beam-2.2.0-python.zip.md5
==
--- dev/beam/2.2.0/apache-beam-2.2.0-python.zip.md5 (original)
+++ dev/beam/2.2.0/apache-beam-2.2.0-python.zip.md5 Thu Nov  9 01:52:23 2017
@@ -1 +1 @@
-b801775a6bb905154e41eb401a7309e7  apache-beam-2.2.0-python.zip
+ae2630f56c6c41bdce9a6eafa349cf7e  apache-beam-2.2.0-python.zip

Modified: dev/beam/2.2.0/apache-beam-2.2.0-python.zip.sha1
==
--- dev/beam/2.2.0/apache-beam-2.2.0-python.zip.sha1 (original)
+++ dev/beam/2.2.0/apache-beam-2.2.0-python.zip.sha1 Thu Nov  9 01:52:23 2017
@@ -1 +1 @@
-dbe06eb3b308036af7e1f34041bc0ad9c11b60b3  apache-beam-2.2.0-python.zip
+a02ddb419ee515b9c93d23f541535c7f2fa1497c  apache-beam-2.2.0-python.zip




[GitHub] beam pull request #3845: mr-runner: use SerializablePipelineOptions to serde...

2017-11-08 Thread huafengw
Github user huafengw closed the pull request at:

https://github.com/apache/beam/pull/3845


---


[GitHub] beam pull request #4106: Update urns.py

2017-11-08 Thread tvalentyn
GitHub user tvalentyn opened a pull request:

https://github.com/apache/beam/pull/4106

Update urns.py

Fix typo in GLOBAL_WINDOW_CODER urn constant.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tvalentyn/beam patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4106


commit a6d824831000189a39604b41119c9dcc0f286d2a
Author: tvalentyn 
Date:   2017-11-09T01:31:43Z

Update urns.py

Fix typo in GLOBAL_WINDOW_CODER urn constant.




---


[jira] [Commented] (BEAM-2899) Universal Local Runner

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245031#comment-16245031
 ] 

ASF GitHub Bot commented on BEAM-2899:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/4105

[BEAM-2899]  Fork FnDataService from runners-core

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
Also remove the runners-core fn package.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam fork_data_service

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4105


commit 9898a0b930ba2beb91d6c7b2b9c6093b111d8ba9
Author: Thomas Groh 
Date:   2017-11-08T18:34:19Z

Fork FnDataService from runners-core

This necessitates a dependency on the Core SDK for Coder and WindowedValue.

commit 9bdcdb7ed2b57e8e430af7d8fb669edf71899df2
Author: Thomas Groh 
Date:   2017-11-09T00:58:45Z

Remove core-java/fn package

This is actually unused, and available within java-fn-execution, which
is the preferred module in which to store FnApi libraries.




> Universal Local Runner
> --
>
> Key: BEAM-2899
> URL: https://issues.apache.org/jira/browse/BEAM-2899
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Henning Rohde
>Assignee: Thomas Groh
>  Labels: portability
>
> To make the portability effort tractable, we should implement a Universal 
> Local Runner (ULR) in Java that runs in a single server process plus docker 
> containers for the SDK harness containers. It would serve multiple purposes:
>   (1) A reference implementation for other runners. Ideally, any new feature 
> should be implemented in the ULR first.
>   (2) A fully-featured test runner for SDKs who participate in the 
> portability framework. It thus complements the direct runners.
>   (3) A test runner for user code that depends on or customizes the runtime 
> environment. For example, a DoFn that shells out has a dependency that may be 
> satisfied on the user's desktop (and thus works fine on the direct runner), 
> but perhaps not by the container harness image. The ULR allows for an easy 
> way to find out.
> The Java direct runner presumably has lots of pieces that can be reused.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3158) DoFnTester should call close() in catch block and then re-throw exception to allow using @Rule ExpectedException

2017-11-08 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245032#comment-16245032
 ] 

Eugene Kirpichov commented on BEAM-3158:


Sorry I'm not following: what exactly goes wrong currently, if you use a 
try-with-resources on the DoFnTester combined with an ExpectedException?

> DoFnTester should call close() in catch block and then re-throw exception to 
> allow using @Rule ExpectedException
> ---
>
> Key: BEAM-3158
> URL: https://issues.apache.org/jira/browse/BEAM-3158
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
> explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
> DoFnTester, the user needs to use a try/catch block to call DoFnTester.close() 
> themselves. If the DoFnTester were doing 
> {code}
> try {
>   ...
> } catch (Exception e) {
>   close();
>   throw e;
> }
> {code}
> then the user would no longer need to call {{DoFnTester.close()}} (to release 
> resources, stop threads ...) and thus no longer need the try/catch. This will 
> allow them to use the JUnit {{@Rule ExpectedException}} in place of something 
> ugly like 
> {code}
> // need to avoid flow interruption to close the DoFnTester
> Exception raisedException = null;
> try {
>   fnTester.processBundle(input);
> } catch (IOException exception) {
>   raisedException = exception;
> }
> fnTester.close();
> assertTrue(raisedException != null);
> {code}
> But maybe {{DoFnTester}} cannot always be safely closed if an exception was 
> raised.
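
A minimal sketch of the kind of test the proposed change would enable, assuming processBundle() closes the tester before rethrowing; MyFailingDoFn and its input are hypothetical placeholders, not code from this thread:

{code}
import java.io.IOException;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.DoFnTester;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

public class MyFailingDoFnTest {

  /** Hypothetical DoFn that always fails, used only for this sketch. */
  static class MyFailingDoFn extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) throws IOException {
      throw new IOException("boom");
    }
  }

  @Rule public ExpectedException thrown = ExpectedException.none();

  @Test
  public void processBundlePropagatesFailure() throws Exception {
    DoFnTester<String, String> fnTester = DoFnTester.of(new MyFailingDoFn());
    thrown.expect(IOException.class);
    // With the proposed change, processBundle() would close the tester itself
    // before rethrowing, so no explicit try/catch plus close() is needed here.
    fnTester.processBundle("some input");
  }
}
{code}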



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4105: [BEAM-2899] Fork FnDataService from runners-core

2017-11-08 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/4105

[BEAM-2899]  Fork FnDataService from runners-core

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
Also remove the runners-core fn package.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam fork_data_service

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4105


commit 9898a0b930ba2beb91d6c7b2b9c6093b111d8ba9
Author: Thomas Groh 
Date:   2017-11-08T18:34:19Z

Fork FnDataService from runners-core

This necessitates a dependency on the Core SDK for Coder and WindowedValue.

commit 9bdcdb7ed2b57e8e430af7d8fb669edf71899df2
Author: Thomas Groh 
Date:   2017-11-09T00:58:45Z

Remove core-java/fn package

This is actually unused, and available within java-fn-execution, which
is the preferred module in which to store FnApi libraries.




---


[jira] [Commented] (BEAM-2899) Universal Local Runner

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245014#comment-16245014
 ] 

ASF GitHub Bot commented on BEAM-2899:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4086


> Universal Local Runner
> --
>
> Key: BEAM-2899
> URL: https://issues.apache.org/jira/browse/BEAM-2899
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Henning Rohde
>Assignee: Thomas Groh
>  Labels: portability
>
> To make the portability effort tractable, we should implement a Universal 
> Local Runner (ULR) in Java that runs in a single server process plus docker 
> containers for the SDK harness containers. It would serve multiple purposes:
>   (1) A reference implementation for other runners. Ideally, any new feature 
> should be implemented in the ULR first.
>   (2) A fully-featured test runner for SDKs who participate in the 
> portability framework. It thus complements the direct runners.
>   (3) A test runner for user code that depends on or customizes the runtime 
> environment. For example, a DoFn that shells out has a dependency that may be 
> satisfied on the user's desktop (and thus works fine on the direct runner), 
> but perhaps not by the container harness image. The ULR allows for an easy 
> way to find out.
> The Java direct runner presumably has lots of pieces that can be reused.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4086: [BEAM-2899] Copy runners.core.fn to java-fn-executi...

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4086


---


[1/3] beam git commit: Fork Data Service Libraries to java-fn-execution

2017-11-08 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 29399fde8 -> 727253ee3


Fork Data Service Libraries to java-fn-execution


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/70311595
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/70311595
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/70311595

Branch: refs/heads/master
Commit: 70311595b37fb3dfb8c2d2c994aa2074903db3db
Parents: 9ed655b
Author: Thomas Groh 
Authored: Mon Nov 6 15:04:33 2017 -0800
Committer: Thomas Groh 
Committed: Wed Nov 8 16:43:42 2017 -0800

--
 .../beam/runners/core/fn/FnDataReceiver.java|  4 +++
 .../beam/runners/core/fn/FnDataService.java |  4 +++
 .../runners/core/fn/SdkHarnessDoFnRunner.java   |  8 +-
 .../fnexecution/data/FnDataReceiver.java| 27 
 .../runners/fnexecution/data/package-info.java  | 23 +
 5 files changed, 65 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/70311595/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataReceiver.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataReceiver.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataReceiver.java
index e9928a7..639d678 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataReceiver.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataReceiver.java
@@ -27,7 +27,11 @@ import java.io.Closeable;
  *
  * Register a target with a {@link FnDataService} to gain a {@link 
FnDataReceiver} to which you
  * may write outgoing data.
+ *
+ * @deprecated Runners should depend on the beam-runners-java-fn-execution 
module for this
+ * functionality.
  */
+@Deprecated
 public interface FnDataReceiver extends Closeable {
   void accept(T input) throws Exception;
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/70311595/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataService.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataService.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataService.java
index fdde79c..2a6777e 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataService.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnDataService.java
@@ -27,7 +27,11 @@ import org.apache.beam.sdk.util.WindowedValue;
  * The {@link FnDataService} is able to forward inbound elements to a consumer 
and is also a
  * consumer of outbound elements. Callers can register themselves as consumers 
for inbound elements
  * or can get a handle for a consumer for outbound elements.
+ *
+ * @deprecated Runners should depend on the beam-runners-java-fn-execution 
module for this
+ * functionality.
  */
+@Deprecated
 public interface FnDataService {
 
   /**

http://git-wip-us.apache.org/repos/asf/beam/blob/70311595/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessDoFnRunner.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessDoFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessDoFnRunner.java
index ec4d344..d27077f 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessDoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessDoFnRunner.java
@@ -29,7 +29,13 @@ import org.apache.beam.sdk.util.UserCodeException;
 import org.apache.beam.sdk.util.WindowedValue;
 import org.joda.time.Instant;
 
-/** Processes a bundle by sending it to an SDK harness over the Fn API. */
+/**
+ * Processes a bundle by sending it to an SDK harness over the Fn API.
+ *
+ * @deprecated Runners should interact with the Control and Data plane 
directly, rather than through
+ * a {@link DoFnRunner}. Consider the beam-runners-java-fn-execution 
artifact instead.
+ */
+@Deprecated
 public class SdkHarnessDoFnRunner<InputT, OutputT> implements DoFnRunner<InputT, OutputT> {
 
   private final SdkHarnessClient sdkHarnessClient;

http://git-wip-us.apache.org/repos/asf/beam/blob/70311595/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/FnDataReceiver.java
--
diff --git 
a/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/FnDataReceiver.java
 
b/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/data/FnDataReceiver.java
new file mo

[3/3] beam git commit: This closes #4086

2017-11-08 Thread tgroh
This closes #4086


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/727253ee
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/727253ee
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/727253ee

Branch: refs/heads/master
Commit: 727253ee3774a3ac4f2841d4276d2bc8e14dcdec
Parents: 29399fd 7031159
Author: Thomas Groh 
Authored: Wed Nov 8 16:43:43 2017 -0800
Committer: Thomas Groh 
Committed: Wed Nov 8 16:43:43 2017 -0800

--
 .../runners/core/fn/FnApiControlClient.java |   4 +
 .../core/fn/FnApiControlClientPoolService.java  |   8 +-
 .../beam/runners/core/fn/FnDataReceiver.java|   4 +
 .../beam/runners/core/fn/FnDataService.java |   4 +
 .../beam/runners/core/fn/SdkHarnessClient.java  |   4 +
 .../runners/core/fn/SdkHarnessDoFnRunner.java   |   8 +-
 runners/java-fn-execution/pom.xml   |  15 ++
 .../fnexecution/control/FnApiControlClient.java | 148 
 .../control/FnApiControlClientPoolService.java  |  66 +++
 .../fnexecution/control/SdkHarnessClient.java   | 173 +++
 .../fnexecution/control/package-info.java   |  23 +++
 .../fnexecution/data/FnDataReceiver.java|  27 +++
 .../runners/fnexecution/data/package-info.java  |  23 +++
 .../FnApiControlClientPoolServiceTest.java  |  65 +++
 .../control/FnApiControlClientTest.java | 139 +++
 .../control/SdkHarnessClientTest.java   |  96 ++
 16 files changed, 805 insertions(+), 2 deletions(-)
--




[2/3] beam git commit: Fork Control Clients into java-fn-execution

2017-11-08 Thread tgroh
Fork Control Clients into java-fn-execution

Deprecate versions in runners-core. Runner-side portability APIs should
not have a dependency edge to an SDK, and use of the java-fn-execution
package ensures that.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9ed655be
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9ed655be
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9ed655be

Branch: refs/heads/master
Commit: 9ed655be780630e1218d185bd0d2ebfea099b988
Parents: 29399fd
Author: Thomas Groh 
Authored: Mon Nov 6 15:03:17 2017 -0800
Committer: Thomas Groh 
Committed: Wed Nov 8 16:43:42 2017 -0800

--
 .../runners/core/fn/FnApiControlClient.java |   4 +
 .../core/fn/FnApiControlClientPoolService.java  |   8 +-
 .../beam/runners/core/fn/SdkHarnessClient.java  |   4 +
 runners/java-fn-execution/pom.xml   |  15 ++
 .../fnexecution/control/FnApiControlClient.java | 148 
 .../control/FnApiControlClientPoolService.java  |  66 +++
 .../fnexecution/control/SdkHarnessClient.java   | 173 +++
 .../fnexecution/control/package-info.java   |  23 +++
 .../FnApiControlClientPoolServiceTest.java  |  65 +++
 .../control/FnApiControlClientTest.java | 139 +++
 .../control/SdkHarnessClientTest.java   |  96 ++
 11 files changed, 740 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9ed655be/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClient.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClient.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClient.java
index 7546851..811444c 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClient.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClient.java
@@ -39,7 +39,11 @@ import org.slf4j.LoggerFactory;
  * connections).
  *
  * This low-level client is responsible only for correlating requests with 
responses.
+ *
+ * @deprecated Runners should depend on the beam-runners-java-fn-execution 
module for this
+ * functionality.
  */
+@Deprecated
 class FnApiControlClient implements Closeable {
   private static final Logger LOG = 
LoggerFactory.getLogger(FnApiControlClient.class);
 

http://git-wip-us.apache.org/repos/asf/beam/blob/9ed655be/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClientPoolService.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClientPoolService.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClientPoolService.java
index fd28040..21fc4f7 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClientPoolService.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/FnApiControlClientPoolService.java
@@ -24,7 +24,13 @@ import 
org.apache.beam.model.fnexecution.v1.BeamFnControlGrpc;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-/** A Fn API control service which adds incoming SDK harness connections to a 
pool. */
+/**
+ * A Fn API control service which adds incoming SDK harness connections to a 
pool.
+ *
+ * @deprecated Runners should depend on the beam-runners-java-fn-execution 
module for this
+ * functionality.
+ */
+@Deprecated
 public class FnApiControlClientPoolService extends 
BeamFnControlGrpc.BeamFnControlImplBase {
   private static final Logger LOGGER = 
LoggerFactory.getLogger(FnApiControlClientPoolService.class);
 

http://git-wip-us.apache.org/repos/asf/beam/blob/9ed655be/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessClient.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessClient.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessClient.java
index bfd1837..091dea1 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessClient.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/fn/SdkHarnessClient.java
@@ -31,7 +31,11 @@ import org.apache.beam.model.fnexecution.v1.BeamFnApi;
  *
  * This provides a Java-friendly wrapper around {@link FnApiControlClient} 
and {@link
  * FnDataReceiver}, which handle lower-level gRPC message wrangling.
+ *
+ * @deprecated Runners should depend on the beam-runners-java-fn-execution 
module for this
+ * functionality.
  */
+@Deprecated
 pub

[GitHub] beam pull request #4104: Remove obsolete dependence of FnApiRunner on MapTas...

2017-11-08 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/4104

Remove obsolete dependence of FnApiRunner on MapTaskExecutorRunner.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam simplify

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4104


commit 874d7aca41981089fb82f9447c0aabc417d4f9ef
Author: Robert Bradshaw 
Date:   2017-11-09T00:24:38Z

Remove obsolete dependence of FnApiRunner on MapTaskExecutorRunner.




---


[GitHub] beam pull request #4103: Adds logging at INFO for all creation, deletion and...

2017-11-08 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/4103

Adds logging at INFO for all creation, deletion and copying of files in 
WriteFiles

This will help with debugging issues such as 
https://stackoverflow.com/questions/47113773/dataflow-2-1-0-streaming-application-is-not-cleaning-temp-folders/47142671

The amount of logging should, I believe, be reasonable: several messages per 
output file (not per element or anything like that), emitted when the temp file 
is created, successfully closed, deleted on error, copied to its final location, 
and deleted after copying. This should make it possible to trace everything that 
happens to suspicious files.
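
For illustration, here is a minimal sketch of the kind of INFO-level messages 
described above, using SLF4J-style logging; the class name and message texts are 
invented for the example and are not the actual WriteFiles change:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative stand-in for a file sink's temp-file handling; not the actual WriteFiles code.
class TempFileLoggingSketch {
  private static final Logger LOG = LoggerFactory.getLogger(TempFileLoggingSketch.class);

  void writeAndFinalize(String tempFilename, String finalFilename) throws Exception {
    LOG.info("Opening temporary file {}", tempFilename);
    try {
      // ... write records to tempFilename ...
      LOG.info("Successfully wrote temporary file {}", tempFilename);
      // ... copy tempFilename to finalFilename ...
      LOG.info("Copied temporary file {} to final location {}", tempFilename, finalFilename);
      // ... delete tempFilename ...
      LOG.info("Removed temporary file {} after copying", tempFilename);
    } catch (Exception e) {
      LOG.info("Removing temporary file {} after error", tempFilename, e);
      // ... delete tempFilename ...
      throw e;
    }
  }
}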

R: @chamikaramj 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam write-files-logging

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4103


commit 2ee2417154321c78595480ec1973f7c9e74579a5
Author: Eugene Kirpichov 
Date:   2017-11-09T00:13:25Z

Adds logging at INFO for all creation, deletion and copying of files in 
WriteFiles




---


[GitHub] beam pull request #4102: Set save_main_session to True for csv module

2017-11-08 Thread davidcavazos
GitHub user davidcavazos opened a pull request:

https://github.com/apache/beam/pull/4102

Set save_main_session to True for csv module

Parsing the game events requires the `csv` module, so we also needed to set 
`save_main_session` to True for this to work in Dataflow.

R: @aaltay 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidcavazos/beam user_score

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4102


commit 89e9c34e839180b990b6abe4e546d0acc83e04fe
Author: David Cavazos 
Date:   2017-11-08T23:32:03Z

Set save_main_session to True for csv module




---


Build failed in Jenkins: beam_PerformanceTests_Python #538

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[robertwb] [BEAM-3040] Disable flaky subprocess variant of ULR test.

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 29399fde81f6d12af018d1bab26c59441a899789 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 29399fde81f6d12af018d1bab26c59441a899789
Commit message: "Closes #4099"
 > git rev-list 598774738e7a1236cf30f70a584311cee52d1818 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1660793170453211580.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6956842955668873613.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8884482866587835674.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: xmltodict in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests-ntlm>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests>=2.9.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: ntlm-auth>=1.0.2 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 
requests-ntlm>=0.3.0->pywinrm->-r PerfKitBenchmarke

[jira] [Commented] (BEAM-3110) The transform Read(UnboundedKafkaSource) is currently not supported

2017-11-08 Thread Xu Mingmin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244960#comment-16244960
 ] 

Xu Mingmin commented on BEAM-3110:
--

Seems the bug is not there in the latest master branch; I will do more testing to verify.

> The transform Read(UnboundedKafkaSource) is currently not supported
> ---
>
> Key: BEAM-3110
> URL: https://issues.apache.org/jira/browse/BEAM-3110
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Xu Mingmin
>Assignee: Aljoscha Krettek
>
> I see this issue when submitting a job to Flink cluster. It appears after 
> build {{2.2.0-20170912.083349-51}}.
> {code}
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
>   at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1129)
> Caused by: java.lang.UnsupportedOperationException: The transform 
> Read(UnboundedKafkaSource) is currently not supported.
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.visitPrimitiveTransform(FlinkStreamingPipelineTranslator.java:113)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:666)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:658)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:658)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:658)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:311)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:245)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:451)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineTranslator.translate(FlinkPipelineTranslator.java:38)
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.translate(FlinkStreamingPipelineTranslator.java:69)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.translate(FlinkPipelineExecutionEnvironment.java:104)
>   at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:113)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:304)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:290)
> {code} 
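
For reference, a minimal sketch of the kind of pipeline that hits the translator 
error quoted above when run with the Flink runner in streaming mode; the broker, 
topic, and command-line options below are invented for the example:

{code}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaOnFlinkSketch {
  public static void main(String[] args) {
    // e.g. --runner=FlinkRunner --streaming=true plus the Flink master address
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    p.apply(KafkaIO.<String, String>read()
        .withBootstrapServers("broker:9092")
        .withTopic("events")
        .withKeyDeserializer(StringDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        .withoutMetadata());
    p.run().waitUntilFinish();
  }
}
{code}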



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-137) Add implicit conf/pipeline-default.conf options file

2017-11-08 Thread Nicholas Verbeck (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244954#comment-16244954
 ] 

Nicholas Verbeck commented on BEAM-137:
---

As a follow-on: any idea how we can expand fromArgs to be able to supply files 
through it?

> Add implicit conf/pipeline-default.conf options file
> 
>
> Key: BEAM-137
> URL: https://issues.apache.org/jira/browse/BEAM-137
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>
> Right now, most of users provide the pipeline options via the main arguments.
> For instance, it's the classic way to provide pipeline input, etc.
> For convenience, it would be great that the pipeline looks for options in 
> conf/[pipeline_name]-default.conf by default, and override the options using 
> the main arguments.
> Thoughts ?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-137) Add implicit conf/pipeline-default.conf options file

2017-11-08 Thread Nicholas Verbeck (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244950#comment-16244950
 ] 

Nicholas Verbeck commented on BEAM-137:
---

So I'm looking at adding this support to make my life a little easier. What I 
currently have in mind is what [~jbonofre] suggested, with a little extra logic 
to support what [~lcwik] hints at wanting, as well as support for multiple files.

The current idea looks like this:
1) Add addFiles(String... files) to both PipelineOptions and 
PipelineOptions.Builder, allowing any number of files to be read in the order 
the values are passed in, with fromArgs() remaining the final override of all 
values.
2) The value of each file string would be a URI, where the scheme governs both 
the fetch location (local, HTTP, etc.) and the data format. This allows files to 
be stored wherever the user chooses while supporting both the serialized JSON 
and properties formats, and leaves room for more formats in the future.

Example URI: prop+local:///spark/config.prop (a rough sketch of this idea is 
shown below)
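
For illustration only, a rough sketch of that idea; PipelineOptionsFromFiles, 
the scheme handling, and the properties parsing below are hypothetical and are 
not an existing Beam API:

import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch of the proposal above; none of these classes exist in Beam today.
public class PipelineOptionsFromFiles {

  /** Reads each URI in order and flattens the results into --name=value style args. */
  public static String[] toArgs(String... fileUris) throws Exception {
    List<String> args = new ArrayList<>();
    for (String fileUri : fileUris) {
      URI uri = URI.create(fileUri);
      // A scheme such as "prop+local" encodes both format ("prop") and location ("local").
      String[] scheme = uri.getScheme().split("\\+", 2);
      String format = scheme[0];
      String location = scheme.length > 1 ? scheme[1] : "local";

      try (InputStream in = open(location, uri)) {
        if ("prop".equals(format)) {
          Properties props = new Properties();
          props.load(in);
          for (String name : props.stringPropertyNames()) {
            args.add("--" + name + "=" + props.getProperty(name));
          }
        }
        // A "json" format branch could parse the serialized-JSON representation here.
      }
    }
    return args.toArray(new String[0]);
  }

  private static InputStream open(String location, URI uri) throws Exception {
    if ("local".equals(location)) {
      return Files.newInputStream(Paths.get(uri.getPath()));
    }
    // "http", "https", etc. fall back to a plain URL connection in this sketch.
    return new URL(location + ":" + uri.getSchemeSpecificPart()).openStream();
  }
}

With something like this, the example above could feed 
PipelineOptionsFactory.fromArgs(PipelineOptionsFromFiles.toArgs("prop+local:///spark/config.prop")).create(), 
with explicitly passed arguments still overriding file values.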

> Add implicit conf/pipeline-default.conf options file
> 
>
> Key: BEAM-137
> URL: https://issues.apache.org/jira/browse/BEAM-137
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>
> Right now, most of users provide the pipeline options via the main arguments.
> For instance, it's the classic way to provide pipeline input, etc.
> For convenience, it would be great that the pipeline looks for options in 
> conf/[pipeline_name]-default.conf by default, and override the options using 
> the main arguments.
> Thoughts ?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Python_Verify #3508

2017-11-08 Thread Apache Jenkins Server
See 


--
[...truncated 930.53 KB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runn

[jira] [Assigned] (BEAM-3162) Add IBM Streams Runner to compatibility matrix

2017-11-08 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-3162:
---

Assignee: Paul Gerver  (was: Paul Gerver)

> Add IBM Streams Runner to compatibility matrix
> --
>
> Key: BEAM-3162
> URL: https://issues.apache.org/jira/browse/BEAM-3162
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Paul Gerver
>Assignee: Paul Gerver
>Priority: Minor
>  Labels: documentation
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The IBM Streams Runner was announced on 2017-11-02. This task involves adding 
> a Streams Runner column to the existing Beam capability matrix on the Beam 
> website.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (BEAM-3162) Add IBM Streams Runner to compatibility matrix

2017-11-08 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-3162:
---

Assignee: Paul Gerver  (was: Reuven Lax)

> Add IBM Streams Runner to compatibility matrix
> --
>
> Key: BEAM-3162
> URL: https://issues.apache.org/jira/browse/BEAM-3162
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Paul Gerver
>Assignee: Paul Gerver
>Priority: Minor
>  Labels: documentation
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The IBM Streams Runner was announced on 2017-11-02. This task involves adding 
> a Streams Runner column to the existing Beam capability matrix on the Beam 
> website.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Python_Verify #3507

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[robertwb] [BEAM-3040] Disable flaky subprocess variant of ULR test.

--
[...truncated 933.06 KB...]
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_bea

[jira] [Commented] (BEAM-3162) Add IBM Streams Runner to compatibility matrix

2017-11-08 Thread Paul Gerver (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244796#comment-16244796
 ] 

Paul Gerver commented on BEAM-3162:
---

I will request JIRA access on the Beam dev list so this can be assigned to me.

> Add IBM Streams Runner to compatibility matrix
> --
>
> Key: BEAM-3162
> URL: https://issues.apache.org/jira/browse/BEAM-3162
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Paul Gerver
>Assignee: Reuven Lax
>Priority: Minor
>  Labels: documentation
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The IBM Streams Runner was announced on 2017-11-02. This task involves adding 
> a Streams Runner column to the existing Beam capability matrix on the Beam 
> website.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (BEAM-3084) Support for window analytic functions

2017-11-08 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin reassigned BEAM-3084:


Assignee: (was: Xu Mingmin)

> Support for window analytic functions
> -
>
> Key: BEAM-3084
> URL: https://issues.apache.org/jira/browse/BEAM-3084
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>
> Calcite streaming documentation includes examples for using SQL window
> analytic functions
> https://calcite.apache.org/docs/stream.html#sliding-windows
> From: Kobi Salant 
> d...@beam.apache.org



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3147) Nexmark in SQL

2017-11-08 Thread Anton Kedin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244791#comment-16244791
 ] 

Anton Kedin commented on BEAM-3147:
---

Ah, the other related JIRAs are under the 'extensions' component. I will look 
for them there.

> Nexmark in SQL
> --
>
> Key: BEAM-3147
> URL: https://issues.apache.org/jira/browse/BEAM-3147
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Anton Kedin
>Assignee: Anton Kedin
>  Labels: newbie,, nexmark, starter
>
> Currently there is a (Nexmark 
> suite)[https://github.com/apache/beam/tree/master/sdks/java/nexmark] running 
> against the Java SDK. It has a Java object model and runs a set of PTransforms 
> replicating the queries specified in Nexmark.
> The task is to have the same set of queries running on top of Beam SQL.
> References:
> * (Nexmark Paper)[http://datalab.cs.pdx.edu/niagara/pstream/nexmark.pdf]
> * (Nexmark Queries)[http://datalab.cs.pdx.edu/niagara/NEXMark/]
> * (Beam Java Nexmark 
> Suite)[https://github.com/apache/beam/tree/master/sdks/java/nexmark] 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3162) Add IBM Streams Runner to compatibility matrix

2017-11-08 Thread Paul (JIRA)
Paul created BEAM-3162:
--

 Summary: Add IBM Streams Runner to compatibility matrix
 Key: BEAM-3162
 URL: https://issues.apache.org/jira/browse/BEAM-3162
 Project: Beam
  Issue Type: Task
  Components: website
Reporter: Paul
Assignee: Reuven Lax
Priority: Minor


The IBM Streams Runner was announced on 2017-11-02. This task involves adding a 
Streams Runner column to the existing Beam capability matrix on the Beam 
website.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3161) Cannot output with timestamp XXXX

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244776#comment-16244776
 ] 

ASF GitHub Bot commented on BEAM-3161:
--

GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/4101

[BEAM-3161] Cannot output with timestamp 

R: @xumingming 
CC: @takidau 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-3161

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4101


commit b4e8494708d79d0a2a64e86619bd5b99fcf5799d
Author: mingmxu 
Date:   2017-11-08T21:44:18Z

change `withAllowedTimestampSkew`




> Cannot output with timestamp 
> -
>
> Key: BEAM-3161
> URL: https://issues.apache.org/jira/browse/BEAM-3161
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
> Fix For: 2.3.0
>
>
> BeamSQL throws exception when running query
> {code}
> select siteId, count(*), TUMBLE_START(unix_timestamp_to_date(eventTimestamp), 
> INTERVAL '1' HOUR) from RHEOS_SOJEVENT_TOTAL GROUP BY siteId, 
> TUMBLE(unix_timestamp_to_date(eventTimestamp), INTERVAL '1' HOUR)
> {code}
> Exception details:
> {code}
> Caused by: java.lang.IllegalArgumentException: Cannot output with timestamp 
> 2017-11-08T21:37:46.000Z. Output timestamps must be no earlier than the 
> timestamp of the current input (2017-11-08T21:37:49.322Z) minus the allowed 
> skew (0 milliseconds). See the DoFn#getAllowedTimestampSkew() Javadoc for 
> details on changing the allowed skew.
>   at 
> org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:463)
>   at 
> org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:429)
>   at org.apache.Caused by: java.lang.IllegalArgumentException: Cannot 
> output with timestamp 2017-11-08T21:37:46.000Z. Output timestamps must be no 
> earlier than the timestamp of the current input (2017-11-08T21:37:49.322Z) 
> minus the allowed skew (0 milliseconds). See the 
> DoFn#getAllowedTimestampSkew() Javadoc for details on changing the allowed 
> skew.
>   at 
> org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:463)
>   at 
> org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:429)
>   at 
> org.apache.beam.sdk.transforms.WithTimestamps$AddTimestampsDoFn.processElement(WithTimestamps.java:138)beam.sdk.transforms.WithTimestamps$AddTimestampsDoFn.processElement(WithTimestamps.java:138)
> {code}
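
Since the commit above mentions changing `withAllowedTimestampSkew`, here is a 
hedged sketch (not the actual BeamAggregationRel fix) of how a WithTimestamps 
transform can be told to tolerate timestamps earlier than the input element's:

{code}
import org.apache.beam.sdk.transforms.WithTimestamps;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;
import org.joda.time.Instant;

class AllowedSkewSketch {
  // Treats each Long element as epoch millis and assigns it as the event timestamp.
  static PCollection<Long> assignEventTimestamps(PCollection<Long> epochMillis) {
    return epochMillis.apply(
        WithTimestamps.<Long>of((Long millis) -> new Instant(millis))
            // Allow output timestamps earlier than the current element timestamp,
            // which is what the IllegalArgumentException above complains about.
            .withAllowedTimestampSkew(Duration.millis(Long.MAX_VALUE)));
  }
}
{code}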



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4101: [BEAM-3161] Cannot output with timestamp XXXX

2017-11-08 Thread XuMingmin
GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/4101

[BEAM-3161] Cannot output with timestamp 

R: @xumingming 
CC: @takidau 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-3161

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4101


commit b4e8494708d79d0a2a64e86619bd5b99fcf5799d
Author: mingmxu 
Date:   2017-11-08T21:44:18Z

change `withAllowedTimestampSkew`




---


[jira] [Created] (BEAM-3161) Cannot output with timestamp XXXX

2017-11-08 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-3161:


 Summary: Cannot output with timestamp 
 Key: BEAM-3161
 URL: https://issues.apache.org/jira/browse/BEAM-3161
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin
 Fix For: 2.3.0


BeamSQL throws exception when running query
{code}
select siteId, count(*), TUMBLE_START(unix_timestamp_to_date(eventTimestamp), 
INTERVAL '1' HOUR) from RHEOS_SOJEVENT_TOTAL GROUP BY siteId, 
TUMBLE(unix_timestamp_to_date(eventTimestamp), INTERVAL '1' HOUR)
{code}

Exception details:
{code}
Caused by: java.lang.IllegalArgumentException: Cannot output with timestamp 
2017-11-08T21:37:46.000Z. Output timestamps must be no earlier than the 
timestamp of the current input (2017-11-08T21:37:49.322Z) minus the allowed 
skew (0 milliseconds). See the DoFn#getAllowedTimestampSkew() Javadoc for 
details on changing the allowed skew.
at 
org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:463)
at 
org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:429)
at org.apache.Caused by: java.lang.IllegalArgumentException: Cannot 
output with timestamp 2017-11-08T21:37:46.000Z. Output timestamps must be no 
earlier than the timestamp of the current input (2017-11-08T21:37:49.322Z) 
minus the allowed skew (0 milliseconds). See the DoFn#getAllowedTimestampSkew() 
Javadoc for details on changing the allowed skew.
at 
org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.checkTimestamp(SimpleDoFnRunner.java:463)
at 
org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:429)
at 
org.apache.beam.sdk.transforms.WithTimestamps$AddTimestampsDoFn.processElement(WithTimestamps.java:138)beam.sdk.transforms.WithTimestamps$AddTimestampsDoFn.processElement(WithTimestamps.java:138)
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3040) Python precommit timed out after 150 minutes

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244716#comment-16244716
 ] 

ASF GitHub Bot commented on BEAM-3040:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4099


> Python precommit timed out after 150 minutes
> 
>
> Key: BEAM-3040
> URL: https://issues.apache.org/jira/browse/BEAM-3040
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Ahmet Altay
>
> https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/143/consoleFull
> Within about 10 minutes it reaches this point:
> {code}
> ...
> 2017-10-10T03:33:33.591 [INFO] --- findbugs-maven-plugin:3.0.4:check 
> (default) @ beam-sdks-python ---
> 2017-10-10T03:33:33.702 [INFO] 
> 2017-10-10T03:33:33.702 [INFO] --- exec-maven-plugin:1.5.0:exec 
> (setuptools-test) @ beam-sdks-python ---
> {code}
> and the final output is like this:
> {code}
> ...
> 2017-10-10T03:33:33.702 [INFO] --- exec-maven-plugin:1.5.0:exec 
> (setuptools-test) @ beam-sdks-python ---
> docs create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/docs
> GLOB sdist-make: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/setup.py
> lint create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/lint
> py27 create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/py27
> py27cython create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/py27cython
> py27cython installdeps: nose==1.3.7, grpcio-tools==1.3.5, cython==0.25.2
> docs installdeps: nose==1.3.7, grpcio-tools==1.3.5, Sphinx==1.5.5, 
> sphinx_rtd_theme==0.2.4
> lint installdeps: nose==1.3.7, pycodestyle==2.3.1, pylint==1.7.1
> py27 installdeps: nose==1.3.7, grpcio-tools==1.3.5
> lint inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27 inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27cython inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27 runtests: PYTHONHASHSEED='2225684666'
> py27 runtests: commands[0] | python --version
> py27 runtests: commands[1] | - find apache_beam -type f -name *.pyc -delete
> py27 runtests: commands[2] | pip install -e .[test]
> lint runtests: PYTHONHASHSEED='2225684666'
> lint runtests: commands[0] | time pip install -e .[test]
> docs inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27 runtests: commands[3] | python 
> apache_beam/examples/complete/autocomplete_test.py
> lint runtests: commands[1] | time 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/run_pylint.sh
> py27 runtests: commands[4] | python setup.py test
> docs runtests: PYTHONHASHSEED='2225684666'
> docs runtests: commands[0] | time pip install -e .[test,gcp,docs]
> docs runtests: commands[1] | time 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/generate_pydoc.sh
> py27gcp create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/py27gcp
> py27gcp installdeps: nose==1.3.7
> py27cython runtests: PYTHONHASHSEED='2225684666'
> py27cython runtests: commands[0] | python --version
> py27cython runtests: commands[1] | - find apache_beam -type f -name *.pyc 
> -delete
> py27cython runtests: commands[2] | - find apache_beam -type f -name *.c 
> -delete
> py27cython runtests: commands[3] | - find apache_beam -type f -name *.so 
> -delete
> py27cython runtests: commands[4] | - find target/build -type f -name *.c 
> -delete
> py27cython runtests: commands[5] | - find target/build -type f -name *.so 
> -delete
> py27cython runtests: commands[6] | time pip install -e .[test]
> py27gcp inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27gcp runtests: PYTHONHASHSEED='2225684666'
> py27gcp runtests: commands[0] | pip install -e .[test,gcp]
> py27gcp runtests: commands[1] | python --version
> py27gcp runtests: commands[2] | - find apache_beam -type f -name *.pyc -delete
> py27gcp runtests: commands[3] | python 
> apache_beam/examples/complete/autocomplete_test.py
> py27gcp runtests: commands[4] | python setup.py test
> py27cython runtests: commands[7] | python 
> apache_beam/examples/c

[jira] [Updated] (BEAM-3160) Type based coder inference incorrectly assumes that a coder for one type is equivalent to every other coder for that type.

2017-11-08 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-3160:

Description: 
We should prevent coder inference from assuming that two coders for the same 
type are interchangeable.

Two Avro GenericRecord coders with different schemas are considered identical 
and an arbitrary one is returned by the Coder/Type inference system if the 
GenericRecord type appears multiple times.
e.g.
*KvCoder.of(IterableCoder.of(AvroCoder.of(SchemaA)), 
IterableCoder.of(AvroCoder.of(SchemaB)))* after coder inference for the type 
*KV<Iterable<GenericRecord>, Iterable<GenericRecord>>* will return 
*KvCoder.of(IterableCoder.of(AvroCoder.of(SchemaX)), 
IterableCoder.of(AvroCoder.of(SchemaX)))* where SchemaX is either SchemaA or 
SchemaB.

Code:
https://github.com/apache/beam/blob/v2.1.1/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L420
 and other Type -> Coder maps in the same file should prevent insertion if the 
type already exists and the coders aren't equal.

  was:
We should prevent coder inference from assuming that two coders for the same 
type are interchangeable.

Two Avro GenericRecord coders with different schemas are considered identical 
and an arbitrary one is returned by the Coder/Type inference system if the 
GenericRecord type appears multiple times.
e.g.
*KvCoder.of(AvroCoder.of(SchemaA), AvroCoder.of(SchemaB))* after coder 
inference for the type *KV<GenericRecord, GenericRecord>* will return 
*KvCoder.of(AvroCoder.of(SchemaX), AvroCoder.of(SchemaX))* where SchemaX is 
either SchemaA or SchemaB.

Code:
https://github.com/apache/beam/blob/v2.1.1/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L420
 and other Type -> Coder maps in the same file should prevent insertion if the 
type already exists.


> Type based coder inference incorrectly assumes that a coder for one type is 
> equivalent to every other coder for that type.
> --
>
> Key: BEAM-3160
> URL: https://issues.apache.org/jira/browse/BEAM-3160
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
> Fix For: 2.3.0
>
>
> We should prevent coder inference from assuming that two coders for the same 
> type are interchangeable.
> Two Avro GenericRecord coders with different schemas are considered identical 
> and an arbitrary one is returned by the Coder/Type inference system if the 
> GenericRecord type appears multiple times.
> e.g.
> *KvCoder.of(IterableCoder.of(AvroCoder.of(SchemaA)), 
> IterableCoder.of(AvroCoder.of(SchemaB)))* after coder inference for the type 
> *KV<Iterable<GenericRecord>, Iterable<GenericRecord>>* will return 
> *KvCoder.of(IterableCoder.of(AvroCoder.of(SchemaX)), 
> IterableCoder.of(AvroCoder.of(SchemaX)))* where SchemaX is either SchemaA or 
> SchemaB.
> Code:
> https://github.com/apache/beam/blob/v2.1.1/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L420
>  and other Type -> Coder maps in the same file should prevent insertion if 
> the type already exists and the coders aren't equal.
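
As a hedged illustration of the pitfall described above (the two schemas are 
invented for the example), the following builds two AvroCoders that encode the 
same Java type but must not be treated as interchangeable:

{code}
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.coders.AvroCoder;
import org.apache.beam.sdk.coders.IterableCoder;
import org.apache.beam.sdk.coders.KvCoder;

class CoderInferenceSketch {
  public static void main(String[] args) {
    // Two different Avro schemas, but both coders are for the same Java type: GenericRecord.
    Schema schemaA = SchemaBuilder.record("A").fields().requiredString("x").endRecord();
    Schema schemaB = SchemaBuilder.record("B").fields().requiredLong("y").endRecord();
    AvroCoder<GenericRecord> coderA = AvroCoder.of(schemaA);
    AvroCoder<GenericRecord> coderB = AvroCoder.of(schemaB);

    // The coder that preserves both schemas.
    KvCoder<Iterable<GenericRecord>, Iterable<GenericRecord>> wanted =
        KvCoder.of(IterableCoder.of(coderA), IterableCoder.of(coderB));

    // A Type -> Coder map keyed only on GenericRecord cannot tell coderA and coderB
    // apart, so inference may return the same schema on both sides, dropping one of them.
    System.out.println(wanted);
  }
}
{code}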



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[1/2] beam git commit: [BEAM-3040] Disable flaky subprocess variant of ULR test.

2017-11-08 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master 598774738 -> 29399fde8


[BEAM-3040] Disable flaky subprocess variant of ULR test.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/30740a53
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/30740a53
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/30740a53

Branch: refs/heads/master
Commit: 30740a53588194459e23e7b8ddc69a3ab506e2cb
Parents: 5987747
Author: Robert Bradshaw 
Authored: Wed Nov 8 11:00:33 2017 -0800
Committer: Robert Bradshaw 
Committed: Wed Nov 8 11:00:33 2017 -0800

--
 .../apache_beam/runners/portability/universal_local_runner_test.py  | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/30740a53/sdks/python/apache_beam/runners/portability/universal_local_runner_test.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/universal_local_runner_test.py 
b/sdks/python/apache_beam/runners/portability/universal_local_runner_test.py
index 4c8cedc..e1104dc 100644
--- a/sdks/python/apache_beam/runners/portability/universal_local_runner_test.py
+++ b/sdks/python/apache_beam/runners/portability/universal_local_runner_test.py
@@ -75,6 +75,7 @@ class 
UniversalLocalRunnerTestWithGrpc(UniversalLocalRunnerTest):
   _use_grpc = True
 
 
+@unittest.skip("BEAM-3040")
 class UniversalLocalRunnerTestWithSubprocesses(UniversalLocalRunnerTest):
   _use_grpc = True
   _use_subprocesses = True



[GitHub] beam pull request #4099: [BEAM-3040] Disable flaky subprocess variant of ULR...

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4099


---


[2/2] beam git commit: Closes #4099

2017-11-08 Thread robertwb
Closes #4099


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/29399fde
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/29399fde
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/29399fde

Branch: refs/heads/master
Commit: 29399fde81f6d12af018d1bab26c59441a899789
Parents: 5987747 30740a5
Author: Robert Bradshaw 
Authored: Wed Nov 8 12:59:12 2017 -0800
Committer: Robert Bradshaw 
Committed: Wed Nov 8 12:59:12 2017 -0800

--
 .../apache_beam/runners/portability/universal_local_runner_test.py  | 1 +
 1 file changed, 1 insertion(+)
--




[GitHub] beam pull request #4100: Dataflow: Add option to upload heap dumps to GCS

2017-11-08 Thread bjchambers
GitHub user bjchambers opened a pull request:

https://github.com/apache/beam/pull/4100

Dataflow: Add option to upload heap dumps to GCS

This flag needs to go in before backend runner code that reads it. It
will have no effect until that code is deployed.
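
For context, custom Dataflow flags like this are usually surfaced as a 
PipelineOptions property; a hedged sketch follows, where the interface and 
option name are hypothetical rather than what this PR actually adds:

import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;

// Hypothetical illustration of a debug flag; the real option may have a different
// name and live on an existing Dataflow options interface.
public interface HeapDumpUploadOptions extends PipelineOptions {
  @Description("GCS path to which worker heap dumps are uploaded, e.g. gs://bucket/heap-dumps")
  String getSaveHeapDumpsToGcsPath();

  void setSaveHeapDumpsToGcsPath(String value);
}

Until the backend code that reads the value is deployed, setting such a flag 
would simply be a no-op, which matches the note above.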

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [*] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [*] Each commit in the pull request should have a meaningful subject 
line and body.
 - [*] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [*] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [*] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [*] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bjchambers/beam heap-dump-flag

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4100.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4100


commit 2db71dd77d99e0117024658104b1040ec87f896b
Author: bchambers 
Date:   2017-11-08T20:50:44Z

Dataflow: Add option to upload heap dumps to GCS

This flag needs to go in before backend runner code that reads it. It
will have no effect until that code is deployed.




---


Build failed in Jenkins: beam_PostCommit_Python_Verify #3506

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Ensure Kafka sink serializers are set.

[mingmxu] [BEAM-2528] create table

--
[...truncated 931.71 KB...]
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3

[jira] [Commented] (BEAM-3040) Python precommit timed out after 150 minutes

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244548#comment-16244548
 ] 

ASF GitHub Bot commented on BEAM-3040:
--

GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/4099

[BEAM-3040] Disable flaky subprocess variant of ULR test.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam beam-3040

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4099.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4099


commit 30740a53588194459e23e7b8ddc69a3ab506e2cb
Author: Robert Bradshaw 
Date:   2017-11-08T19:00:33Z

[BEAM-3040] Disable flaky subprocess variant of ULR test.




> Python precommit timed out after 150 minutes
> 
>
> Key: BEAM-3040
> URL: https://issues.apache.org/jira/browse/BEAM-3040
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Ahmet Altay
>
> https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/143/consoleFull
> Within about 10 minutes it reaches this point:
> {code}
> ...
> 2017-10-10T03:33:33.591 [INFO] --- findbugs-maven-plugin:3.0.4:check 
> (default) @ beam-sdks-python ---
> 2017-10-10T03:33:33.702 [INFO] 
> 2017-10-10T03:33:33.702 [INFO] --- exec-maven-plugin:1.5.0:exec 
> (setuptools-test) @ beam-sdks-python ---
> {code}
> and the final output is like this:
> {code}
> ...
> 2017-10-10T03:33:33.702 [INFO] --- exec-maven-plugin:1.5.0:exec 
> (setuptools-test) @ beam-sdks-python ---
> docs create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/docs
> GLOB sdist-make: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/setup.py
> lint create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/lint
> py27 create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/py27
> py27cython create: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/py27cython
> py27cython installdeps: nose==1.3.7, grpcio-tools==1.3.5, cython==0.25.2
> docs installdeps: nose==1.3.7, grpcio-tools==1.3.5, Sphinx==1.5.5, 
> sphinx_rtd_theme==0.2.4
> lint installdeps: nose==1.3.7, pycodestyle==2.3.1, pylint==1.7.1
> py27 installdeps: nose==1.3.7, grpcio-tools==1.3.5
> lint inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27 inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27cython inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27 runtests: PYTHONHASHSEED='2225684666'
> py27 runtests: commands[0] | python --version
> py27 runtests: commands[1] | - find apache_beam -type f -name *.pyc -delete
> py27 runtests: commands[2] | pip install -e .[test]
> lint runtests: PYTHONHASHSEED='2225684666'
> lint runtests: commands[0] | time pip install -e .[test]
> docs inst: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_MavenInstall/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
> py27 runtests: commands[3] | python 
> apache_beam/examples/complete/autocomplete_test.py
> lint runtests: commands[1] | time 
> /home/jenkins/jenkins-slave/workspace/beam_

[GitHub] beam pull request #4099: [BEAM-3040] Disable flaky subprocess variant of ULR...

2017-11-08 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/4099

[BEAM-3040] Disable flaky subprocess variant of ULR test.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam beam-3040

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4099.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4099


commit 30740a53588194459e23e7b8ddc69a3ab506e2cb
Author: Robert Bradshaw 
Date:   2017-11-08T19:00:33Z

[BEAM-3040] Disable flaky subprocess variant of ULR test.




---


[jira] [Created] (BEAM-3160) Type based coder inference incorrectly assumes that a coder for one type is equivalent to every other coder for that type.

2017-11-08 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-3160:
---

 Summary: Type based coder inference incorrectly assumes that a 
coder for one type is equivalent to every other coder for that type.
 Key: BEAM-3160
 URL: https://issues.apache.org/jira/browse/BEAM-3160
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Luke Cwik
 Fix For: 2.3.0


We should prevent coder inference from assuming that two coders for the same 
type are interchangeable.

Two Avro GenericRecord coders with different schemas are considered identical 
and an arbitrary one is returned by the Coder/Type inference system if the 
GenericRecord type appears multiple times.
e.g.
*KvCoder.of(AvroCoder.of(SchemaA), AvroCoder.of(SchemaB))* after coder 
inference for the type *KV<GenericRecord, GenericRecord>* will return 
*KvCoder.of(AvroCoder.of(SchemaX), AvroCoder.of(SchemaX))* where SchemaX is 
either SchemaA or SchemaB.

Code:
https://github.com/apache/beam/blob/v2.1.1/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L420
 and other Type -> Coder maps in the same file should prevent insertion if the 
type already exists.
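
For illustration only, a minimal sketch (the schema variables and helper name are assumptions, not code from CoderRegistry) of the user-level workaround of pinning the KV coder explicitly instead of relying on inference:
{code}
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.coders.AvroCoder;
import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

/** Pins the coder so inference cannot collapse the two GenericRecord coders into one. */
static void pinKvCoder(
    PCollection<KV<GenericRecord, GenericRecord>> kvs, Schema schemaA, Schema schemaB) {
  kvs.setCoder(KvCoder.of(AvroCoder.of(schemaA), AvroCoder.of(schemaB)));
}
{code}
The fix proposed above still belongs in CoderRegistry; the sketch only shows how a pipeline author can avoid the ambiguity in the meantime.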



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PerformanceTests_Python #537

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[echauchot] [BEAM-2853] remove Nexmark README.md now that the doc is in the 
website

[chamikara] Added vcf file io source and modified _TextSource to optionally 
handle

[chamikara] Ensure Kafka sink serializers are set.

[mingmxu] [BEAM-2528] create table

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 598774738e7a1236cf30f70a584311cee52d1818 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 598774738e7a1236cf30f70a584311cee52d1818
Commit message: "This closes #4041"
 > git rev-list e0166ceb657b393e4037ac7c178797a864379745 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2616947547150389152.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7375445932597836410.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3948741543220817579.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: xmltodict in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests-ntlm>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests>=2.9.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBench

[jira] [Commented] (BEAM-3159) DoFnTester should be deprecated in favor of TestPipeline

2017-11-08 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244421#comment-16244421
 ] 

Kenneth Knowles commented on BEAM-3159:
---

Absolutely agree. It is pretty broken and harder to use than the DirectRunner.

> DoFnTester should be deprecated in favor of TestPipeline
> 
>
> Key: BEAM-3159
> URL: https://issues.apache.org/jira/browse/BEAM-3159
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Kenneth Knowles
>Priority: Minor
>
> Reasons:
> 1. The logical unit within a Beam pipeline is a transform, either a small 
> transform like a ParDo or a larger composite transform. Unit tests should 
> focus on these units, rather than probing specific behaviors of the 
> user-defined functions.
> 2. The way that a runner interacts with a user-defined function is not 
> necessarily obvious. DoFnTester allows testing nonsensical cases that 
> wouldn't arise in practice, since it allows low-level interactions with the 
> actual UDFs.
> Instead, we should encourage the use of TestPipeline with the direct runner. 
> This allows testing a single transform (such as a ParDo running a UDF) in 
> context. It also makes it easier to test things like side-inputs and multiple 
> outputs, since you use the same techniques in the test as you would in a real 
> pipeline, rather than requiring a whole new API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Python_Verify #3505

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Added vcf file io source and modified _TextSource to optionally 
handle

--
[...truncated 928.31 KB...]
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.

[jira] [Created] (BEAM-3159) DoFnTester should be deprecated in favor of TestPipeline

2017-11-08 Thread Ben Chambers (JIRA)
Ben Chambers created BEAM-3159:
--

 Summary: DoFnTester should be deprecated in favor of TestPipeline
 Key: BEAM-3159
 URL: https://issues.apache.org/jira/browse/BEAM-3159
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Ben Chambers
Assignee: Kenneth Knowles
Priority: Minor


Reasons:

1. The logical unit within a Beam pipeline is a transform, either a small 
transform like a ParDo or a larger composite transform. Unit tests should focus 
on these units, rather than probing specific behaviors of the user-defined 
functions.

2. The way that a runner interacts with a user-defined function is not 
necessarily obvious. DoFnTester allows testing nonsensical cases that wouldn't 
arise in practice, since it allows low-level interactions with the actual UDFs.

Instead, we should encourage the use of TestPipeline with the direct runner. 
This allows testing a single transform (such as a ParDo running a UDF) in 
context. It also makes it easier to test things like side-inputs and multiple 
outputs, since you use the same techniques in the test as you would in a real 
pipeline, rather than requiring a whole new API.
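
As a rough sketch of the suggested style (the class, DoFn, and test names here are made up for illustration, not taken from the Beam codebase), a ParDo can be unit-tested through TestPipeline like this:
{code}
import java.util.Arrays;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;

public class UpperCaseFnTest {

  /** A trivial DoFn standing in for the user-defined function under test. */
  static class UpperCaseFn extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      c.output(c.element().toUpperCase());
    }
  }

  @Rule public TestPipeline pipeline = TestPipeline.create();

  @Test
  public void testUpperCase() {
    PCollection<String> output = pipeline
        .apply(Create.of(Arrays.asList("foo", "bar")))
        .apply(ParDo.of(new UpperCaseFn()));
    PAssert.that(output).containsInAnyOrder("FOO", "BAR");
    pipeline.run();
  }
}
{code}
The same pattern extends naturally to side inputs and multiple outputs, which is the point being made above.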



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2528) BeamSql: support create table

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244382#comment-16244382
 ] 

ASF GitHub Bot commented on BEAM-2528:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4041


> BeamSql: support create table
> -
>
> Key: BEAM-2528
> URL: https://issues.apache.org/jira/browse/BEAM-2528
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
>
> support create table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4041: [BEAM-2528] BeamSQL DDL :: CreateTable

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4041


---


[4/4] beam git commit: This closes #4041

2017-11-08 Thread mingmxu
This closes #4041


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/59877473
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/59877473
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/59877473

Branch: refs/heads/master
Commit: 598774738e7a1236cf30f70a584311cee52d1818
Parents: ae45bbd 208fc38
Author: mingmxu 
Authored: Wed Nov 8 09:18:48 2017 -0800
Committer: mingmxu 
Committed: Wed Nov 8 09:18:48 2017 -0800

--
 pom.xml |   2 +
 .../src/main/resources/beam/findbugs-filter.xml |   7 +
 sdks/java/extensions/sql/pom.xml| 148 ++-
 .../extensions/sql/src/main/codegen/config.fmpp |  23 +++
 .../sql/src/main/codegen/data/Parser.tdd|  75 
 .../sql/src/main/codegen/includes/license.ftl   |  17 ++
 .../src/main/codegen/includes/parserImpls.ftl   |  89 +
 .../beam/sdk/extensions/sql/BeamSqlCli.java | 115 
 .../beam/sdk/extensions/sql/BeamSqlTable.java   |  54 ++
 .../sdk/extensions/sql/impl/BeamSqlCli.java |  65 ---
 .../sdk/extensions/sql/impl/BeamSqlEnv.java |   3 +-
 .../sql/impl/parser/BeamSqlParser.java  |  50 +
 .../sql/impl/parser/ColumnConstraint.java   |  42 +
 .../sql/impl/parser/ColumnDefinition.java   |  56 ++
 .../extensions/sql/impl/parser/ParserUtils.java |  64 +++
 .../sql/impl/parser/SqlCreateTable.java | 141 ++
 .../sql/impl/parser/SqlDDLKeywords.java |  27 +++
 .../extensions/sql/impl/parser/UnparseUtil.java |  59 ++
 .../sql/impl/parser/package-info.java   |  22 +++
 .../sql/impl/planner/BeamQueryPlanner.java  |   2 +-
 .../extensions/sql/impl/rel/BeamIOSinkRel.java  |   2 +-
 .../sql/impl/rel/BeamIOSourceRel.java   |   2 +-
 .../sql/impl/schema/BaseBeamTable.java  |   9 +-
 .../sql/impl/schema/BeamSqlTable.java   |  54 --
 .../sql/impl/schema/BeamTableUtils.java |   4 +-
 .../impl/schema/kafka/BeamKafkaCSVTable.java| 109 ---
 .../sql/impl/schema/kafka/BeamKafkaTable.java   | 109 ---
 .../sql/impl/schema/kafka/package-info.java |  22 ---
 .../sql/impl/schema/text/BeamTextCSVTable.java  |  70 ---
 .../schema/text/BeamTextCSVTableIOReader.java   |  58 --
 .../schema/text/BeamTextCSVTableIOWriter.java   |  58 --
 .../sql/impl/schema/text/BeamTextTable.java |  41 
 .../sql/impl/schema/text/package-info.java  |  22 ---
 .../beam/sdk/extensions/sql/meta/Column.java|  51 +
 .../beam/sdk/extensions/sql/meta/Table.java |  69 +++
 .../sdk/extensions/sql/meta/package-info.java   |  22 +++
 .../extensions/sql/meta/provider/MetaUtils.java |  40 
 .../sql/meta/provider/TableProvider.java|  61 ++
 .../meta/provider/kafka/BeamKafkaCSVTable.java  | 111 +++
 .../sql/meta/provider/kafka/BeamKafkaTable.java | 115 
 .../meta/provider/kafka/KafkaTableProvider.java |  82 
 .../sql/meta/provider/kafka/package-info.java   |  22 +++
 .../sql/meta/provider/package-info.java |  22 +++
 .../meta/provider/text/BeamTextCSVTable.java|  80 
 .../provider/text/BeamTextCSVTableIOReader.java |  59 ++
 .../provider/text/BeamTextCSVTableIOWriter.java |  59 ++
 .../sql/meta/provider/text/BeamTextTable.java   |  41 
 .../meta/provider/text/TextTableProvider.java   |  83 +
 .../sql/meta/provider/text/package-info.java|  22 +++
 .../sql/meta/store/InMemoryMetaStore.java   | 113 +++
 .../extensions/sql/meta/store/MetaStore.java|  56 ++
 .../extensions/sql/meta/store/package-info.java |  22 +++
 .../extensions/sql/BeamSqlApiSurfaceTest.java   |   7 +-
 .../beam/sdk/extensions/sql/BeamSqlCliTest.java |  75 
 .../beam/sdk/extensions/sql/TestUtils.java  |   2 +-
 .../sql/impl/parser/BeamSqlParserTest.java  | 167 +
 .../schema/kafka/BeamKafkaCSVTableTest.java | 107 ---
 .../impl/schema/text/BeamTextCSVTableTest.java  | 176 --
 .../provider/kafka/BeamKafkaCSVTableTest.java   | 107 +++
 .../provider/kafka/KafkaTableProviderTest.java  |  76 
 .../provider/text/BeamTextCSVTableTest.java | 176 ++
 .../provider/text/TextTableProviderTest.java|  87 +
 .../sql/meta/store/InMemoryMetaStoreTest.java   | 185 +++
 .../sql/mock/MockedUnboundedTable.java  |   2 +-
 64 files changed, 3011 insertions(+), 907 deletions(-)
--




[2/4] beam git commit: [BEAM-2528] create table

2017-11-08 Thread mingmxu
http://git-wip-us.apache.org/repos/asf/beam/blob/208fc38a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Column.java
--
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Column.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Column.java
new file mode 100644
index 000..9bcc16a
--- /dev/null
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Column.java
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.extensions.sql.meta;
+
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import javax.annotation.Nullable;
+
+/**
+ * Metadata class for a {@code BeamSqlTable} column.
+ */
+@AutoValue
+public abstract class Column implements Serializable {
+  public abstract String getName();
+  public abstract Integer getType();
+  @Nullable
+  public abstract String getComment();
+  public abstract boolean isPrimaryKey();
+
+  public static Builder builder() {
+return new 
org.apache.beam.sdk.extensions.sql.meta.AutoValue_Column.Builder();
+  }
+
+  /**
+   * Builder class for {@link Column}.
+   */
+  @AutoValue.Builder
+  public abstract static class Builder {
+public abstract Builder name(String name);
+public abstract Builder type(Integer type);
+public abstract Builder comment(String comment);
+public abstract Builder primaryKey(boolean isPrimaryKey);
+public abstract Column build();
+  }
+}

http://git-wip-us.apache.org/repos/asf/beam/blob/208fc38a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Table.java
--
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Table.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Table.java
new file mode 100644
index 000..4af82a0
--- /dev/null
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Table.java
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.extensions.sql.meta;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.auto.value.AutoValue;
+import java.io.Serializable;
+import java.net.URI;
+import java.util.List;
+import javax.annotation.Nullable;
+
+/**
+ * Represents the metadata of a {@code BeamSqlTable}.
+ */
+@AutoValue
+public abstract class Table implements Serializable {
+  /** type of the table. */
+  public abstract String getType();
+  public abstract String getName();
+  public abstract List getColumns();
+  @Nullable
+  public abstract String getComment();
+  @Nullable
+  public abstract URI getLocation();
+  @Nullable
+  public abstract JSONObject getProperties();
+
+  public static Builder builder() {
+return new 
org.apache.beam.sdk.extensions.sql.meta.AutoValue_Table.Builder();
+  }
+
+  public String getLocationAsString() {
+if (getLocation() == null) {
+  return null;
+}
+
+return "/" + getLocation().getHost() + getLocation().getPath();
+  }
+
+  /**
+   * Builder class for {@link Table}.
+   */
+  @AutoValue.Builder
+  public abstract static class Builder {
+public abstract Builder type(String type);
+public abstract Builder name(St

[1/4] beam git commit: [BEAM-2528] create table

2017-11-08 Thread mingmxu
Repository: beam
Updated Branches:
  refs/heads/master ae45bbd63 -> 598774738


http://git-wip-us.apache.org/repos/asf/beam/blob/208fc38a/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/BeamTextCSVTableTest.java
--
diff --git 
a/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/BeamTextCSVTableTest.java
 
b/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/BeamTextCSVTableTest.java
new file mode 100644
index 000..0f1085f
--- /dev/null
+++ 
b/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/BeamTextCSVTableTest.java
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.extensions.sql.meta.provider.text;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.PrintStream;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import org.apache.beam.sdk.extensions.sql.BeamRecordSqlType;
+import org.apache.beam.sdk.extensions.sql.impl.planner.BeamQueryPlanner;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.values.BeamRecord;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeFactory;
+import org.apache.calcite.rel.type.RelProtoDataType;
+import org.apache.calcite.sql.type.SqlTypeName;
+import org.apache.commons.csv.CSVFormat;
+import org.apache.commons.csv.CSVPrinter;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/**
+ * Tests for {@code BeamTextCSVTable}.
+ */
+public class BeamTextCSVTableTest {
+
+  @Rule public TestPipeline pipeline = TestPipeline.create();
+  @Rule public TestPipeline pipeline2 = TestPipeline.create();
+
+  /**
+   * testData.
+   *
+   * 
+   * The types of the csv fields are:
+   * integer,bigint,float,double,string
+   * 
+   */
+  private static Object[] data1 = new Object[] { 1, 1L, 1.1F, 1.1, "james" };
+  private static Object[] data2 = new Object[] { 2, 2L, 2.2F, 2.2, "bond" };
+
+  private static List testData = Arrays.asList(data1, data2);
+  private static List testDataRows = new ArrayList() {{
+for (Object[] data : testData) {
+  add(buildRow(data));
+}
+  }};
+
+  private static Path tempFolder;
+  private static File readerSourceFile;
+  private static File writerTargetFile;
+
+  @Test public void testBuildIOReader() {
+PCollection rows = new BeamTextCSVTable(buildBeamSqlRowType(),
+readerSourceFile.getAbsolutePath()).buildIOReader(pipeline);
+PAssert.that(rows).containsInAnyOrder(testDataRows);
+pipeline.run();
+  }
+
+  @Test public void testBuildIOWriter() {
+new BeamTextCSVTable(buildBeamSqlRowType(),
+readerSourceFile.getAbsolutePath()).buildIOReader(pipeline)
+.apply(new BeamTextCSVTable(buildBeamSqlRowType(), 
writerTargetFile.getAbsolutePath())
+.buildIOWriter());
+pipeline.run();
+
+PCollection rows = new BeamTextCSVTable(buildBeamSqlRowType(),
+writerTargetFile.getAbsolutePath()).buildIOReader(pipeline2);
+
+// confirm the two reads match
+PAssert.that(rows).containsInAnyOrder(testDataRows);
+pipeline2.run();
+  }
+
+  @BeforeClass public static void setUp() throws IOException {
+tempFolder = Files.createTempDirectory("BeamTextTableTest");
+readerSourceFile = writeToFile(testData, "readerSourceFile.txt");
+writerTargetFile = writeToFile(testData, "writerTargetFile.txt");
+  }
+
+  @AfterClass public static void teardownClass() throws IOException {
+Files.walkFileTree(tempFolder, new SimpleFileVisitor() {
+
+  @Override publ

[3/4] beam git commit: [BEAM-2528] create table

2017-11-08 Thread mingmxu
[BEAM-2528] create table


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/208fc38a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/208fc38a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/208fc38a

Branch: refs/heads/master
Commit: 208fc38ac5ea15caaa606a9d77a78a4448df1d73
Parents: ae45bbd
Author: James Xu 
Authored: Wed Sep 13 20:36:37 2017 +0800
Committer: mingmxu 
Committed: Wed Nov 8 09:18:11 2017 -0800

--
 pom.xml |   2 +
 .../src/main/resources/beam/findbugs-filter.xml |   7 +
 sdks/java/extensions/sql/pom.xml| 148 ++-
 .../extensions/sql/src/main/codegen/config.fmpp |  23 +++
 .../sql/src/main/codegen/data/Parser.tdd|  75 
 .../sql/src/main/codegen/includes/license.ftl   |  17 ++
 .../src/main/codegen/includes/parserImpls.ftl   |  89 +
 .../beam/sdk/extensions/sql/BeamSqlCli.java | 115 
 .../beam/sdk/extensions/sql/BeamSqlTable.java   |  54 ++
 .../sdk/extensions/sql/impl/BeamSqlCli.java |  65 ---
 .../sdk/extensions/sql/impl/BeamSqlEnv.java |   3 +-
 .../sql/impl/parser/BeamSqlParser.java  |  50 +
 .../sql/impl/parser/ColumnConstraint.java   |  42 +
 .../sql/impl/parser/ColumnDefinition.java   |  56 ++
 .../extensions/sql/impl/parser/ParserUtils.java |  64 +++
 .../sql/impl/parser/SqlCreateTable.java | 141 ++
 .../sql/impl/parser/SqlDDLKeywords.java |  27 +++
 .../extensions/sql/impl/parser/UnparseUtil.java |  59 ++
 .../sql/impl/parser/package-info.java   |  22 +++
 .../sql/impl/planner/BeamQueryPlanner.java  |   2 +-
 .../extensions/sql/impl/rel/BeamIOSinkRel.java  |   2 +-
 .../sql/impl/rel/BeamIOSourceRel.java   |   2 +-
 .../sql/impl/schema/BaseBeamTable.java  |   9 +-
 .../sql/impl/schema/BeamSqlTable.java   |  54 --
 .../sql/impl/schema/BeamTableUtils.java |   4 +-
 .../impl/schema/kafka/BeamKafkaCSVTable.java| 109 ---
 .../sql/impl/schema/kafka/BeamKafkaTable.java   | 109 ---
 .../sql/impl/schema/kafka/package-info.java |  22 ---
 .../sql/impl/schema/text/BeamTextCSVTable.java  |  70 ---
 .../schema/text/BeamTextCSVTableIOReader.java   |  58 --
 .../schema/text/BeamTextCSVTableIOWriter.java   |  58 --
 .../sql/impl/schema/text/BeamTextTable.java |  41 
 .../sql/impl/schema/text/package-info.java  |  22 ---
 .../beam/sdk/extensions/sql/meta/Column.java|  51 +
 .../beam/sdk/extensions/sql/meta/Table.java |  69 +++
 .../sdk/extensions/sql/meta/package-info.java   |  22 +++
 .../extensions/sql/meta/provider/MetaUtils.java |  40 
 .../sql/meta/provider/TableProvider.java|  61 ++
 .../meta/provider/kafka/BeamKafkaCSVTable.java  | 111 +++
 .../sql/meta/provider/kafka/BeamKafkaTable.java | 115 
 .../meta/provider/kafka/KafkaTableProvider.java |  82 
 .../sql/meta/provider/kafka/package-info.java   |  22 +++
 .../sql/meta/provider/package-info.java |  22 +++
 .../meta/provider/text/BeamTextCSVTable.java|  80 
 .../provider/text/BeamTextCSVTableIOReader.java |  59 ++
 .../provider/text/BeamTextCSVTableIOWriter.java |  59 ++
 .../sql/meta/provider/text/BeamTextTable.java   |  41 
 .../meta/provider/text/TextTableProvider.java   |  83 +
 .../sql/meta/provider/text/package-info.java|  22 +++
 .../sql/meta/store/InMemoryMetaStore.java   | 113 +++
 .../extensions/sql/meta/store/MetaStore.java|  56 ++
 .../extensions/sql/meta/store/package-info.java |  22 +++
 .../extensions/sql/BeamSqlApiSurfaceTest.java   |   7 +-
 .../beam/sdk/extensions/sql/BeamSqlCliTest.java |  75 
 .../beam/sdk/extensions/sql/TestUtils.java  |   2 +-
 .../sql/impl/parser/BeamSqlParserTest.java  | 167 +
 .../schema/kafka/BeamKafkaCSVTableTest.java | 107 ---
 .../impl/schema/text/BeamTextCSVTableTest.java  | 176 --
 .../provider/kafka/BeamKafkaCSVTableTest.java   | 107 +++
 .../provider/kafka/KafkaTableProviderTest.java  |  76 
 .../provider/text/BeamTextCSVTableTest.java | 176 ++
 .../provider/text/TextTableProviderTest.java|  87 +
 .../sql/meta/store/InMemoryMetaStoreTest.java   | 185 +++
 .../sql/mock/MockedUnboundedTable.java  |   2 +-
 64 files changed, 3011 insertions(+), 907 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/208fc38a/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 7efb23d..adfef71 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1593,6 +1593,8 @@
   maven-javadoc-plugin
   ${ma

[jira] [Commented] (BEAM-2257) KafkaIO write without key requires a producer fn

2017-11-08 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244375#comment-16244375
 ] 

Raghu Angadi commented on BEAM-2257:


Thanks [~nerdynick] for the fix. [~jbonofre] please resolve this when you get a 
chance.

> KafkaIO write without key requires a producer fn
> 
>
> Key: BEAM-2257
> URL: https://issues.apache.org/jira/browse/BEAM-2257
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: 2.3.0
>
>
> The {{KafkaIO}} javadoc says that it's possible to write {{String}} values 
> directly to the topic without a key:
> {code}
>PCollection strings = ...;
>strings.apply(KafkaIO.write()
>.withBootstrapServers("broker_1:9092,broker_2:9092")
>.withTopic("results")
>.withValueSerializer(new StringSerializer()) // just need serializer 
> for value
>.values()
>  );
> {code}
> This is not fully correct:
> 1. {{withValueSerializer()}} requires a serializer class, not an instance. 
> So, it should be {{withValueSerializer(StringSerializer.class)}}.
> 2. As the key serializer is not provided, a Kafka producer fn is required; 
> otherwise, the user will get:
> {code}
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> producer
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:321)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1494)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:300)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1494)
> {code}
> A possible workaround is to create a {{VoidSerializer}} and pass it via 
> {{withKeySerializer()}} or provide the producer fn.
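
A minimal sketch of the {{VoidSerializer}} workaround mentioned above (written from scratch here, since the issue suggests creating it; the exact class name and placement are assumptions):
{code}
import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;

/** No-op key serializer for writes that only carry values. */
public class VoidSerializer implements Serializer<Void> {
  @Override public void configure(Map<String, ?> configs, boolean isKey) {}
  @Override public byte[] serialize(String topic, Void data) { return null; }
  @Override public void close() {}
}
{code}
With such a class on the classpath, the write in the snippet above can pass {{withKeySerializer(VoidSerializer.class)}} alongside {{withValueSerializer(StringSerializer.class)}}.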



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2257) KafkaIO write without key requires a producer fn

2017-11-08 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-2257:
---
Fix Version/s: 2.3.0

> KafkaIO write without key requires a producer fn
> 
>
> Key: BEAM-2257
> URL: https://issues.apache.org/jira/browse/BEAM-2257
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: 2.3.0
>
>
> The {{KafkaIO}} javadoc says that it's possible to write {{String}} values 
> directly to the topic without a key:
> {code}
>PCollection strings = ...;
>strings.apply(KafkaIO.write()
>.withBootstrapServers("broker_1:9092,broker_2:9092")
>.withTopic("results")
>.withValueSerializer(new StringSerializer()) // just need serializer 
> for value
>.values()
>  );
> {code}
> This is not fully correct:
> 1. {{withValueSerializer()}} requires a serializer class, not an instance. 
> So, it should be {{withValueSerializer(StringSerializer.class)}}.
> 2. As the key serializer is not provided, a Kafka producer fn is required; 
> otherwise, the user will get:
> {code}
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> producer
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:321)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1494)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:300)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1494)
> {code}
> A possible workaround is to create a {{VoidSerializer}} and pass it via 
> {{withKeySerializer()}} or provide the producer fn.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2203) Arithmetic operators: support DATETIME & DATETIME_INTERVAL

2017-11-08 Thread Anton Kedin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244320#comment-16244320
 ] 

Anton Kedin commented on BEAM-2203:
---

To summarize, the operations implemented so far are:
* TIMESTAMPADD(timeUnit, number, interval)
* timestamp + interval
* TIMESTAMPDIFF(timeUnit, timestampStart, timestampEnd)
* timestamp - interval

> Arithmetic operators: support DATETIME & DATETIME_INTERVAL
> --
>
> Key: BEAM-2203
> URL: https://issues.apache.org/jira/browse/BEAM-2203
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: Anton Kedin
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2774) Add I/O source for VCF files (python)

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244290#comment-16244290
 ] 

ASF GitHub Bot commented on BEAM-2774:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3979


> Add I/O source for VCF files (python)
> -
>
> Key: BEAM-2774
> URL: https://issues.apache.org/jira/browse/BEAM-2774
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Asha Rostamianfar
>Assignee: Miles Saul
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> A new I/O source for reading (and eventually writing) VCF files [1] for 
> Python. The design doc is available at 
> https://docs.google.com/document/d/1jsdxOPALYYlhnww2NLURS8NKXaFyRSJrcGbEDpY9Lkw/edit
> [1] http://samtools.github.io/hts-specs/VCFv4.3.pdf



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2257) KafkaIO write without key requires a producer fn

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244303#comment-16244303
 ] 

ASF GitHub Bot commented on BEAM-2257:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4034


> KafkaIO write without key requires a producer fn
> 
>
> Key: BEAM-2257
> URL: https://issues.apache.org/jira/browse/BEAM-2257
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> The {{KafkaIO}} javadoc says that it's possible to write {{String}} values 
> directly to the topic without a key:
> {code}
>PCollection strings = ...;
>strings.apply(KafkaIO.write()
>.withBootstrapServers("broker_1:9092,broker_2:9092")
>.withTopic("results")
>.withValueSerializer(new StringSerializer()) // just need serializer 
> for value
>.values()
>  );
> {code}
> This is not fully correct:
> 1. {{withValueSerializer()}} requires a serializer class, not an instance. 
> So, it should be {{withValueSerializer(StringSerializer.class)}}.
> 2. As the key serializer is not provided, a Kafka producer fn is required; 
> otherwise, the user will get:
> {code}
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> producer
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:321)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1494)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:300)
>   at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1494)
> {code}
> A possible workaround is to create a {{VoidSerializer}} and pass it via 
> {{withKeySerializer()}} or provide the producer fn.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4034: [BEAM-2257] Ensure Kafka sink serializers are set.

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4034


---


[2/2] beam git commit: This closes #4034

2017-11-08 Thread chamikara
This closes #4034


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ae45bbd6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ae45bbd6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ae45bbd6

Branch: refs/heads/master
Commit: ae45bbd635c898b879efc3251f834428f01dfb57
Parents: 0af9720 b413a96
Author: chamik...@google.com 
Authored: Wed Nov 8 08:52:53 2017 -0800
Committer: chamik...@google.com 
Committed: Wed Nov 8 08:52:53 2017 -0800

--
 .../src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java  | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--




[1/2] beam git commit: Ensure Kafka sink serializers are set.

2017-11-08 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 0af972095 -> ae45bbd63


Ensure Kafka sink serializers are set.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b413a966
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b413a966
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b413a966

Branch: refs/heads/master
Commit: b413a9665f99599bfd929f850fa67d227ea190d5
Parents: 0af9720
Author: Raghu Angadi 
Authored: Fri Oct 20 15:29:20 2017 -0700
Committer: chamik...@google.com 
Committed: Wed Nov 8 08:52:42 2017 -0800

--
 .../src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java  | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b413a966/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
--
diff --git 
a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java 
b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
index f6158ca..33fc289 100644
--- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
+++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
@@ -928,10 +928,8 @@ public class KafkaIO {
 // Backlog support :
 // Kafka consumer does not have an API to fetch latest offset for topic. 
We need to seekToEnd()
 // then look at position(). Use another consumer to do this so that the 
primary consumer does
-// not need to be interrupted. The latest offsets are fetched periodically 
on another thread.
-// This is still a hack. There could be unintended side effects, e.g. if 
user enabled offset
-// auto commit in consumer config, this could interfere with the primary 
consumer (we will
-// handle this particular problem). We might have to make this optional.
+// not need to be interrupted. The latest offsets are fetched periodically 
on a thread. This is
+// still a bit of a hack, but so far there haven't been any issues 
reported by the users.
 private Consumer offsetConsumer;
 private final ScheduledExecutorService offsetFetcherThread =
 Executors.newSingleThreadScheduledExecutor();
@@ -1614,6 +1612,8 @@ public class KafkaIO {
 getProducerConfig().get(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG) != 
null,
 "withBootstrapServers() is required");
   checkArgument(getTopic() != null, "withTopic() is required");
+  checkArgument(getKeySerializer() != null, "withKeySerializer() is 
required");
+  checkArgument(getValueSerializer() != null, "withValueSerializer() is 
required");
 
   if (isEOS()) {
 EOSWrite.ensureEOSSupport();



[GitHub] beam pull request #3979: [BEAM-2774] Add I/O source to read VCF files

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3979


---


[1/4] beam git commit: Added vcf file io source and modified _TextSource to optionally handle headers

2017-11-08 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 9c5454287 -> 0af972095


http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf
--
diff --git a/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf 
b/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf
new file mode 100644
index 000..c42d71c
--- /dev/null
+++ b/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf
@@ -0,0 +1,42 @@
+##fileformat=VCFv4.2
+##fileDate=20090805
+##source=myImputationProgramV3.1
+##phasing=partial
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##FILTER=
+##FILTER=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##reference=file:/lustre/scratch105/projects/g1k/ref/main_project/human_g1k_v37.fasta
+##contig=
+##contig=
+##contig=
+##SAMPLE=
+##SAMPLE=
+##PEDIGREE=
+##PEDIGREE=
+##pedigreeDB=url
+#CHROM POS ID  REF ALT QUALFILTER  INFOFORMAT  NA1 
NA2 NA3
+19 14370   rs6054257   G   A   29  PASS
NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ 0|0:48:1:51,51  1|0:48:8:51,51  
1/1:43:5:.,.
+20 17330   .   T   A   3   q10 NS=3;DP=11;AF=0.017 
GT:GQ:DP:HQ 0|0:49:3:58,50  0|1:3:5:65,30/0:41:3
+20 1110696 rs6040355   A   G,T 67  PASS
NS=2;DP=10;AF=0.333,0.667;AA=T;DB   GT:GQ:DP:HQ 1|2:21:6:23,27  
2|1:2:0:18,22/2:35:4
+20 1230237 .   T   .   47  PASSNS=3;DP=13;AA=T 
GT:GQ:DP:HQ 0|0:54:7:56,60  0|0:48:4:51,51  0/0:61:2
+20 1234567 microsat1   GTC G,GTCTC 50  PASSNS=3;DP=9;AA=G  
GT:GQ:DP0/1:35:40/2:17:21/1:40:3
+20 2234567 .   C   [13:123457[ACGC 50  PASS
SVTYPE=BÑD;NS=3;DP=9;AA=G  GT:GQ:DP0/1:35:40/1:17:2
1/1:40:3
+20 2234568 .   C   .TC 50  PASS
SVTYPE=BND;NS=3;DP=9;AA=G   GT:GQ:DP0/1:35:40/1:17:2
1/1:40:3
+20 2234569 .   C   CT. 50  PASS
SVTYPE=BND;NS=3;DP=9;AA=G   GT:GQ:DP0/1:35:40/1:17:2
1/1:40:3
+20 3234569 .   C 50  PASS
END=3235677;NS=3;DP=9;AA=G  GT:GQ:DP0/1:35:40/1:17:2
1/1:40:3
+20 4234569 .   N   .[13:123457[50  PASS
SVTYPE=BND;NS=3;DP=9;AA=G   GT:GQ:DP0/1:35:40/1:17:2
./.:40:3
+20 5234569 .   N   [13:123457[.50  PASS
SVTYPE=BND;NS=3;DP=9;AA=G   GT:GQ:DP0/1:35:40/1:17:2
1/1:40:3
+Y  17330   .   T   A   3   q10 NS=3;DP=11;AF=0.017 
GT:GL   0:0,49  0:0,3   1:41,0
+HLA-A*01:01:01:01  1   .   N   T   50  PASS
END=1;NS=3;DP=9;AA=GGT:GQ:DP:HQ 0|0:48:1:51,51  1|0:48:8:51,51  
1/1:43:5:.,.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf.gz
--
diff --git a/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf.gz 
b/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf.gz
new file mode 100644
index 000..4208e3e
Binary files /dev/null and 
b/sdks/python/apache_beam/testing/data/vcf/valid-4.2.vcf.gz differ

http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/sdks/python/apache_beam/testing/test_utils.py
--
diff --git a/sdks/python/apache_beam/testing/test_utils.py 
b/sdks/python/apache_beam/testing/test_utils.py
index 41a02cf..c28b692 100644
--- a/sdks/python/apache_beam/testing/test_utils.py
+++ b/sdks/python/apache_beam/testing/test_utils.py
@@ -22,6 +22,9 @@ For internal use only; no backwards-compatibility guarantees.
 
 import hashlib
 import imp
+import os
+import shutil
+import tempfile
 
 from mock import Mock
 from mock import patch
@@ -32,6 +35,43 @@ from apache_beam.utils import retry
 DEFAULT_HASHING_ALG = 'sha1'
 
 
+class TempDir(object):
+  """Context Manager to create and clean-up a temporary directory."""
+
+  def __init__(self):
+self._tempdir = tempfile.mkdtemp()
+
+  def __enter__(self):
+return self
+
+  def __exit__(self, *args):
+if os.path.exists(self._tempdir):
+  shutil.rmtree(self._tempdir)
+
+  def get_path(self):
+"""Returns the path to the temporary directory."""
+return self._tempdir
+
+  def create_temp_file(self, suffix='', lines=None):
+"""Creates a temporary file in the temporary directory.
+
+Args:
+  suffix (str): The filename suffix of the temporary file (e.g. '.txt')
+  lines (List[str]): A list of lines that will be written to the temporary
+file.
+Returns:
+  The name of the temporary file created.
+"""
+f = tempfile.NamedTemporaryFile(delete=False,
+ 

[3/4] beam git commit: Added vcf file io source and modified _TextSource to optionally handle headers

2017-11-08 Thread chamikara
Added vcf file io source and modified _TextSource to optionally handle headers


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f22da33c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f22da33c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f22da33c

Branch: refs/heads/master
Commit: f22da33cc46b48c046740cbcfda78ea59adbd924
Parents: 9c54542
Author: Miles Saul 
Authored: Wed Oct 11 15:00:03 2017 -0400
Committer: chamik...@google.com 
Committed: Wed Nov 8 08:46:18 2017 -0800

--
 pom.xml | 3 +
 sdks/python/apache_beam/io/source_test_utils.py | 3 +-
 sdks/python/apache_beam/io/textio.py|67 +-
 sdks/python/apache_beam/io/textio_test.py   |   503 +-
 sdks/python/apache_beam/io/vcfio.py |   436 +
 sdks/python/apache_beam/io/vcfio_test.py|   519 +
 .../apache_beam/testing/data/vcf/valid-4.0.vcf  |23 +
 .../testing/data/vcf/valid-4.0.vcf.bz2  |   Bin 0 -> 781 bytes
 .../testing/data/vcf/valid-4.0.vcf.gz   |   Bin 0 -> 727 bytes
 .../testing/data/vcf/valid-4.1-large.vcf| 1 +
 .../testing/data/vcf/valid-4.1-large.vcf.gz |   Bin 0 -> 156715 bytes
 .../apache_beam/testing/data/vcf/valid-4.2.vcf  |42 +
 .../testing/data/vcf/valid-4.2.vcf.gz   |   Bin 0 -> 1240 bytes
 sdks/python/apache_beam/testing/test_utils.py   |40 +
 .../apache_beam/testing/test_utils_test.py  |24 +
 sdks/python/generate_pydoc.sh   | 6 +
 sdks/python/setup.py| 3 +-
 17 files changed, 11419 insertions(+), 250 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 0b7b323..7efb23d 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1695,6 +1695,9 @@
   
   **/apache_beam/portability/api/*_pb2*.py
   **/go/pkg/beam/model/**/*.pb.go
+
+  
+  **/apache_beam/testing/data/vcf/*
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/sdks/python/apache_beam/io/source_test_utils.py
--
diff --git a/sdks/python/apache_beam/io/source_test_utils.py 
b/sdks/python/apache_beam/io/source_test_utils.py
index 712049b..e4d2f6f 100644
--- a/sdks/python/apache_beam/io/source_test_utils.py
+++ b/sdks/python/apache_beam/io/source_test_utils.py
@@ -52,7 +52,8 @@ from multiprocessing.pool import ThreadPool
 
 from apache_beam.io import iobase
 
-__all__ = ['read_from_source', 'assert_sources_equal_reference_source',
+__all__ = ['read_from_source',
+   'assert_sources_equal_reference_source',
'assert_reentrant_reads_succeed',
'assert_split_at_fraction_behavior',
'assert_split_at_fraction_binary',

http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/sdks/python/apache_beam/io/textio.py
--
diff --git a/sdks/python/apache_beam/io/textio.py 
b/sdks/python/apache_beam/io/textio.py
index c25181d..4a4bd3a 100644
--- a/sdks/python/apache_beam/io/textio.py
+++ b/sdks/python/apache_beam/io/textio.py
@@ -78,6 +78,10 @@ class _TextSource(filebasedsource.FileBasedSource):
  'size of data %d.', value, len(self._data))
   self._position = value
 
+def reset(self):
+  self.data = ''
+  self.position = 0
+
   def __init__(self,
file_pattern,
min_bundle_size,
@@ -86,7 +90,26 @@ class _TextSource(filebasedsource.FileBasedSource):
coder,
buffer_size=DEFAULT_READ_BUFFER_SIZE,
validate=True,
-   skip_header_lines=0):
+   skip_header_lines=0,
+   header_processor_fns=(None, None)):
+"""Initialize a _TextSource
+
+Args:
+  header_processor_fns (tuple): a tuple of a `header_matcher` function
+and a `header_processor` function. The `header_matcher` should
+return `True` for all lines at the start of the file that are part
+of the file header and `False` otherwise. These header lines will
+not be yielded when reading records and instead passed into
+`header_processor` to be handled. If `skip_header_lines` and a
+`header_matcher` are both provided, the value of `skip_header_lines`
+lines will be skipped and the header will be processed from
+there.
+Raises:
+  ValueError: if skip_lines is negative.
+
+Please refer to documentation in class `ReadFromText` for the rest
+of the arguments.
+"""
 super(_TextSource, self).__in

[2/4] beam git commit: Added vcf file io source and modified _TextSource to optionally handle headers

2017-11-08 Thread chamikara
http://git-wip-us.apache.org/repos/asf/beam/blob/f22da33c/sdks/python/apache_beam/testing/data/vcf/valid-4.1-large.vcf
--
diff --git a/sdks/python/apache_beam/testing/data/vcf/valid-4.1-large.vcf 
b/sdks/python/apache_beam/testing/data/vcf/valid-4.1-large.vcf
new file mode 100644
index 000..c470685
--- /dev/null
+++ b/sdks/python/apache_beam/testing/data/vcf/valid-4.1-large.vcf
@@ -0,0 +1,1 @@
+##fileformat=VCFv4.1
+##fileDate=20121204
+##center=Complete Genomics
+##source=CGAPipeline_2.2.0.26
+##source_GENOME_REFERENCE=NCBI build 37
+##source_MAX_PLOIDY=10
+##source_NUMBER_LEVELS=GS01868-DNA_H02:7
+##source_NONDIPLOID_WINDOW_WIDTH=10
+##source_MEAN_GC_CORRECTED_CVG=GS01868-DNA_H02:41.51
+##source_GENE_ANNOTATIONS=NCBI build 37.2
+##source_DBSNP_BUILD=dbSNP build 135
+##source_MEI_1000G_ANNOTATIONS=INITIAL-DATA-RELEASE
+##source_COSMIC=COSMIC v59
+##source_DGV_VERSION=9
+##source_MIRBASE_VERSION=mirBase version 18
+##source_PFAM_DATE=April 21, 2011
+##source_REPMASK_GENERATED_AT=2011-Feb-15 10:08
+##source_SEGDUP_GENERATED_AT=2010-Dec-01 13:40
+##phasing=partial
+##reference=ftp://ftp.completegenomics.com/ReferenceFiles/build37.fa.bz2
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##contig=
+##ALT=
+##ALT=
+##ALT=
+##ALT=
+##ALT=
+##ALT=
+##ALT=
+##ALT=
+##ALT=
+##FILTER=
+##FILTER=
+##FILTER=
+##FILTER=
+##FILTER=
+##FILTER=
+##FILTER=
+##FILTER=
+##FILTER=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##INFO=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+##FORMAT=
+#CHROM POS ID  REF ALT QUALFILTER  INFOFORMAT  
GS16676-ASM
+1  1   .   N   .   .   
END=1;NS=1;AN=0 GT:PS   ./.:.
+1  10001   .   T   .   .   
NS=1;CGA_WINEND=12000   
GT:CGA_GP:CGA_NP:CGA_CP:CGA_PS:CGA_CT:CGA_TS:CGA_CL:CGA_LS  
.:0.67:1.44:.:0:.:0:0.999:152
+1  10001   .   T   .   .   
END=11038;NS=1;AN=0 GT:PS   ./.:.
+1  11048   .   CGCACGGCGCCGGGCTCGGAGGGTGGCGCC  .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  11270   .   AGAGTGG .   .   .   NS=1;AN=0   GT:PS   
./.:.
+1  11302   .   GGGCACTGCAGGGCCCTCTTGCTTACTGTATAGTGGTGGCACGCCG  .   
.   .   NS=1;AN=0   GT:PS   ./.:.
+1  11388   .   AGGTGTAGTGGCAGCACGCCCACCTGCTGGCAGCT .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  11475   .   ACACCCGGAGCATATGCTGTTTGGTCTCAGTAGACTC   .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  11528   .   TGGGTTTGTAATAAATATGTTTAATTTGTGAAC   .   
.   .   NS=1;AN=0   GT:PS   ./.:.
+1  11650   .   TGGATGCCAGTCTAACAGGTGAAG.   .   .   
NS=1;AN=0   GT:PS   ./.:.
+1  11707   .   TCCTGGCCATGTGTATTTAAATTTC   .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  11769   .   TGAGAATGACTGCGCAAATTTGCCGGATTTCCTTTGCT  .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  11841   .   CCGGGTATCATTCACCATCCGTT .   .   .   
NS=1;AN=0   GT:PS   ./.:.
+1  11891   .   CTTTGACCTCTTCTTTCTGTTCATGTGTATTTGC  .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  11958   .   ACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATC .   
.   .   NS=1;AN=0   GT:PS   ./.:.
+1  12001   .   C   .   .   
NS=1;CGA_WINEND=14000   
GT:CGA_GP:CGA_NP:CGA_CP:CGA_PS:CGA_CT:CGA_TS:CGA_CL:CGA_LS  
.:2.26:1.68:.:0:.:0:0.999:152
+1  12027   .   ACTGCTGGCCTGTGCCAGGGTGCAAG  .   .   .   
NS=1;AN=0   GT:PS   ./.:.
+1  12099   .   AGTGGGATGGGCCATTGTTCATCTTC  .   .   .   
NS=1;AN=0   GT:PS   ./.:.
+1  12135   .   TGTCTGCATGTAACTTAATACCACAACCAGGCATAGGG  .   .   
.   NS=1;AN=0   GT:PS   ./.:.
+1  12187   .   AAGATGAGTGAGAGCATC  .   .   .   
NS=1;AN=0   GT:PS   ./.:.
+1  12238   .   CTTGTGC .   .   .   NS=1;AN=0   GT:PS   
./.:.
+1  12264   .   ACGTGGCCGGCCCTCGCTCCAGCAGCTGGATACCTGCCGTCTGCTGCC
.   .   .   NS=1;AN=0   GT:PS   ./.:.
+1  12329   .   GCCGGGCTGTGA.   .   .   NS=1;AN=0   

[4/4] beam git commit: This closes #3979

2017-11-08 Thread chamikara
This closes #3979


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0af97209
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0af97209
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0af97209

Branch: refs/heads/master
Commit: 0af972095a9a233831dd98cb8b4057f8b2f25dfe
Parents: 9c54542 f22da33
Author: chamik...@google.com 
Authored: Wed Nov 8 08:47:44 2017 -0800
Committer: chamik...@google.com 
Committed: Wed Nov 8 08:47:44 2017 -0800

--
 pom.xml | 3 +
 sdks/python/apache_beam/io/source_test_utils.py | 3 +-
 sdks/python/apache_beam/io/textio.py|67 +-
 sdks/python/apache_beam/io/textio_test.py   |   503 +-
 sdks/python/apache_beam/io/vcfio.py |   436 +
 sdks/python/apache_beam/io/vcfio_test.py|   519 +
 .../apache_beam/testing/data/vcf/valid-4.0.vcf  |23 +
 .../testing/data/vcf/valid-4.0.vcf.bz2  |   Bin 0 -> 781 bytes
 .../testing/data/vcf/valid-4.0.vcf.gz   |   Bin 0 -> 727 bytes
 .../testing/data/vcf/valid-4.1-large.vcf| 1 +
 .../testing/data/vcf/valid-4.1-large.vcf.gz |   Bin 0 -> 156715 bytes
 .../apache_beam/testing/data/vcf/valid-4.2.vcf  |42 +
 .../testing/data/vcf/valid-4.2.vcf.gz   |   Bin 0 -> 1240 bytes
 sdks/python/apache_beam/testing/test_utils.py   |40 +
 .../apache_beam/testing/test_utils_test.py  |24 +
 sdks/python/generate_pydoc.sh   | 6 +
 sdks/python/setup.py| 3 +-
 17 files changed, 11419 insertions(+), 250 deletions(-)
--




Build failed in Jenkins: beam_PostCommit_Python_Verify #3504

2017-11-08 Thread Apache Jenkins Server
See 


--
[...truncated 931.67 KB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runn

[jira] [Updated] (BEAM-3158) DoFnTester should call close() in catch bloc and then re-throw exception to allow using @Rule ExpectedException

2017-11-08 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-3158:
---
Description: 
In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
himself. If this DoFnTester were doing 
{code}
try{
...
} catch(Exception e){
close();
throw e;
}
{code}
then the user would no longer need to call {{DoFnTester.close()}} (to release 
resources, stop threads ...) and thus no longer need the try/catch. This will 
allow him to use the Junit {{@Rule ExpectedException}} in place of something 
ugly like 
{code}
// need to avoid flow interruption to close the DoFnTester
Exception raisedException = null;
try {
  fnTester.processBundle(input);
} catch (IOException exception) {
  raisedException = exception;
}
fnTester.close();
assertTrue(raisedException != null);
{code}
But maybe {{DoFnTester}} cannot always be safely closed if an exception was 
raised.

  was:
In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
himself. If this DoFnTester were doing 
{code}
try{
...
} catch(Exception e){
close();
throw e;
}
{code}
then the user would no longer need to call {{DoFnTester.close()}} (to release 
resources, stop threads ...) and thus no longer need the try/catch. This will 
allow him to use the Junit {{@Rule ExpectedException}} in place of something 
ugly like 
{code}
// need to avoid flow interruption to close the DoFnTester
Exception raisedException = null;
try {
  fnTester.processBundle(input);
} catch (IOException exception) {
  raisedException = exception;
}
fnTester.close();
assertTrue(raisedException != null);
{code}


> DoFnTester should call close() in catch bloc and then re-throw exception to 
> allow using @Rule ExpectedException
> ---
>
> Key: BEAM-3158
> URL: https://issues.apache.org/jira/browse/BEAM-3158
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
> explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
> DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
> himself. If this DoFnTester were doing 
> {code}
> try{
> ...
> } catch(Exception e){
> close();
> throw e;
> }
> {code}
> then the user would no longer need to call {{DoFnTester.close()}} (to release 
> resources, stop threads ...) and thus no longer need the try/catch. This will 
> allow him to use the Junit {{@Rule ExpectedException}} in place of something 
> ugly like 
> {code}
> // need to avoid flow interruption to close the DoFnTester
> Exception raisedException = null;
> try {
>   fnTester.processBundle(input);
> } catch (IOException exception) {
>   raisedException = exception;
> }
> fnTester.close();
> assertTrue(raisedException != null);
> {code}
> But maybe {{DoFnTester}} cannot always be safely closed if an exception was 
> raised.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
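
For illustration, a test written against the proposed behaviour could then rely on
JUnit's ExpectedException rule directly. A rough sketch (this assumes the
not-yet-implemented close-before-rethrow behaviour; FailingFn and the input value
are made-up placeholders):

{code}
// Sketch only: assumes DoFnTester closes itself before re-throwing.
import java.io.IOException;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.DoFnTester;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

public class DoFnTesterExpectedExceptionSketch {

  // Made-up DoFn that always fails in processElement.
  static class FailingFn extends DoFn<String, String> {
    @ProcessElement
    public void processElement(ProcessContext c) throws IOException {
      throw new IOException("simulated failure");
    }
  }

  @Rule public ExpectedException thrown = ExpectedException.none();

  @Test
  public void testProcessBundleFailure() throws Exception {
    DoFnTester<String, String> fnTester = DoFnTester.of(new FailingFn());
    fnTester.setCloningBehavior(DoFnTester.CloningBehavior.CLONE_ONCE);

    thrown.expect(IOException.class);
    // If DoFnTester called close() before re-throwing, no explicit
    // try/catch + close() would be needed around this call.
    fnTester.processBundle("some input");
  }
}
{code}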


[jira] [Updated] (BEAM-3158) DoFnTester should call close() in catch bloc and then re-throw exception

2017-11-08 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-3158:
---
Description: 
In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
himself. If this DoFnTester were doing 
{code}
try{
...
} catch(Exception e){
close();
throw e;
}
{code}
then the user would no longer need to call {{DoFnTester.close()}} (to release 
resources, stop threads ...) and thus no longer need the try/catch. This will 
allow him to use the Junit {{@Rule ExpectedException}} in place of something 
ugly like 
{code}
// need to avoid flow interruption to close the DoFnTester
Exception raisedException = null;
try {
  fnTester.processBundle(input);
} catch (IOException exception) {
  raisedException = exception;
}
fnTester.close();
assertTrue(raisedException != null);
{code}

  was:
In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
himself. If this DoFnTester were doing 
{code}
try{
...
} catch(Exception e){
close();
throw e;
}
{code}
then the user would no longer need to call {{DoFnTester.close()}} (to release 
resources, stop threads ...) and thus no longer need the try/catch. This will 
allow him to use the Junit @Rule ExpectedException in place of something ugly 
like 
{code}
// need to avoid flow interruption to close the DoFnTester
Exception raisedException = null;
try {
  fnTester.processBundle(input);
} catch (IOException exception) {
  raisedException = exception;
}
fnTester.close();
assertTrue(raisedException != null);
{code}


> DoFnTester should call close() in catch bloc and then re-throw exception
> 
>
> Key: BEAM-3158
> URL: https://issues.apache.org/jira/browse/BEAM-3158
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
> explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
> DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
> himself. If this DoFnTester were doing 
> {code}
> try{
> ...
> } catch(Exception e){
> close();
> throw e;
> }
> {code}
> then the user would no longer need to call {{DoFnTester.close()}} (to release 
> resources, stop threads ...) and thus no longer need the try/catch. This will 
> allow him to use the Junit {{@Rule ExpectedException}} in place of something 
> ugly like 
> {code}
> // need to avoid flow interruption to close the DoFnTester
> Exception raisedException = null;
> try {
>   fnTester.processBundle(input);
> } catch (IOException exception) {
>   raisedException = exception;
> }
> fnTester.close();
> assertTrue(raisedException != null);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Apex #2756

2017-11-08 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-3158) DoFnTester should call close() in catch bloc and then re-throw exception to allow using @Rule ExpectedException

2017-11-08 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-3158:
---
Summary: DoFnTester should call close() in catch bloc and then re-throw 
exception to allow using @Rule ExpectedException  (was: DoFnTester should call 
close() in catch bloc and then re-throw exception)

> DoFnTester should call close() in catch bloc and then re-throw exception to 
> allow using @Rule ExpectedException
> ---
>
> Key: BEAM-3158
> URL: https://issues.apache.org/jira/browse/BEAM-3158
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
> explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
> DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
> himself. If this DoFnTester were doing 
> {code}
> try{
> ...
> } catch(Exception e){
> close();
> throw e;
> }
> {code}
> then the user would no longer need to call {{DoFnTester.close()}} (to release 
> resources, stop threads ...) and thus no longer need the try/catch. This will 
> allow him to use the Junit {{@Rule ExpectedException}} in place of something 
> ugly like 
> {code}
> // need to avoid flow interruption to close the DoFnTester
> Exception raisedException = null;
> try {
>   fnTester.processBundle(input);
> } catch (IOException exception) {
>   raisedException = exception;
> }
> fnTester.close();
> assertTrue(raisedException != null);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3158) DoFnTester should call close() in catch bloc and then re-throw exception

2017-11-08 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244173#comment-16244173
 ] 

Etienne Chauchot commented on BEAM-3158:


@jkff WDYT?

> DoFnTester should call close() in catch bloc and then re-throw exception
> 
>
> Key: BEAM-3158
> URL: https://issues.apache.org/jira/browse/BEAM-3158
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
> explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
> DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
> himself. If this DoFnTester were doing 
> {code}
> try{
> ...
> } catch(Exception e){
> close();
> throw e;
> }
> {code}
> then the user would no longer need to call {{DoFnTester.close()}} (to release 
> resources, stop threads ...) and thus no longer need the try/catch. This will 
> allow him to use the Junit @Rule ExpectedException in place of something ugly 
> like 
> {code}
> // need to avoid flow interruption to close the DoFnTester
> Exception raisedException = null;
> try {
>   fnTester.processBundle(input);
> } catch (IOException exception) {
>   raisedException = exception;
> }
> fnTester.close();
> assertTrue(raisedException != null);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (BEAM-3158) DoFnTester should call close() in catch bloc and then re-throw exception

2017-11-08 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244173#comment-16244173
 ] 

Etienne Chauchot edited comment on BEAM-3158 at 11/8/17 3:39 PM:
-

[~jkff] WDYT?


was (Author: echauchot):
@jkff WDYT?

> DoFnTester should call close() in catch bloc and then re-throw exception
> 
>
> Key: BEAM-3158
> URL: https://issues.apache.org/jira/browse/BEAM-3158
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
> explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
> DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
> himself. If this DoFnTester were doing 
> {code}
> try{
> ...
> } catch(Exception e){
> close();
> throw e;
> }
> {code}
> then the user would no longer need to call {{DoFnTester.close()}} (to release 
> resources, stop threads ...) and thus no longer need the try/catch. This will 
> allow him to use the Junit @Rule ExpectedException in place of something ugly 
> like 
> {code}
> // need to avoid flow interruption to close the DoFnTester
> Exception raisedException = null;
> try {
>   fnTester.processBundle(input);
> } catch (IOException exception) {
>   raisedException = exception;
> }
> fnTester.close();
> assertTrue(raisedException != null);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3158) DoFnTester should call close() in catch bloc and then re-throw exception

2017-11-08 Thread Etienne Chauchot (JIRA)
Etienne Chauchot created BEAM-3158:
--

 Summary: DoFnTester should call close() in catch bloc and then 
re-throw exception
 Key: BEAM-3158
 URL: https://issues.apache.org/jira/browse/BEAM-3158
 Project: Beam
  Issue Type: Improvement
  Components: testing
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot
Priority: Minor


In {{CLONE_ONCE}} and {{DO_NOT_CLONE}} cloning behaviors, it is required to 
explicitly call {{DoFnTester.close()}}. If an exception is raised by the 
DoFnTester, the user needs to use a try/catch bloc to call DoFnTester.close() 
himself. If this DoFnTester were doing 
{code}
try{
...
} catch(Exception e){
close();
throw e;
}
{code}
then the user would no longer need to call {{DoFnTester.close()}} (to release 
resources, stop threads ...) and thus no longer need the try/catch. This will 
allow him to use the Junit @Rule ExpectedException in place of something ugly 
like 
{code}
// need to avoid flow interruption to close the DoFnTester
Exception raisedException = null;
try {
  fnTester.processBundle(input);
} catch (IOException exception) {
  raisedException = exception;
}
fnTester.close();
assertTrue(raisedException != null);
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Python_Verify #3503

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[echauchot] [BEAM-2853] remove Nexmark README.md now that the doc is in the 
website

--
[...truncated 934.62 KB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/worker
copying apache_beam/runn

[jira] [Commented] (BEAM-3118) Fix thread leaks in ElasticsearchIO 5 write tests

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244106#comment-16244106
 ] 

ASF GitHub Bot commented on BEAM-3118:
--

GitHub user echauchot opened a pull request:

https://github.com/apache/beam/pull/4098

[BEAM-3118] Fix thread leaks in ElasticsearchIOTest.testWriteWith* when 
using DoFnTester

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
DoFnTester is configured with the CLONE_ONCE cloning behavior, so we need to 
explicitly call DoFnTester.close() so that doFn.tearDown() gets called and the ES 
RestClient is properly closed, which lets the leaked HTTP threads terminate 
cleanly.
R: @jkff 
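
The pattern the fix moves the write tests towards is roughly the following
(illustrative sketch only: writeFn and inputDocs stand in for the real WriteFn
instance and input documents used by the tests):

{code}
// Sketch of the "always close the DoFnTester" pattern; writeFn and inputDocs
// are placeholders, not the actual test objects.
DoFnTester<String, Void> fnTester = DoFnTester.of(writeFn);
try {
  fnTester.processBundle(inputDocs);   // may throw on invalid documents
} finally {
  // Always close the tester: with CLONE_ONCE this invokes the DoFn's @Teardown,
  // which closes the Elasticsearch RestClient and lets its I/O dispatcher
  // threads shut down instead of leaking.
  fnTester.close();
}
{code}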

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/echauchot/beam 
BEAM-3118-write-tests-leak-threads

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4098.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4098


commit fdf66f9007571c6807b345e506e8529f2e2dcec7
Author: Etienne Chauchot 
Date:   2017-11-06T16:17:32Z

[BEAM-3118] Fix thread leaks in ElasticsearchIOTest.testWriteWith* when 
using DoFnTester.

RestClient was not properly closed because of WriteFn cloning.

commit c4472a5ad87fa25b5b435060c821d8cfdfd52850
Author: Etienne Chauchot 
Date:   2017-11-08T14:32:16Z

[BEAM-3118] Replace expectedException in ESIOTest.testWriteWithErrors to 
avoid interrupting the execution flow and properly close the DoFnTester




> Fix thread leaks in ElasticsearchIO 5 write tests
> -
>
> Key: BEAM-3118
> URL: https://issues.apache.org/jira/browse/BEAM-3118
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Http threads are leaking in write tests that use {{DoFnTester}}.
> {code}
> WARNING: Will linger awaiting termination of 11 leaked thread(s).
> oct. 27, 2017 4:30:07 PM com.carrotsearch.randomizedtesting.ThreadLeakControl 
> checkThreadLeaks
> SEVERE: 9 threads leaked from SUITE scope at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest: 
>1) Thread[id=92, name=I/O dispatcher 22, state=RUNNABLE, 
> group=TGRP-ElasticsearchIOTest]
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
> at 
> org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
> at 
> org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
> at 
> org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
> at java.lang.Thread.run(Thread.java:745)
>2) Thread[id=86, name=pool-4-thread-1, state=RUNNABLE, 
> group=TGRP-ElasticsearchIOTest]
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
> at 
> org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:340)
> at 
> org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.

[GitHub] beam pull request #4098: [BEAM-3118] Fix thread leaks in ElasticsearchIOTest...

2017-11-08 Thread echauchot
GitHub user echauchot opened a pull request:

https://github.com/apache/beam/pull/4098

[BEAM-3118] Fix thread leaks in ElasticsearchIOTest.testWriteWith* when 
using DoFnTester

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
DoFnTester is configured with the CLONE_ONCE cloning behavior, so we need to 
explicitly call DoFnTester.close() so that doFn.tearDown() gets called and the ES 
RestClient is properly closed, which lets the leaked HTTP threads terminate 
cleanly.
R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/echauchot/beam 
BEAM-3118-write-tests-leak-threads

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4098.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4098


commit fdf66f9007571c6807b345e506e8529f2e2dcec7
Author: Etienne Chauchot 
Date:   2017-11-06T16:17:32Z

[BEAM-3118] Fix thread leaks in ElasticsearchIOTest.testWriteWith* when 
using DoFnTester.

RestClient was not properly closed because of WriteFn cloning.

commit c4472a5ad87fa25b5b435060c821d8cfdfd52850
Author: Etienne Chauchot 
Date:   2017-11-08T14:32:16Z

[BEAM-3118] Replace expectedException in ESIOTest.testWriteWithErrors to 
avoid interrupting the execution flow and properly close the DoFnTester




---


[jira] [Commented] (BEAM-2853) Move the nexmark documentation to the website

2017-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244025#comment-16244025
 ] 

ASF GitHub Bot commented on BEAM-2853:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4097


> Move the nexmark documentation to the website
> -
>
> Key: BEAM-2853
> URL: https://issues.apache.org/jira/browse/BEAM-2853
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Ismaël Mejía
>Assignee: Etienne Chauchot
>Priority: Minor
>  Labels: nexmark
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4097: [BEAM-2853] remove Nexmark README.md now that the d...

2017-11-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4097


---


[1/2] beam git commit: [BEAM-2853] remove Nexmark README.md now that the doc is in the website

2017-11-08 Thread iemejia
Repository: beam
Updated Branches:
  refs/heads/master e0166ceb6 -> 9c5454287


[BEAM-2853] remove Nexmark README.md now that the doc is in the website


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f78f4559
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f78f4559
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f78f4559

Branch: refs/heads/master
Commit: f78f45596bf5f3a5c3fd0b12b74688e1beeb3c7a
Parents: e0166ce
Author: Etienne Chauchot 
Authored: Wed Nov 8 11:23:13 2017 +0100
Committer: Etienne Chauchot 
Committed: Wed Nov 8 11:23:13 2017 +0100

--
 sdks/java/nexmark/README.md | 340 ---
 1 file changed, 340 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f78f4559/sdks/java/nexmark/README.md
--
diff --git a/sdks/java/nexmark/README.md b/sdks/java/nexmark/README.md
deleted file mode 100644
index f252943..000
--- a/sdks/java/nexmark/README.md
+++ /dev/null
@@ -1,340 +0,0 @@
-
-
-# NEXMark test suite
-
-This is a suite of pipelines inspired by the 'continuous data stream'
-queries in [http://datalab.cs.pdx.edu/niagaraST/NEXMark/]
-(http://datalab.cs.pdx.edu/niagaraST/NEXMark/).
-
-These are multiple queries over a three entities model representing on online 
auction system:
-
- - **Person** represents a person submitting an item for auction and/or making 
a bid
-on an auction.
- - **Auction** represents an item under auction.
- - **Bid** represents a bid for an item under auction.
-
-The queries exercise many aspects of Beam model:
-
-* **Query1**: What are the bid values in Euro's?
-  Illustrates a simple map.
-* **Query2**: What are the auctions with particular auction numbers?
-  Illustrates a simple filter.
-* **Query3**: Who is selling in particular US states?
-  Illustrates an incremental join (using per-key state and timer) and filter.
-* **Query4**: What is the average selling price for each auction
-  category?
-  Illustrates complex join (using custom window functions) and
-  aggregation.
-* **Query5**: Which auctions have seen the most bids in the last period?
-  Illustrates sliding windows and combiners.
-* **Query6**: What is the average selling price per seller for their
-  last 10 closed auctions.
-  Shares the same 'winning bids' core as for **Query4**, and
-  illustrates a specialized combiner.
-* **Query7**: What are the highest bids per period?
-  Deliberately implemented using a side input to illustrate fanout.
-* **Query8**: Who has entered the system and created an auction in
-  the last period?
-  Illustrates a simple join.
-
-We have augmented the original queries with five more:
-
-* **Query0**: Pass-through.
-  Allows us to measure the monitoring overhead.
-* **Query9**: Winning-bids.
-  A common sub-query shared by **Query4** and **Query6**.
-* **Query10**: Log all events to GCS files.
-  Illustrates windows with large side effects on firing.
-* **Query11**: How many bids did a user make in each session they
-  were active?
-  Illustrates session windows.
-* **Query12**: How many bids does a user make within a fixed
-  processing time limit?
-  Illustrates working in processing time in the Global window, as
-  compared with event time in non-Global windows for all the other
-  queries.
-
-We can specify the Beam runner to use with maven profiles, available profiles 
are:
-
-* direct-runner
-* spark-runner
-* flink-runner
-* apex-runner
-
-The runner must also be specified like in any other Beam pipeline using
-
---runner
-
-
-Test data is deterministically synthesized on demand. The test
-data may be synthesized in the same pipeline as the query itself,
-or may be published to Pubsub.
-
-The query results may be:
-
-* Published to Pubsub.
-* Written to text files as plain text.
-* Written to text files using an Avro encoding.
-* Send to BigQuery.
-* Discarded.
-
-# Configuration
-
-## Common configuration parameters
-
-Decide if batch or streaming:
-
---streaming=true
-
-Number of events generators
-
---numEventGenerators=4
-
-Run query N
-
---query=N
-
-## Available Suites
-The suite to run can be chosen using this configuration parameter:
-
---suite=SUITE
-
-Available suites are:
-* DEFAULT: Test default configuration with query 0.
-* SMOKE: Run the 12 default configurations.
-* STRESS: Like smoke but for 1m events.
-* FULL_THROTTLE: Like SMOKE but 100m events.
-
-   
-
-## Apex specific configuration
-
---manageResources=false --monitorJobs=false
-
-## Dataflow specific configuration
-
---manageResources=false --monitorJobs=true \
---enforceEncodability=false --enforceImmutability=false
---project= \
---zone= \
---workerMachineType=n1-highmem-8 \
---stagingLocation= \
---runner=Dataf

[2/2] beam git commit: This closes #4097

2017-11-08 Thread iemejia
This closes #4097


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9c545428
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9c545428
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9c545428

Branch: refs/heads/master
Commit: 9c54542875cd7180e225cbe6837074539e505934
Parents: e0166ce f78f455
Author: Ismaël Mejía 
Authored: Wed Nov 8 15:23:18 2017 +0100
Committer: Ismaël Mejía 
Committed: Wed Nov 8 15:23:18 2017 +0100

--
 sdks/java/nexmark/README.md | 340 ---
 1 file changed, 340 deletions(-)
--




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Apex #2755

2017-11-08 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #536

2017-11-08 Thread Apache Jenkins Server
See 


Changes:

[xumingmingv] [BEAM-2203] Implement 'timestamp - interval'

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision e0166ceb657b393e4037ac7c178797a864379745 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e0166ceb657b393e4037ac7c178797a864379745
Commit message: "This closes #4082"
 > git rev-list 29942939444632d3f7fac50decffe5952c7ef165 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4646494036297450391.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2366331168470880743.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1018239280449230674.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: xmltodict in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests-ntlm>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests>=2.9.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: ntlm-auth>=1.0.2 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 
requests-ntlm>=0.3.0->pywinrm->-r PerfKitBenchmarker/req

[jira] [Created] (BEAM-3157) BeamSql transform should support other PCollection types

2017-11-08 Thread JIRA
Ismaël Mejía created BEAM-3157:
--

 Summary: BeamSql transform should support other PCollection types
 Key: BEAM-3157
 URL: https://issues.apache.org/jira/browse/BEAM-3157
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Ismaël Mejía


Currently the Beam SQL transform only supports input and output data 
represented as a BeamRecord. This seems to me like a usability limitation 
(even if we can do a ParDo to prepare objects before and after the transform).

I suppose this constraint comes from the fact that we need to map 
name/type/value from an object field into Calcite so it is convenient to have a 
specific data type (BeamRecord) for this. However we can accomplish the same by 
using a PCollection of JavaBeans (where we know the same information via the 
field names/types/values) or by using Avro records where we also have the 
Schema information. For the output PCollection we can map the object via a 
Reference (e.g. a JavaBean to be filled with the names of an Avro object).

Note: I am assuming simple mappings for the moment, since the SQL does not 
support composite types yet.

A simple API idea would be something like this:

PCollection<MyPojo> col = ...
PCollection<MyNewPojo> newCol = col.apply(BeamSql.query("SELECT ...", MyNewPojo.class));

A first approach could be to just add the extra ParDos + transform DoFns 
however I suppose that for memory use reasons maybe mapping directly into 
Calcite would make sense.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2853) Move the nexmark documentation to the website

2017-11-08 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243668#comment-16243668
 ] 

Etienne Chauchot commented on BEAM-2853:


done

> Move the nexmark documentation to the website
> -
>
> Key: BEAM-2853
> URL: https://issues.apache.org/jira/browse/BEAM-2853
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Ismaël Mejía
>Assignee: Etienne Chauchot
>Priority: Minor
>  Labels: nexmark
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

