[jira] [Commented] (BEAM-2817) Bigquery queries should allow options to run in batch mode or not

2017-10-30 Thread Justin Tumale (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226259#comment-16226259
 ] 

Justin Tumale commented on BEAM-2817:
-

Hi [~chamikara], I was wondering what you think would be the best way to 
pass in this configuration. In the BigQueryIO class, there is a createSource 
method (line 573) that calls BigQueryQuerySource.create(...), to which I can 
pass the priority. When executeQuery (line 133) is called on this 
BigQueryQuerySource object, the configuration will be based on the priority 
passed at create time. Another option I was considering is to have the 
configuration set from BigQueryOptions. Please advise when you get a chance.

Thanks,
-Justin

cc [~laraschmidt]
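As a rough sketch of the two options described above (all names here — 
QueryPriority, QuerySource — are illustrative stand-ins, not the real 
BigQueryIO API), option 1 captures the priority at create() and uses it when 
the query executes:

```java
import java.util.Objects;

// Illustrative stand-ins, not the real Beam API: QueryPriority mirrors
// BigQuery's INTERACTIVE/BATCH job priorities; QuerySource mirrors the idea
// of BigQueryQuerySource.create(...) capturing the priority that
// executeQuery() later configures the job with.
enum QueryPriority { INTERACTIVE, BATCH }

class QuerySource {
  private final QueryPriority priority;

  private QuerySource(QueryPriority priority) {
    this.priority = Objects.requireNonNull(priority);
  }

  // Option 1: the caller (the createSource path) threads the priority in.
  static QuerySource create(QueryPriority priority) {
    return new QuerySource(priority);
  }

  // The query configuration is derived from the priority captured at create().
  String executeQuery() {
    return "configuration.priority=" + priority;
  }

  public static void main(String[] args) {
    System.out.println(create(QueryPriority.INTERACTIVE).executeQuery());
  }
}
```

Option 2 (reading the priority from BigQueryOptions) would instead apply one 
pipeline-wide default; threading it through create() keeps it settable per 
source.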

> Bigquery queries should allow options to run in batch mode or not
> -
>
> Key: BEAM-2817
> URL: https://issues.apache.org/jira/browse/BEAM-2817
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Affects Versions: 2.0.0
>Reporter: Lara Schmidt
>Assignee: Justin Tumale
>  Labels: newbie, starter
>
> When the BigQuery read does a query, it sets the mode to batch. A batch 
> query can be very slow to schedule because it is batched with other queries; 
> however, it does not consume interactive query quota, which is better for 
> some cases. In other cases a fast query is better (especially in timed 
> tests). It would be a good idea to add a configuration to the BigQuery 
> source to set this per-read.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Python_Verify #3455

2017-10-30 Thread Apache Jenkins Server
See 


--
[...truncated 930.86 KB...]
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Commented] (BEAM-3064) Update dataflow runner containers for the release branch

2017-10-30 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226202#comment-16226202
 ] 

Ahmet Altay commented on BEAM-3064:
---

We can also do things in the following order:
1. Merge https://github.com/apache/beam/pull/4059
2. Cut RC2
3. In parallel to 2, release the Dataflow containers. RC2 Python tests may 
fail initially but will pass after the release.

This would allow us to cut RC2 now.

> Update dataflow runner containers for the release branch
> 
>
> Key: BEAM-3064
> URL: https://issues.apache.org/jira/browse/BEAM-3064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Blocked by:
> https://github.com/apache/beam/pull/3970
> https://github.com/apache/beam/pull/3941 - cp into release branch.
> cc: [~reuvenlax] [~robertwb] [~tvalentyn]





Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Apex #2707

2017-10-30 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Apex #2706

2017-10-30 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #3454

2017-10-30 Thread Apache Jenkins Server
See 


Changes:

[klk] Temporarily disable Dataflow pipeline_url metadata

--
[...truncated 929.04 KB...]

[jira] [Commented] (BEAM-3064) Update dataflow runner containers for the release branch

2017-10-30 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226128#comment-16226128
 ] 

Reuven Lax commented on BEAM-3064:
--

Since I'm about to cut RC2, should we wait for that?

> Update dataflow runner containers for the release branch
> 
>
> Key: BEAM-3064
> URL: https://issues.apache.org/jira/browse/BEAM-3064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Blocked by:
> https://github.com/apache/beam/pull/3970
> https://github.com/apache/beam/pull/3941 - cp into release branch.
> cc: [~reuvenlax] [~robertwb] [~tvalentyn]





[GitHub] beam pull request #4062: [Beam-2482] - CodedValueMutationDetector should use...

2017-10-30 Thread evindj
GitHub user evindj opened a pull request:

https://github.com/apache/beam/pull/4062

[Beam-2482] - CodedValueMutationDetector should use the coder's structural 
value
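The motivation can be sketched outside of Beam (structuralValue below is an 
illustrative analogue of a coder's structural value, not the actual Coder 
API): byte[] compares by identity, so a mutation detector comparing decoded 
values directly would report spurious differences between equal-content 
arrays.

```java
import java.nio.ByteBuffer;

public class StructuralValueDemo {

  // Analogue of a coder's structural value: a wrapper with content-based
  // equality for a type (byte[]) whose own equals() is identity-based.
  static Object structuralValue(byte[] bytes) {
    return ByteBuffer.wrap(bytes.clone());
  }

  public static void main(String[] args) {
    byte[] a = {1, 2, 3};
    byte[] b = {1, 2, 3};
    System.out.println(a.equals(b));                                    // identity: false
    System.out.println(structuralValue(a).equals(structuralValue(b)));  // content: true
  }
}
```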

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/evindj/beam BEAM-2482

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4062.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4062


commit 3aa0fc37d96b6e76dbc540fc0fb538a438cbf6e9
Author: Innocent Djiofack 
Date:   2017-07-26T03:41:02Z

Changed the mutation detector to be based on structural value only

commit 36c7a14c69ba628f477798f63d4e721dc86e2be7
Author: Innocent Djiofack 
Date:   2017-07-26T04:51:21Z

fixed compilation errors and a bug in ByteArrayCoder

commit 46cfa4021e5edb2bbb15665656647606d44cdce4
Author: Innocent Djiofack 
Date:   2017-07-26T10:48:11Z

Merge branch 'master' of https://github.com/apache/beam into BEAM-2482

commit 1bfc0aecbef5a6adb56e69db9168db0b652ed654
Author: Innocent Djiofack 
Date:   2017-07-26T11:36:59Z

Added override of consistentWithEquals() to SerializableCoder so that the 
correct structural value could be returned.

commit 2b7ff0a0e0fe9116575ff3416cce217895ae68c4
Author: Innocent Djiofack 
Date:   2017-08-05T02:51:23Z

Made the comparison of structural values based on cloned values in 
MutationDetectors.java. Added an additional equals comparison of the cloned 
values; without it, some BigTableIOTest cases fail.

commit 4931cd430602c2b41a7e2450950a6c822880e927
Author: Innocent Djiofack 
Date:   2017-08-16T00:57:51Z

Modified mutability detection to be based solely on structural value. Added 
@Ignore to 2 failing tests in BigTableIOTest

commit 65ec9836d229fe2f4016eba0599a555adecf5cb8
Author: Innocent Djiofack 
Date:   2017-09-01T00:07:19Z

fixed spacing

commit 3490fb292fde436400fccd04f705b695a4500c9f
Author: Innocent Djiofack 
Date:   2017-07-26T03:41:02Z

Changed the mutation detector to be based on structural value only

commit a1ed0b9c77de86e8fa9d43c021f9f16aeca55189
Author: Innocent Djiofack 
Date:   2017-07-26T04:51:21Z

fixed compilation errors and a bug in ByteArrayCoder

commit f073165ca48ad1f2e4912b0e84a3114fa8df3f57
Author: Innocent Djiofack 
Date:   2017-07-26T11:36:59Z

Added override of consistentWithEquals() to SerializableCoder so that the 
correct structural value could be returned.

commit d297a2605019d6d0eb682b444f62a18f45d04941
Author: Innocent Djiofack 
Date:   2017-08-05T02:51:23Z

Made the comparison of structural values based on cloned values in 
MutationDetectors.java. Added an additional equals comparison of the cloned 
values; without it, some BigTableIOTest cases fail.

commit faf8439c2c4b51f839b50b166b98bfe06ecbcfe6
Author: Innocent Djiofack 
Date:   2017-08-16T00:57:51Z

Modified mutability detection to be based solely on structural value. Added 
@Ignore to 2 failing tests in BigTableIOTest

commit b68160031421e8f866d53e9ea69596e3d70ee620
Author: Innocent Djiofack 
Date:   2017-09-01T00:07:19Z

fixed spacing

commit f1a7e50844da65ca40d95e02bf2f4fbe5276abf6
Author: Innocent Djiofack 
Date:   2017-10-27T22:09:57Z

First test changes

commit 7c309dba760528e859a7bdbbc106a51141d00aaf
Author: Innocent Djiofack 
Date:   2017-10-27T22:29:31Z

removed @Ignore statements

commit 4abaabeb3bb9b8673c243ccdd4cedb340fbbe135
Author: Innocent Djiofack 

[GitHub] beam pull request #3643: [Beam-2482] - CodedValueMutationDetector should use...

2017-10-30 Thread evindj
Github user evindj closed the pull request at:

https://github.com/apache/beam/pull/3643


---


[GitHub] beam pull request #4061: Remove obsolete extra parameter

2017-10-30 Thread ravwojdyla
GitHub user ravwojdyla opened a pull request:

https://github.com/apache/beam/pull/4061

Remove obsolete extra parameter


---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravwojdyla/incubator-beam 
rav/remove_obsolete_param

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4061.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4061


commit a93955b5dec11a846d520e1d8ac94e8b53aec581
Author: Rafal Wojdyla 
Date:   2017-10-31T01:58:24Z

Remove obsolete extra parameter




---


[jira] [Assigned] (BEAM-3054) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest is flaky

2017-10-30 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov reassigned BEAM-3054:
--

Assignee: Eugene Kirpichov  (was: Etienne Chauchot)

> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest is flaky
> -
>
> Key: BEAM-3054
> URL: https://issues.apache.org/jira/browse/BEAM-3054
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Valentyn Tymofieiev
>Assignee: Eugene Kirpichov
>
> From a precommit test: 
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15049/consoleFull
> 2017-10-13T00:49:17.586 [ERROR] 
> testRead(org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest)  Time 
> elapsed: 12.054 s  <<< ERROR!
> java.io.IOException: Failed to insert test documents in index beam
>   at 
> __randomizedtesting.SeedInfo.seed([4A7FFDE90B849587:80A19E9BC55DC1A2]:0)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticSearchIOTestUtils.insertTestDocuments(ElasticSearchIOTestUtils.java:79)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTestCommon.testRead(ElasticsearchIOTestCommon.java:113)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest.testRead(ElasticsearchIOTest.java:119)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:327)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>   at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>   at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>   at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
>   at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
>   at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>   at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>   at 
> 

[jira] [Commented] (BEAM-3054) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest is flaky

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226109#comment-16226109
 ] 

ASF GitHub Bot commented on BEAM-3054:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/4060

[BEAM-3054] Uses locale-insensitive number formatting in ESIO and tests

The ESIO5 test framework randomly switches the locale of the current 
test, which is how it discovered this bug: this is an actual bug.

This commit switches %d to %s where appropriate, i.e. where a 
machine-readable decimal number in US locale is required.

R: @echauchot 
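The locale hazard can be shown with a self-contained example (plain Java, not 
the ESIO code; the variable names are illustrative): locale-sensitive format 
conversions change how the same number is rendered, while plain decimal 
conversion stays stable.

```java
import java.util.Locale;

public class LocaleFormatDemo {
  public static void main(String[] args) {
    int docCount = 1234567;

    // Locale-sensitive: the grouping separator (and, in some locales, even
    // the digit characters used by %d) depends on the formatting locale.
    String german = String.format(Locale.GERMANY, "%,d", docCount);
    String us = String.format(Locale.US, "%,d", docCount);

    // Locale-insensitive: plain decimal conversion, appropriate wherever a
    // machine-readable number is embedded in a request or query body.
    String stable = String.valueOf(docCount);

    System.out.println(german); // 1.234.567
    System.out.println(us);     // 1,234,567
    System.out.println(stable); // 1234567
  }
}
```

In a test harness that randomizes the default locale (as the ES test 
framework does), only the last form is guaranteed to round-trip.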

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam esio-locale

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4060


commit e647a961669c0afca3ece4194f99d8d1ea5fd52d
Author: Eugene Kirpichov 
Date:   2017-10-31T01:48:48Z

[BEAM-3054] Uses locale-insensitive number formatting in ESIO and tests

The ESIO5 test framework randomly switches the locale of the current
test, which is how it discovered this bug: this is an actual bug.

This commit switches %d to %s where appropriate, i.e. where a
machine-readable decimal number in US locale is required.




> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest is flaky
> -
>
> Key: BEAM-3054
> URL: https://issues.apache.org/jira/browse/BEAM-3054
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Valentyn Tymofieiev
>Assignee: Etienne Chauchot
>
> From a precommit test: 
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15049/consoleFull
> 2017-10-13T00:49:17.586 [ERROR] 
> testRead(org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest)  Time 
> elapsed: 12.054 s  <<< ERROR!
> java.io.IOException: Failed to insert test documents in index beam
>   at 
> __randomizedtesting.SeedInfo.seed([4A7FFDE90B849587:80A19E9BC55DC1A2]:0)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticSearchIOTestUtils.insertTestDocuments(ElasticSearchIOTestUtils.java:79)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTestCommon.testRead(ElasticsearchIOTestCommon.java:113)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest.testRead(ElasticsearchIOTest.java:119)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:327)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>   at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>   at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>   at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
>   at 
> 

[GitHub] beam pull request #4060: [BEAM-3054] Uses locale-insensitive number formatti...

2017-10-30 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/4060

[BEAM-3054] Uses locale-insensitive number formatting in ESIO and tests

The ESIO5 test framework randomly switches the locale of the current 
test, which is how it discovered this bug: this is an actual bug.

This commit switches %d to %s where appropriate, i.e. where a 
machine-readable decimal number in US locale is required.

R: @echauchot 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam esio-locale

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4060


commit e647a961669c0afca3ece4194f99d8d1ea5fd52d
Author: Eugene Kirpichov 
Date:   2017-10-31T01:48:48Z

[BEAM-3054] Uses locale-insensitive number formatting in ESIO and tests

The ESIO5 test framework randomly switches the locale of the current
test, which is how it discovered this bug: this is an actual bug.

This commit switches %d to %s where appropriate, i.e. where a
machine-readable decimal number in US locale is required.




---


[jira] [Commented] (BEAM-3116) Dataflow templates broken whenever any worker pool metadata property is set

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226108#comment-16226108
 ] 

ASF GitHub Bot commented on BEAM-3116:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4056


> Dataflow templates broken whenever any worker pool metadata property is set
> ---
>
> Key: BEAM-3116
> URL: https://issues.apache.org/jira/browse/BEAM-3116
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: portability
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4056: [BEAM-3116] Temporarily disable Dataflow metadata

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4056


---


[2/2] beam git commit: This closes #4056: [BEAM-3116] Temporarily disable Dataflow metadata

2017-10-30 Thread kenn
This closes #4056: [BEAM-3116] Temporarily disable Dataflow metadata

  Temporarily disable Dataflow pipeline_url metadata


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/710941e9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/710941e9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/710941e9

Branch: refs/heads/master
Commit: 710941e905438c1a121c6de17c588c76916a7c3c
Parents: 09f6815 324dae7
Author: Kenneth Knowles 
Authored: Mon Oct 30 18:07:14 2017 -0700
Committer: Kenneth Knowles 
Committed: Mon Oct 30 18:07:14 2017 -0700

--
 .../org/apache/beam/runners/dataflow/DataflowRunner.java |  6 --
 .../apache/beam/runners/dataflow/DataflowRunnerTest.java | 11 +--
 .../apache_beam/runners/dataflow/internal/apiclient.py   | 10 ++
 3 files changed, 15 insertions(+), 12 deletions(-)
--




[1/2] beam git commit: Temporarily disable Dataflow pipeline_url metadata

2017-10-30 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 09f68159d -> 710941e90


Temporarily disable Dataflow pipeline_url metadata


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/324dae73
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/324dae73
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/324dae73

Branch: refs/heads/master
Commit: 324dae7345de220cad9f8df7b7952d076bb36185
Parents: 5fb30ec
Author: Kenneth Knowles 
Authored: Sat Oct 28 16:04:04 2017 -0700
Committer: Kenneth Knowles 
Committed: Sun Oct 29 20:49:27 2017 -0700

--
 .../org/apache/beam/runners/dataflow/DataflowRunner.java |  6 --
 .../apache/beam/runners/dataflow/DataflowRunnerTest.java | 11 +--
 .../apache_beam/runners/dataflow/internal/apiclient.py   | 10 ++
 3 files changed, 15 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/324dae73/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index 545321d..334c8e5 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -571,8 +571,10 @@ public class DataflowRunner extends 
PipelineRunner {
 String workerHarnessContainerImage = getContainerImageForJob(options);
 for (WorkerPool workerPool : newJob.getEnvironment().getWorkerPools()) {
   workerPool.setWorkerHarnessContainerImage(workerHarnessContainerImage);
-  workerPool.setMetadata(
-  ImmutableMap.of(STAGED_PIPELINE_METADATA_PROPERTY, 
stagedPipeline.getLocation()));
+
+  // https://issues.apache.org/jira/browse/BEAM-3116
+  // workerPool.setMetadata(
+  //ImmutableMap.of(STAGED_PIPELINE_METADATA_PROPERTY, 
stagedPipeline.getLocation()));
 }
 
 newJob.getEnvironment().setVersion(getEnvironmentVersion(options));

http://git-wip-us.apache.org/repos/asf/beam/blob/324dae73/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
index 1568eda..66cf11d 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
@@ -22,7 +22,6 @@ import static org.hamcrest.Matchers.both;
 import static org.hamcrest.Matchers.containsString;
 import static org.hamcrest.Matchers.equalTo;
 import static org.hamcrest.Matchers.hasItem;
-import static org.hamcrest.Matchers.hasKey;
 import static org.hamcrest.Matchers.is;
 import static org.hamcrest.Matchers.not;
 import static org.hamcrest.Matchers.startsWith;
@@ -46,7 +45,6 @@ import com.google.api.services.dataflow.Dataflow;
 import com.google.api.services.dataflow.model.DataflowPackage;
 import com.google.api.services.dataflow.model.Job;
 import com.google.api.services.dataflow.model.ListJobsResponse;
-import com.google.api.services.dataflow.model.WorkerPool;
 import com.google.api.services.storage.model.StorageObject;
 import com.google.common.base.Throwables;
 import com.google.common.collect.ImmutableList;
@@ -166,10 +164,11 @@ public class DataflowRunnerTest implements Serializable {
 assertNull(job.getCurrentState());
 assertTrue(Pattern.matches("[a-z]([-a-z0-9]*[a-z0-9])?", job.getName()));
 
-for (WorkerPool workerPool : job.getEnvironment().getWorkerPools()) {
-  assertThat(workerPool.getMetadata(),
-  hasKey(DataflowRunner.STAGED_PIPELINE_METADATA_PROPERTY));
-}
+// https://issues.apache.org/jira/browse/BEAM-3116
+// for (WorkerPool workerPool : job.getEnvironment().getWorkerPools()) {
+//   assertThat(workerPool.getMetadata(),
+//   hasKey(DataflowRunner.STAGED_PIPELINE_METADATA_PROPERTY));
+// }
   }
 
   @Before

http://git-wip-us.apache.org/repos/asf/beam/blob/324dae73/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py 

Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #5137

2017-10-30 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3389

2017-10-30 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4257

2017-10-30 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3054) org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest is flaky

2017-10-30 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226096#comment-16226096
 ] 

Eugene Kirpichov commented on BEAM-3054:


This appears to be because Elasticsearch uses the Lucene test framework, which 
sets a random default locale, and sometimes it sets a locale in which numbers 
in the test documents, formatted with %d, come out in a form that 
Elasticsearch can't import. (Don't ask how I figured this out...)

The right way to fix this is, I suppose, to change the way the test inserts 
test documents.
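
The randomized-locale mechanism described above can be reproduced without the 
Lucene framework by switching the JVM default locale by hand. A sketch under 
stated assumptions — the class name `RandomLocaleRepro` is made up, and 
`Locale.GERMANY` merely stands in for whichever locale the test framework 
happens to pick:

```java
import java.util.Locale;

public class RandomLocaleRepro {
    public static void main(String[] args) {
        Locale saved = Locale.getDefault();
        try {
            // Simulate the test framework randomizing the default locale.
            Locale.setDefault(Locale.GERMANY);

            // The one-argument String.format picks up the default locale,
            // so the same code emits different bytes per test run.
            String grouped = String.format("%,d", 1234567); // "1.234.567" here
            String plain = String.format("%s", 1234567);    // always "1234567"
            System.out.println(grouped + " vs " + plain);
        } finally {
            Locale.setDefault(saved); // restore so later code is unaffected
        }
    }
}
```

Restoring the saved locale in a finally block mirrors what a well-behaved test 
rule does, so one test's locale never leaks into the next.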

> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest is flaky
> -
>
> Key: BEAM-3054
> URL: https://issues.apache.org/jira/browse/BEAM-3054
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Valentyn Tymofieiev
>Assignee: Etienne Chauchot
>
> From a precommit test: 
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15049/consoleFull
> 2017-10-13T00:49:17.586 [ERROR] 
> testRead(org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest)  Time 
> elapsed: 12.054 s  <<< ERROR!
> java.io.IOException: Failed to insert test documents in index beam
>   at 
> __randomizedtesting.SeedInfo.seed([4A7FFDE90B849587:80A19E9BC55DC1A2]:0)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticSearchIOTestUtils.insertTestDocuments(ElasticSearchIOTestUtils.java:79)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTestCommon.testRead(ElasticsearchIOTestCommon.java:113)
>   at 
> org.apache.beam.sdk.io.elasticsearch.ElasticsearchIOTest.testRead(ElasticsearchIOTest.java:119)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:327)
>   at 
> org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>   at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>   at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>   at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
>   at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
>   at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
>   at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>   at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>   at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   

Build failed in Jenkins: beam_PostCommit_Python_Verify #3453

2017-10-30 Thread Apache Jenkins Server
See 


Changes:

[chamikara] [BEAM-2979] Fix a race condition in getWatermark() in KafkaIO.

[kirpichov] [BEAM-1542] SpannerIO: mutation encoding and size estimation

--
[...truncated 932.37 KB...]
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Commented] (BEAM-3107) Python Fnapi based workloads failing

2017-10-30 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226064#comment-16226064
 ] 

Ahmet Altay commented on BEAM-3107:
---

Removed it from the release blocking list. Let's keep it open since it is still 
failing in the master branch.

> Python Fnapi based workloads failing
> 
>
> Key: BEAM-3107
> URL: https://issues.apache.org/jira/browse/BEAM-3107
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>  Labels: portability
>
> Python post commits are failing because the runner harness is not compatible 
> with the sdk harness.
> We need a new runner harness compatible with: 
> https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863





[jira] [Updated] (BEAM-3107) Python Fnapi based workloads failing

2017-10-30 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-3107:
--
Fix Version/s: (was: 2.2.0)

> Python Fnapi based workloads failing
> 
>
> Key: BEAM-3107
> URL: https://issues.apache.org/jira/browse/BEAM-3107
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>  Labels: portability
>
> Python post commits are failing because the runner harness is not compatible 
> with the sdk harness.
> We need a new runner harness compatible with: 
> https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863





[jira] [Commented] (BEAM-3107) Python Fnapi based workloads failing

2017-10-30 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226058#comment-16226058
 ] 

Valentyn Tymofieiev commented on BEAM-3107:
---

I ran the postcommit test suite locally and it passed. I believe this is no 
longer a release blocker; also, we still have a test infra problem tracked in 
BEAM-3120.

> Python Fnapi based workloads failing
> 
>
> Key: BEAM-3107
> URL: https://issues.apache.org/jira/browse/BEAM-3107
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>  Labels: portability
> Fix For: 2.2.0
>
>
> Python post commits are failing because the runner harness is not compatible 
> with the sdk harness.
> We need a new runner harness compatible with: 
> https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863





Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Spark #3388

2017-10-30 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3121) Dockerized jekyll server fails

2017-10-30 Thread Henning Rohde (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226043#comment-16226043
 ] 

Henning Rohde commented on BEAM-3121:
-

It seems this capability was added by Saurabh, who is no longer active, and is 
not used otherwise. I'll remove it.

> Dockerized jekyll server fails
> --
>
> Key: BEAM-3121
> URL: https://issues.apache.org/jira/browse/BEAM-3121
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>
> The dockerized jekyll script for building the website fails with the 
> following error:
> $ ./run_with_docker.sh server
> Unable to find image 'jekyll/jekyll:latest' locally
> latest: Pulling from jekyll/jekyll
> b56ae66c2937: Pull complete 
> 337580f4eddd: Pull complete 
> f8edd684e995: Pull complete 
> 536ad31f0662: Pull complete 
> 120aeb540421: Pull complete 
> Digest: 
> sha256:61956fbaf76be2fc23cd4908abbe1bfce6541dcdd9d9b3658b53d80b25a363a4
> Status: Downloaded newer image for jekyll/jekyll:latest
> The following gems are missing
>  * rake (11.3.0)
>  * i18n (0.7.0)
>  * json (1.8.3)
>  * minitest (5.9.1)
>  * thread_safe (0.3.5)
>  * tzinfo (1.2.2)
>  * activesupport (4.2.7.1)
>  * addressable (2.4.0)
>  * colored (1.2)
>  * ffi (1.9.14)
>  * ethon (0.9.1)
>  * mini_portile2 (2.1.0)
>  * nokogiri (1.6.8.1)
>  * parallel (1.9.0)
>  * yell (2.0.6)
>  * html-proofer (3.3.1)
>  * sass (3.4.22)
>  * jekyll-sass-converter (1.4.0)
>  * rb-fsevent (0.9.7)
>  * rb-inotify (0.9.7)
>  * kramdown (1.12.0)
>  * liquid (3.0.6)
>  * pathutil (0.14.0)
>  * rouge (1.11.1)
>  * jekyll (3.2.0)
>  * jekyll-redirect-from (0.11.0)
>  * jekyll_github_sample (0.3.0)
> Install missing gems with `bundle install`
> Fetching gem metadata from https://rubygems.org/.
> Fetching version metadata from https://rubygems.org/..
> Fetching dependency metadata from https://rubygems.org/.
> Fetching rake 11.3.0
> Installing rake 11.3.0
> Fetching i18n 0.7.0
> Installing i18n 0.7.0
> Fetching json 1.8.3
> Installing json 1.8.3 with native extensions
> Fetching minitest 5.9.1
> Installing minitest 5.9.1
> Fetching thread_safe 0.3.5
> Installing thread_safe 0.3.5
> Fetching addressable 2.4.0
> Installing addressable 2.4.0
> Using bundler 1.15.4
> Fetching colorator 1.1.0
> Installing colorator 1.1.0
> Fetching colored 1.2
> Installing colored 1.2
> Fetching ffi 1.9.14
> Installing ffi 1.9.14 with native extensions
> Fetching forwardable-extended 2.6.0
> Installing forwardable-extended 2.6.0
> Fetching mercenary 0.3.6
> Installing mercenary 0.3.6
> Fetching mini_portile2 2.1.0
> Installing mini_portile2 2.1.0
> Fetching parallel 1.9.0
> Installing parallel 1.9.0
> Fetching yell 2.0.6
> Installing yell 2.0.6
> Fetching sass 3.4.22
> Installing sass 3.4.22
> Fetching rb-fsevent 0.9.7
> Installing rb-fsevent 0.9.7
> Fetching kramdown 1.12.0
> Installing kramdown 1.12.0
> Fetching liquid 3.0.6
> Installing liquid 3.0.6
> Fetching rouge 1.11.1
> Installing rouge 1.11.1
> Fetching safe_yaml 1.0.4
> Installing safe_yaml 1.0.4
> Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
> current directory: 
> /srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
> /usr/local/bin/ruby -r ./siteconf20171030-31-1kumuua.rb extconf.rb
> creating Makefile
> current directory: 
> /srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
> make "DESTDIR=" clean
> current directory: 
> /srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
> make "DESTDIR="
> compiling generator.c
> generator.c: In function 'generate_json':
> generator.c:861:25: error: 'rb_cFixnum' undeclared (first use in this 
> function)
>  } else if (klass == rb_cFixnum) {
>  ^~
> generator.c:861:25: note: each undeclared identifier is reported only once 
> for each function it appears in
> generator.c:863:25: error: 'rb_cBignum' undeclared (first use in this 
> function)
>  } else if (klass == rb_cBignum) {
>  ^~
> generator.c: At top level:
> cc1: warning: unrecognized command line option '-Wno-self-assign'
> cc1: warning: unrecognized command line option '-Wno-constant-logical-operand'
> cc1: warning: unrecognized command line option '-Wno-parentheses-equality'
> make: *** [Makefile:242: generator.o] Error 1
> make failed, exit code 2
> Gem files will remain installed in /srv/jekyll/vendor/bundle/gems/json-1.8.3 
> for inspection.
> Results logged to 
> /srv/jekyll/vendor/bundle/extensions/x86_64-linux/2.4.0/json-1.8.3/gem_make.out
> An error occurred while installing json (1.8.3), and Bundler cannot continue.
> Make sure that `gem install json -v '1.8.3'` succeeds before bundling.
> In Gemfile:
>   html-proofer was resolved to 3.3.1, which depends on
> activesupport was resolved to 

[jira] [Updated] (BEAM-3121) Dockerized jekyll server fails

2017-10-30 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde updated BEAM-3121:

Priority: Minor  (was: Major)

> Dockerized jekyll server fails
> --
>
> Key: BEAM-3121
> URL: https://issues.apache.org/jira/browse/BEAM-3121
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Minor
>



Build failed in Jenkins: beam_PerformanceTests_Python #506

2017-10-30 Thread Apache Jenkins Server
See 


Changes:

[chamikara] [BEAM-2979] Fix a race condition in getWatermark() in KafkaIO.

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision aa26f4bf71d72989da82927bd1565e5722a1f8d8 (origin/master)
Commit message: "This closes #3985"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f aa26f4bf71d72989da82927bd1565e5722a1f8d8
 > git rev-list 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins9158094400472618988.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2530280055680546395.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins723728149131385994.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Collecting pywinrm (from -r PerfKitBenchmarker/requirements.txt (line 25))
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading pywinrm-0.2.2-py2.py3-none-any.whl
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already 

[jira] [Assigned] (BEAM-3121) Dockerized jekyll server fails

2017-10-30 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-3121:
---

Assignee: Henning Rohde  (was: Reuven Lax)

> Dockerized jekyll server fails
> --
>
> Key: BEAM-3121
> URL: https://issues.apache.org/jira/browse/BEAM-3121
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>
> The dockerized jekyll script for building the website fails with the 
> following error:
> $ ./run_with_docker.sh server
> Unable to find image 'jekyll/jekyll:latest' locally
> latest: Pulling from jekyll/jekyll
> b56ae66c2937: Pull complete 
> 337580f4eddd: Pull complete 
> f8edd684e995: Pull complete 
> 536ad31f0662: Pull complete 
> 120aeb540421: Pull complete 
> Digest: 
> sha256:61956fbaf76be2fc23cd4908abbe1bfce6541dcdd9d9b3658b53d80b25a363a4
> Status: Downloaded newer image for jekyll/jekyll:latest
> The following gems are missing
>  * rake (11.3.0)
>  * i18n (0.7.0)
>  * json (1.8.3)
>  * minitest (5.9.1)
>  * thread_safe (0.3.5)
>  * tzinfo (1.2.2)
>  * activesupport (4.2.7.1)
>  * addressable (2.4.0)
>  * colored (1.2)
>  * ffi (1.9.14)
>  * ethon (0.9.1)
>  * mini_portile2 (2.1.0)
>  * nokogiri (1.6.8.1)
>  * parallel (1.9.0)
>  * yell (2.0.6)
>  * html-proofer (3.3.1)
>  * sass (3.4.22)
>  * jekyll-sass-converter (1.4.0)
>  * rb-fsevent (0.9.7)
>  * rb-inotify (0.9.7)
>  * kramdown (1.12.0)
>  * liquid (3.0.6)
>  * pathutil (0.14.0)
>  * rouge (1.11.1)
>  * jekyll (3.2.0)
>  * jekyll-redirect-from (0.11.0)
>  * jekyll_github_sample (0.3.0)
> Install missing gems with `bundle install`
> Fetching gem metadata from https://rubygems.org/.
> Fetching version metadata from https://rubygems.org/..
> Fetching dependency metadata from https://rubygems.org/.
> Fetching rake 11.3.0
> Installing rake 11.3.0
> Fetching i18n 0.7.0
> Installing i18n 0.7.0
> Fetching json 1.8.3
> Installing json 1.8.3 with native extensions
> Fetching minitest 5.9.1
> Installing minitest 5.9.1
> Fetching thread_safe 0.3.5
> Installing thread_safe 0.3.5
> Fetching addressable 2.4.0
> Installing addressable 2.4.0
> Using bundler 1.15.4
> Fetching colorator 1.1.0
> Installing colorator 1.1.0
> Fetching colored 1.2
> Installing colored 1.2
> Fetching ffi 1.9.14
> Installing ffi 1.9.14 with native extensions
> Fetching forwardable-extended 2.6.0
> Installing forwardable-extended 2.6.0
> Fetching mercenary 0.3.6
> Installing mercenary 0.3.6
> Fetching mini_portile2 2.1.0
> Installing mini_portile2 2.1.0
> Fetching parallel 1.9.0
> Installing parallel 1.9.0
> Fetching yell 2.0.6
> Installing yell 2.0.6
> Fetching sass 3.4.22
> Installing sass 3.4.22
> Fetching rb-fsevent 0.9.7
> Installing rb-fsevent 0.9.7
> Fetching kramdown 1.12.0
> Installing kramdown 1.12.0
> Fetching liquid 3.0.6
> Installing liquid 3.0.6
> Fetching rouge 1.11.1
> Installing rouge 1.11.1
> Fetching safe_yaml 1.0.4
> Installing safe_yaml 1.0.4
> Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
> current directory: 
> /srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
> /usr/local/bin/ruby -r ./siteconf20171030-31-1kumuua.rb extconf.rb
> creating Makefile
> current directory: 
> /srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
> make "DESTDIR=" clean
> current directory: 
> /srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
> make "DESTDIR="
> compiling generator.c
> generator.c: In function 'generate_json':
> generator.c:861:25: error: 'rb_cFixnum' undeclared (first use in this 
> function)
>  } else if (klass == rb_cFixnum) {
>  ^~
> generator.c:861:25: note: each undeclared identifier is reported only once 
> for each function it appears in
> generator.c:863:25: error: 'rb_cBignum' undeclared (first use in this 
> function)
>  } else if (klass == rb_cBignum) {
>  ^~
> generator.c: At top level:
> cc1: warning: unrecognized command line option '-Wno-self-assign'
> cc1: warning: unrecognized command line option '-Wno-constant-logical-operand'
> cc1: warning: unrecognized command line option '-Wno-parentheses-equality'
> make: *** [Makefile:242: generator.o] Error 1
> make failed, exit code 2
> Gem files will remain installed in /srv/jekyll/vendor/bundle/gems/json-1.8.3 
> for inspection.
> Results logged to 
> /srv/jekyll/vendor/bundle/extensions/x86_64-linux/2.4.0/json-1.8.3/gem_make.out
> An error occurred while installing json (1.8.3), and Bundler cannot continue.
> Make sure that `gem install json -v '1.8.3'` succeeds before bundling.
> In Gemfile:
>   html-proofer was resolved to 3.3.1, which depends on
> activesupport was resolved to 4.2.7.1, which depends on
>   json



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4014: [BEAM-1542] SpannerIO: mutation encoding and size e...

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4014


---


[jira] [Commented] (BEAM-1542) Need Source/Sink for Spanner

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226018#comment-16226018
 ] 

ASF GitHub Bot commented on BEAM-1542:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4014


> Need Source/Sink for Spanner
> 
>
> Key: BEAM-1542
> URL: https://issues.apache.org/jira/browse/BEAM-1542
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Reporter: Guy Molinari
>Assignee: Mairbek Khadikov
>
> Is there a source/sink for Spanner in the works?   If not I would gladly give 
> this a shot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[1/2] beam git commit: [BEAM-1542] SpannerIO: mutation encoding and size estimation improvements

2017-10-30 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master aa26f4bf7 -> 09f68159d


[BEAM-1542] SpannerIO: mutation encoding and size estimation improvements


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3ba96003
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3ba96003
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3ba96003

Branch: refs/heads/master
Commit: 3ba96003d31ce98a54c0c51c1c0a9cf7c06e2fa2
Parents: aa26f4b
Author: Mairbek Khadikov 
Authored: Wed Oct 18 15:26:11 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon Oct 30 17:02:36 2017 -0700

--
 .../io/gcp/spanner/MutationGroupEncoder.java| 660 +++
 .../io/gcp/spanner/MutationSizeEstimator.java   |  48 ++
 .../gcp/spanner/MutationGroupEncoderTest.java   | 636 ++
 3 files changed, 1344 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/3ba96003/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
new file mode 100644
index 000..ba0b4eb
--- /dev/null
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
@@ -0,0 +1,660 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Date;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Key;
+import com.google.cloud.spanner.KeySet;
+import com.google.cloud.spanner.Mutation;
+import com.google.cloud.spanner.Type;
+import com.google.cloud.spanner.Value;
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutput;
+import java.io.ObjectOutputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import org.apache.beam.sdk.util.VarInt;
+import org.joda.time.DateTime;
+import org.joda.time.Days;
+import org.joda.time.MutableDateTime;
+
+/**
+ * Given the Spanner Schema, efficiently encodes the mutation group.
+ */
+class MutationGroupEncoder {
+  private static final DateTime MIN_DATE = new DateTime(1, 1, 1, 0, 0);
+
+  private final SpannerSchema schema;
+  private final List tables;
+  private final Map tablesIndexes = new HashMap<>();
+
+  public MutationGroupEncoder(SpannerSchema schema) {
+this.schema = schema;
+tables = schema.getTables();
+
+for (int i = 0; i < tables.size(); i++) {
+  tablesIndexes.put(tables.get(i), i);
+}
+  }
+
+  public byte[] encode(MutationGroup g) {
+ByteArrayOutputStream bos = new ByteArrayOutputStream();
+
+try {
+  VarInt.encode(g.attached().size(), bos);
+  for (Mutation m : g) {
+encodeMutation(bos, m);
+  }
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
+return bos.toByteArray();
+  }
+
+  private static void setBit(byte[] bytes, int i) {
+int word = i / 8;
+int bit = 7 - i % 8;
+bytes[word] |= 1 << bit;
+  }
+
+  private static boolean getBit(byte[] bytes, int i) {
+int word = i / 8;
+int bit = 7 - i % 8;
+return (bytes[word] & 1 << (bit)) != 0;
+  }
+
+  private void encodeMutation(ByteArrayOutputStream bos, Mutation m) throws 
IOException {
+Mutation.Op op = m.getOperation();
+bos.write(op.ordinal());
+if (op == Mutation.Op.DELETE) {
+  encodeDelete(bos, 
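The `setBit`/`getBit` helpers in the diff above pack booleans most-significant-bit first: stream bit i maps to bit (7 - i % 8) of byte i / 8. A self-contained sketch of the same scheme (the class name `Bits` is illustrative, not from the patch):

```java
// Sketch of the MSB-first bit packing used by the helpers in the diff above.
// Bit i of the logical stream lands in byte i / 8, at bit position 7 - i % 8,
// so bit 0 is the high bit of byte 0.
class Bits {
  static void setBit(byte[] bytes, int i) {
    bytes[i / 8] |= 1 << (7 - i % 8);
  }

  static boolean getBit(byte[] bytes, int i) {
    return (bytes[i / 8] & (1 << (7 - i % 8))) != 0;
  }
}
```

This layout lets the encoder store a per-row null bitmap compactly ahead of the column values.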

[2/2] beam git commit: This closes #4014: [BEAM-1542] SpannerIO: mutation encoding and size estimation improvements

2017-10-30 Thread jkff
This closes #4014: [BEAM-1542] SpannerIO: mutation encoding and size estimation 
improvements


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/09f68159
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/09f68159
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/09f68159

Branch: refs/heads/master
Commit: 09f68159df28b2c83a6be1643984c0e28507989b
Parents: aa26f4b 3ba9600
Author: Eugene Kirpichov 
Authored: Mon Oct 30 17:02:42 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon Oct 30 17:02:42 2017 -0700

--
 .../io/gcp/spanner/MutationGroupEncoder.java| 660 +++
 .../io/gcp/spanner/MutationSizeEstimator.java   |  48 ++
 .../gcp/spanner/MutationGroupEncoderTest.java   | 636 ++
 3 files changed, 1344 insertions(+)
--




[jira] [Closed] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-2979.
--
   Resolution: Fixed
Fix Version/s: (was: 2.3.0)
   2.2.0

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
> Fix For: 2.2.0
>
>
> getWatermark() looks like this:
> {noformat}
> @Override
> public Instant getWatermark() {
>   if (curRecord == null) {
> LOG.debug("{}: getWatermark() : no records have been read yet.", 
> name);
> return initialWatermark;
>   }
>   return source.spec.getWatermarkFn() != null
>   ? source.spec.getWatermarkFn().apply(curRecord) : curTimestamp;
> }
> {noformat}
> advance() has code in it that looks like this:
> {noformat}
>   curRecord = null; // user coders below might throw.
>   // apply user deserializers.
>   // TODO: write records that can't be deserialized to a 
> "dead-letter" additional output.
>   KafkaRecord record = new KafkaRecord(
>   rawRecord.topic(),
>   rawRecord.partition(),
>   rawRecord.offset(),
>   consumerSpEL.getRecordTimestamp(rawRecord),
>   keyDeserializerInstance.deserialize(rawRecord.topic(), 
> rawRecord.key()),
>   valueDeserializerInstance.deserialize(rawRecord.topic(), 
> rawRecord.value()));
>   curTimestamp = (source.spec.getTimestampFn() == null)
>   ? Instant.now() : source.spec.getTimestampFn().apply(record);
>   curRecord = record;
> {noformat}
> There's a race condition between these two blocks of code which is exposed at 
> the very least in the FlinkRunner, which calls getWatermark() periodically 
> from a timer.
> The symptom of the race condition is a stack trace that looks like this (SDK 
> 2.0.0):
> {noformat}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:910)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: TimerException{java.lang.NullPointerException}
>   at 
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:220)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:568)
>   at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:565)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1210)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.onProcessingTime(UnboundedSourceWrapper.java:431)
>   at 
> 
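The check-then-act race quoted above (the null check in `getWatermark()` versus the `curRecord = null` reset in `advance()`) is the classic pattern fixed by reading the shared field once into a local variable. A minimal sketch of that fix — hypothetical class and field names, not the actual Beam patch:

```java
// Sketch (hypothetical names): snapshotting the mutable field into a local
// removes the window where another thread can null it between the check and
// the use. The field is volatile so the reader thread sees the latest write.
class WatermarkHolder {
  private volatile String curRecord;            // set by advance(), read by getWatermark()
  private final String initialWatermark = "initial";

  void advance(String record) {
    curRecord = null;                           // mirrors "curRecord = null; // user coders below might throw"
    curRecord = record;
  }

  String getWatermark() {
    String snapshot = curRecord;                // read the field exactly once
    if (snapshot == null) {
      return initialWatermark;
    }
    return snapshot.toUpperCase();              // safe: snapshot cannot become null here
  }
}
```

Checking `curRecord == null` and then dereferencing `curRecord` again, as the quoted code does, performs two independent reads and can NPE if `advance()` runs in between.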

[jira] [Commented] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226010#comment-16226010
 ] 

Raghu Angadi commented on BEAM-2979:


It was marked a blocker by mistake. 

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
> Fix For: 2.3.0
>
>

[jira] [Updated] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-2979:
---
Fix Version/s: (was: 2.2.0)
   2.3.0

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
> Fix For: 2.3.0
>
>

[jira] [Updated] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-2979:
---
Priority: Major  (was: Blocker)

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
> Fix For: 2.3.0
>
>

[jira] [Commented] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225994#comment-16225994
 ] 

ASF GitHub Bot commented on BEAM-2979:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3985


> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
>Priority: Blocker
> Fix For: 2.2.0
>
>
> getWatermark() looks like this:
> {noformat}
> @Override
> public Instant getWatermark() {
>   if (curRecord == null) {
> LOG.debug("{}: getWatermark() : no records have been read yet.", 
> name);
> return initialWatermark;
>   }
>   return source.spec.getWatermarkFn() != null
>   ? source.spec.getWatermarkFn().apply(curRecord) : curTimestamp;
> }
> {noformat}
> advance() has code in it that looks like this:
> {noformat}
>   curRecord = null; // user coders below might throw.
>   // apply user deserializers.
>   // TODO: write records that can't be deserialized to a 
> "dead-letter" additional output.
>   KafkaRecord record = new KafkaRecord(
>   rawRecord.topic(),
>   rawRecord.partition(),
>   rawRecord.offset(),
>   consumerSpEL.getRecordTimestamp(rawRecord),
>   keyDeserializerInstance.deserialize(rawRecord.topic(), 
> rawRecord.key()),
>   valueDeserializerInstance.deserialize(rawRecord.topic(), 
> rawRecord.value()));
>   curTimestamp = (source.spec.getTimestampFn() == null)
>   ? Instant.now() : source.spec.getTimestampFn().apply(record);
>   curRecord = record;
> {noformat}
> There's a race condition between these two blocks of code which is exposed at 
> the very least in the FlinkRunner, which calls getWatermark() periodically 
> from a timer.
> The symptom of the race condition is a stack trace that looks like this (SDK 
> 2.0.0):
> {noformat}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:910)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: TimerException{java.lang.NullPointerException}
>   at 
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:220)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:568)
>   at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:565)
>   at 
> org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1210)
>   at 
> 

[GitHub] beam pull request #3985: [BEAM-2979] Fix a race condition in getWatermark() ...

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3985


---


[2/2] beam git commit: This closes #3985

2017-10-30 Thread chamikara
This closes #3985


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/aa26f4bf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/aa26f4bf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/aa26f4bf

Branch: refs/heads/master
Commit: aa26f4bf71d72989da82927bd1565e5722a1f8d8
Parents: 16b9d58 56b512f
Author: chamik...@google.com 
Authored: Mon Oct 30 16:40:54 2017 -0700
Committer: chamik...@google.com 
Committed: Mon Oct 30 16:40:54 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java  | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)
--




[1/2] beam git commit: [BEAM-2979] Fix a race condition in getWatermark() in KafkaIO.

2017-10-30 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 16b9d584c -> aa26f4bf7


[BEAM-2979] Fix a race condition in getWatermark() in KafkaIO.

Don't set curRecord to null before updating. If user deserializers throw, ok to 
keep curRecord pointing to old one.
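The fix described in this commit message can be sketched as follows. This is a simplified, hypothetical reader, not the actual KafkaIO class: the names `SketchReader`, `advance`, and the `Function`-based deserializer are illustrative stand-ins. The point is the ordering: decode first, and only publish to `curRecord` after decoding succeeds, so a concurrent `getWatermark()` never observes a transient null.

```java
// Minimal sketch of the fix, under assumed names (not Beam's real API):
// advance() leaves 'curRecord' untouched when the user deserializer throws,
// so getWatermark() called from another thread never sees it nulled out.
import java.time.Instant;
import java.util.function.Function;

class SketchReader {
    private volatile String curRecord;   // last successfully decoded record
    private volatile Instant curTimestamp;
    private final Instant initialWatermark = Instant.EPOCH;

    // 'deserializer' stands in for the user-supplied key/value deserializers.
    boolean advance(byte[] raw, Function<byte[], String> deserializer) {
        // Decode before touching any state; if this throws, the exception
        // propagates and curRecord still points at the previous record.
        String record = deserializer.apply(raw);
        curTimestamp = Instant.now();
        curRecord = record;              // publish only after decoding succeeded
        return true;
    }

    Instant getWatermark() {
        // Mirrors the structure of the quoted getWatermark(): fall back to the
        // initial watermark until at least one record has been decoded.
        return (curRecord == null) ? initialWatermark : curTimestamp;
    }

    String current() {
        return curRecord;
    }
}
```

With the buggy ordering (`curRecord = null` before decoding), a timer thread calling `getWatermark()` between the null-out and the reassignment could dereference a null record; with this ordering the field is never null once a record has been read.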


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/56b512f9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/56b512f9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/56b512f9

Branch: refs/heads/master
Commit: 56b512f9242f17a804f2e8d9adca49c771863e53
Parents: 16b9d58
Author: Raghu Angadi 
Authored: Wed Oct 11 15:04:28 2017 -0700
Committer: chamik...@google.com 
Committed: Mon Oct 30 15:49:12 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java  | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/56b512f9/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
--
diff --git 
a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java 
b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
index af73a8d..17e0e34 100644
--- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
+++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
@@ -1255,11 +1255,10 @@ public class KafkaIO {
 offsetGap = 0;
   }
 
-  curRecord = null; // user coders below might throw.
-
-  // apply user deserializers.
+  // Apply user deserializers. User deserializers might throw, which 
will be propagated up
+  // and 'curRecord' remains unchanged. The runner should close this 
reader.
   // TODO: write records that can't be deserialized to a "dead-letter" 
additional output.
-  KafkaRecord record = new KafkaRecord(
+  KafkaRecord record = new KafkaRecord<>(
   rawRecord.topic(),
   rawRecord.partition(),
   rawRecord.offset(),
@@ -1372,7 +1371,6 @@ public class KafkaIO {
   return curTimestamp;
 }
 
-
 @Override
 public long getSplitBacklogBytes() {
   long backlogBytes = 0;



[jira] [Created] (BEAM-3122) WriteFiles windowed writes don't work with session windows

2017-10-30 Thread Eugene Kirpichov (JIRA)
Eugene Kirpichov created BEAM-3122:
--

 Summary: WriteFiles windowed writes don't work with session windows
 Key: BEAM-3122
 URL: https://issues.apache.org/jira/browse/BEAM-3122
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Eugene Kirpichov
Assignee: Reuven Lax


See 
https://stackoverflow.com/questions/46983318/writing-via-textio-write-with-sessions-windowing-raises-groupbykey-consumption-e




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3121) Dockerized jekyll server fails

2017-10-30 Thread Henning Rohde (JIRA)
Henning Rohde created BEAM-3121:
---

 Summary: Dockerized jekyll server fails
 Key: BEAM-3121
 URL: https://issues.apache.org/jira/browse/BEAM-3121
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Henning Rohde
Assignee: Reuven Lax


The dockerized jekyll script for building the website fails with the following 
error:

$ ./run_with_docker.sh server
Unable to find image 'jekyll/jekyll:latest' locally
latest: Pulling from jekyll/jekyll
b56ae66c2937: Pull complete 
337580f4eddd: Pull complete 
f8edd684e995: Pull complete 
536ad31f0662: Pull complete 
120aeb540421: Pull complete 
Digest: sha256:61956fbaf76be2fc23cd4908abbe1bfce6541dcdd9d9b3658b53d80b25a363a4
Status: Downloaded newer image for jekyll/jekyll:latest
The following gems are missing
 * rake (11.3.0)
 * i18n (0.7.0)
 * json (1.8.3)
 * minitest (5.9.1)
 * thread_safe (0.3.5)
 * tzinfo (1.2.2)
 * activesupport (4.2.7.1)
 * addressable (2.4.0)
 * colored (1.2)
 * ffi (1.9.14)
 * ethon (0.9.1)
 * mini_portile2 (2.1.0)
 * nokogiri (1.6.8.1)
 * parallel (1.9.0)
 * yell (2.0.6)
 * html-proofer (3.3.1)
 * sass (3.4.22)
 * jekyll-sass-converter (1.4.0)
 * rb-fsevent (0.9.7)
 * rb-inotify (0.9.7)
 * kramdown (1.12.0)
 * liquid (3.0.6)
 * pathutil (0.14.0)
 * rouge (1.11.1)
 * jekyll (3.2.0)
 * jekyll-redirect-from (0.11.0)
 * jekyll_github_sample (0.3.0)
Install missing gems with `bundle install`
Fetching gem metadata from https://rubygems.org/.
Fetching version metadata from https://rubygems.org/..
Fetching dependency metadata from https://rubygems.org/.
Fetching rake 11.3.0
Installing rake 11.3.0
Fetching i18n 0.7.0
Installing i18n 0.7.0
Fetching json 1.8.3
Installing json 1.8.3 with native extensions
Fetching minitest 5.9.1
Installing minitest 5.9.1
Fetching thread_safe 0.3.5
Installing thread_safe 0.3.5
Fetching addressable 2.4.0
Installing addressable 2.4.0
Using bundler 1.15.4
Fetching colorator 1.1.0
Installing colorator 1.1.0
Fetching colored 1.2
Installing colored 1.2
Fetching ffi 1.9.14
Installing ffi 1.9.14 with native extensions
Fetching forwardable-extended 2.6.0
Installing forwardable-extended 2.6.0
Fetching mercenary 0.3.6
Installing mercenary 0.3.6
Fetching mini_portile2 2.1.0
Installing mini_portile2 2.1.0
Fetching parallel 1.9.0
Installing parallel 1.9.0
Fetching yell 2.0.6
Installing yell 2.0.6
Fetching sass 3.4.22
Installing sass 3.4.22
Fetching rb-fsevent 0.9.7
Installing rb-fsevent 0.9.7
Fetching kramdown 1.12.0
Installing kramdown 1.12.0
Fetching liquid 3.0.6
Installing liquid 3.0.6
Fetching rouge 1.11.1
Installing rouge 1.11.1
Fetching safe_yaml 1.0.4
Installing safe_yaml 1.0.4
Gem::Ext::BuildError: ERROR: Failed to build gem native extension.

current directory: 
/srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
/usr/local/bin/ruby -r ./siteconf20171030-31-1kumuua.rb extconf.rb
creating Makefile

current directory: 
/srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
make "DESTDIR=" clean

current directory: 
/srv/jekyll/vendor/bundle/gems/json-1.8.3/ext/json/ext/generator
make "DESTDIR="
compiling generator.c
generator.c: In function 'generate_json':
generator.c:861:25: error: 'rb_cFixnum' undeclared (first use in this function)
 } else if (klass == rb_cFixnum) {
 ^~
generator.c:861:25: note: each undeclared identifier is reported only once for 
each function it appears in
generator.c:863:25: error: 'rb_cBignum' undeclared (first use in this function)
 } else if (klass == rb_cBignum) {
 ^~
generator.c: At top level:
cc1: warning: unrecognized command line option '-Wno-self-assign'
cc1: warning: unrecognized command line option '-Wno-constant-logical-operand'
cc1: warning: unrecognized command line option '-Wno-parentheses-equality'
make: *** [Makefile:242: generator.o] Error 1

make failed, exit code 2

Gem files will remain installed in /srv/jekyll/vendor/bundle/gems/json-1.8.3 
for inspection.
Results logged to 
/srv/jekyll/vendor/bundle/extensions/x86_64-linux/2.4.0/json-1.8.3/gem_make.out

An error occurred while installing json (1.8.3), and Bundler cannot continue.
Make sure that `gem install json -v '1.8.3'` succeeds before bundling.

In Gemfile:
  html-proofer was resolved to 3.3.1, which depends on
activesupport was resolved to 4.2.7.1, which depends on
  json




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225831#comment-16225831
 ] 

Raghu Angadi commented on BEAM-2979:


I think it is good to have. There is no workaround for Flink users that set a 
watermark function.

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
>Priority: Blocker
> Fix For: 2.2.0
>

[jira] [Updated] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-2979:
---
Fix Version/s: 2.2.0

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
>Priority: Blocker
> Fix For: 2.2.0
>
>

[jira] [Commented] (BEAM-3064) Update dataflow runner containers for the release branch

2017-10-30 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225818#comment-16225818
 ] 

Valentyn Tymofieiev commented on BEAM-3064:
---

All PRs are merged, taking this back to build containers.

> Update dataflow runner containers for the release branch
> 
>
> Key: BEAM-3064
> URL: https://issues.apache.org/jira/browse/BEAM-3064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Blocked by:
> https://github.com/apache/beam/pull/3970
> https://github.com/apache/beam/pull/3941 - cp into release branch.
> cc: [~reuvenlax] [~robertwb] [~tvalentyn]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225817#comment-16225817
 ] 

Raghu Angadi commented on BEAM-2979:


No, Cham is waiting on Jenkins before merging. I cleared `Fix Version`.
> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
>Priority: Blocker
>

[jira] [Assigned] (BEAM-3064) Update dataflow runner containers for the release branch

2017-10-30 Thread Valentyn Tymofieiev (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-3064:
-

Assignee: Valentyn Tymofieiev  (was: Reuven Lax)

> Update dataflow runner containers for the release branch
> 
>
> Key: BEAM-3064
> URL: https://issues.apache.org/jira/browse/BEAM-3064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Blocked by:
> https://github.com/apache/beam/pull/3970
> https://github.com/apache/beam/pull/3941 - cp into release branch.
> cc: [~reuvenlax] [~robertwb] [~tvalentyn]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-2979:
---
Fix Version/s: (was: 2.2.0)

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
>Priority: Blocker
>

[GitHub] beam pull request #4059: [BEAM-3064] Update container version for python Dat...

2017-10-30 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/4059

[BEAM-3064] Update container version for python DataflowRunner for the 
2.2.0 release

R: @tvalentyn @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam depcon

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4059.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4059


commit 371868018ba21c19a2e5ed3658b370b2e3f3a23f
Author: Ahmet Altay 
Date:   2017-10-30T21:48:42Z

Update container version for python DataflowRunner for the 2.2.0 release




---


[jira] [Commented] (BEAM-3064) Update dataflow runner containers for the release branch

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225802#comment-16225802
 ] 

ASF GitHub Bot commented on BEAM-3064:
--

GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/4059

[BEAM-3064] Update container version for python DataflowRunner for the 
2.2.0 release

R: @tvalentyn @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam depcon

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4059.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4059


commit 371868018ba21c19a2e5ed3658b370b2e3f3a23f
Author: Ahmet Altay 
Date:   2017-10-30T21:48:42Z

Update container version for python DataflowRunner for the 2.2.0 release




> Update dataflow runner containers for the release branch
> 
>
> Key: BEAM-3064
> URL: https://issues.apache.org/jira/browse/BEAM-3064
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Reuven Lax
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Blocked by:
> https://github.com/apache/beam/pull/3970
> https://github.com/apache/beam/pull/3941 - cp into release branch.
> cc: [~reuvenlax] [~robertwb] [~tvalentyn]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4055: [BEAM-3108] Align names with those produced by the ...

2017-10-30 Thread aaltay
Github user aaltay closed the pull request at:

https://github.com/apache/beam/pull/4055


---


[jira] [Commented] (BEAM-3108) Align names with those produced by the dataflow runner harness

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225785#comment-16225785
 ] 

ASF GitHub Bot commented on BEAM-3108:
--

Github user aaltay closed the pull request at:

https://github.com/apache/beam/pull/4055


> Align names with those produced by the dataflow runner harness
> --
>
> Key: BEAM-3108
> URL: https://issues.apache.org/jira/browse/BEAM-3108
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
> Fix For: 2.2.0
>
>
> Merge https://github.com/apache/beam/pull/3941 to the release branch





[jira] [Resolved] (BEAM-2829) Add ability to set job labels in DataflowPipelineOptions

2017-10-30 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-2829.
---
Resolution: Fixed

> Add ability to set job labels in DataflowPipelineOptions
> 
>
> Key: BEAM-2829
> URL: https://issues.apache.org/jira/browse/BEAM-2829
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Zongwei Zhou
>Assignee: Ahmet Altay
>Priority: Minor
> Fix For: 2.2.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Enable setting job labels --labels in DataflowPipelineOptions (earlier 
> Dataflow SDK 1.x supports this)
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java





[GitHub] beam pull request #3993: [BEAM-2829] Add an option for dataflow job labels.

2017-10-30 Thread aaltay
Github user aaltay closed the pull request at:

https://github.com/apache/beam/pull/3993


---


[jira] [Commented] (BEAM-2829) Add ability to set job labels in DataflowPipelineOptions

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225784#comment-16225784
 ] 

ASF GitHub Bot commented on BEAM-2829:
--

Github user aaltay closed the pull request at:

https://github.com/apache/beam/pull/3993


> Add ability to set job labels in DataflowPipelineOptions
> 
>
> Key: BEAM-2829
> URL: https://issues.apache.org/jira/browse/BEAM-2829
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Zongwei Zhou
>Assignee: Ahmet Altay
>Priority: Minor
> Fix For: 2.2.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Enable setting job labels --labels in DataflowPipelineOptions (earlier 
> Dataflow SDK 1.x supports this)
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java





[2/3] beam git commit: Add an option for dataflow job labels.

2017-10-30 Thread altay
Add an option for dataflow job labels.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/48ae7d1d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/48ae7d1d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/48ae7d1d

Branch: refs/heads/release-2.2.0
Commit: 48ae7d1d8243c6a3a037ad454162376f98dec3ab
Parents: 6db1db7
Author: Ahmet Altay 
Authored: Thu Oct 12 19:17:28 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 14:42:12 2017 -0700

--
 sdks/python/apache_beam/options/pipeline_options.py  |  7 +++
 .../apache_beam/runners/dataflow/internal/apiclient.py   | 11 +++
 2 files changed, 18 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/48ae7d1d/sdks/python/apache_beam/options/pipeline_options.py
--
diff --git a/sdks/python/apache_beam/options/pipeline_options.py 
b/sdks/python/apache_beam/options/pipeline_options.py
index 3abcbf2..37703fe 100644
--- a/sdks/python/apache_beam/options/pipeline_options.py
+++ b/sdks/python/apache_beam/options/pipeline_options.py
@@ -374,6 +374,13 @@ class GoogleCloudOptions(PipelineOptions):
     parser.add_argument('--template_location',
                         default=None,
                         help='Save job to specified local or GCS location.')
+    parser.add_argument(
+        '--label', '--labels',
+        dest='labels',
+        action='append',
+        default=None,
+        help='Labels that will be applied to this Dataflow job. Labels are key '
+             'value pairs separated by = (e.g. --label key=value).')
 
   def validate(self, validator):
     errors = []

http://git-wip-us.apache.org/repos/asf/beam/blob/48ae7d1d/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py 
b/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
index e48b58c..eec598a 100644
--- a/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
+++ b/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py
@@ -363,6 +363,17 @@ class Job(object):
       self.proto.type = dataflow.Job.TypeValueValuesEnum.JOB_TYPE_STREAMING
     else:
       self.proto.type = dataflow.Job.TypeValueValuesEnum.JOB_TYPE_BATCH
+
+    # Labels.
+    if self.google_cloud_options.labels:
+      self.proto.labels = dataflow.Job.LabelsValue()
+      for label in self.google_cloud_options.labels:
+        parts = label.split('=', 1)
+        key = parts[0]
+        value = parts[1] if len(parts) > 1 else ''
+        self.proto.labels.additionalProperties.append(
+            dataflow.Job.LabelsValue.AdditionalProperty(key=key, value=value))
+
     self.base64_str_re = re.compile(r'^[A-Za-z0-9+/]*=*$')
     self.coder_str_re = re.compile(r'^([A-Za-z]+\$)([A-Za-z0-9+/]*=*)$')
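For reference, the `key=value` split logic added in this hunk can be exercised on its own. This is a minimal standalone sketch; the helper name `parse_label` is ours for illustration, not part of Beam:

```python
def parse_label(label):
    # Split only on the first '=', mirroring label.split('=', 1) in the diff;
    # a bare key with no '=' gets an empty string as its value.
    parts = label.split('=', 1)
    key = parts[0]
    value = parts[1] if len(parts) > 1 else ''
    return key, value

print(parse_label('key1=value1'))  # ('key1', 'value1')
print(parse_label('key2'))         # ('key2', '')
print(parse_label('k=v=extra'))    # ('k', 'v=extra')
```

Splitting with maxsplit=1 is what lets label values themselves contain '=' characters.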
 



[1/3] beam git commit: Unit test for label pipeline option

2017-10-30 Thread altay
Repository: beam
Updated Branches:
  refs/heads/release-2.2.0 6db1db7b4 -> c6b6668a5


Unit test for label pipeline option


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6f07e3f9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6f07e3f9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6f07e3f9

Branch: refs/heads/release-2.2.0
Commit: 6f07e3f9e735768e57ba7b902d8a80b3285e9a93
Parents: 48ae7d1
Author: Ahmet Altay 
Authored: Fri Oct 13 15:53:15 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 14:42:12 2017 -0700

--
 .../runners/dataflow/internal/apiclient_test.py | 28 
 1 file changed, 28 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6f07e3f9/sdks/python/apache_beam/runners/dataflow/internal/apiclient_test.py
--
diff --git 
a/sdks/python/apache_beam/runners/dataflow/internal/apiclient_test.py 
b/sdks/python/apache_beam/runners/dataflow/internal/apiclient_test.py
index ecd6003..79cbd1c 100644
--- a/sdks/python/apache_beam/runners/dataflow/internal/apiclient_test.py
+++ b/sdks/python/apache_beam/runners/dataflow/internal/apiclient_test.py
@@ -195,6 +195,34 @@ class UtilTest(unittest.TestCase):
     for experiment in env.proto.experiments:
       self.assertNotIn('runner_harness_container_image=', experiment)
 
+  def test_labels(self):
+    pipeline_options = PipelineOptions(
+        ['--project', 'test_project', '--job_name', 'test_job_name',
+         '--temp_location', 'gs://test-location/temp'])
+    job = apiclient.Job(pipeline_options)
+    self.assertIsNone(job.proto.labels)
+
+    pipeline_options = PipelineOptions(
+        ['--project', 'test_project', '--job_name', 'test_job_name',
+         '--temp_location', 'gs://test-location/temp',
+         '--label', 'key1=value1',
+         '--label', 'key2',
+         '--label', 'key3=value3',
+         '--labels', 'key4=value4',
+         '--labels', 'key5'])
+    job = apiclient.Job(pipeline_options)
+    self.assertEqual(5, len(job.proto.labels.additionalProperties))
+    self.assertEqual('key1', job.proto.labels.additionalProperties[0].key)
+    self.assertEqual('value1', job.proto.labels.additionalProperties[0].value)
+    self.assertEqual('key2', job.proto.labels.additionalProperties[1].key)
+    self.assertEqual('', job.proto.labels.additionalProperties[1].value)
+    self.assertEqual('key3', job.proto.labels.additionalProperties[2].key)
+    self.assertEqual('value3', job.proto.labels.additionalProperties[2].value)
+    self.assertEqual('key4', job.proto.labels.additionalProperties[3].key)
+    self.assertEqual('value4', job.proto.labels.additionalProperties[3].value)
+    self.assertEqual('key5', job.proto.labels.additionalProperties[4].key)
+    self.assertEqual('', job.proto.labels.additionalProperties[4].value)
+
 
 if __name__ == '__main__':
   unittest.main()



[3/3] beam git commit: This closes #3993

2017-10-30 Thread altay
This closes #3993


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c6b6668a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c6b6668a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c6b6668a

Branch: refs/heads/release-2.2.0
Commit: c6b6668a5239a791bf96c0828c305e1482a081fd
Parents: 6db1db7 6f07e3f
Author: Ahmet Altay 
Authored: Mon Oct 30 14:42:29 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 14:42:29 2017 -0700

--
 .../apache_beam/options/pipeline_options.py |  7 +
 .../runners/dataflow/internal/apiclient.py  | 11 
 .../runners/dataflow/internal/apiclient_test.py | 28 
 3 files changed, 46 insertions(+)
--




[jira] [Resolved] (BEAM-3108) Align names with those produced by the dataflow runner harness

2017-10-30 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-3108.
---
Resolution: Fixed

> Align names with those produced by the dataflow runner harness
> --
>
> Key: BEAM-3108
> URL: https://issues.apache.org/jira/browse/BEAM-3108
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
> Fix For: 2.2.0
>
>
> Merge https://github.com/apache/beam/pull/3941 to the release branch





[3/3] beam git commit: This closes #4055

2017-10-30 Thread altay
This closes #4055


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6db1db7b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6db1db7b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6db1db7b

Branch: refs/heads/release-2.2.0
Commit: 6db1db7b467525a53e9b205e3d477c43f480253e
Parents: c2e0306 9b063eb
Author: Ahmet Altay 
Authored: Mon Oct 30 14:40:05 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 14:40:05 2017 -0700

--
 .../python/apache_beam/runners/worker/bundle_processor.py | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
--




[2/3] beam git commit: Align names with those produced by the dataflow runner harness.

2017-10-30 Thread altay
Align names with those produced by the dataflow runner harness.

These will be unused once the runner harness produces the correct
transform payloads.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a375b2e1
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a375b2e1
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a375b2e1

Branch: refs/heads/release-2.2.0
Commit: a375b2e1a498dbae827ae2398523283aacb51827
Parents: c2e0306
Author: Robert Bradshaw 
Authored: Wed Oct 4 13:57:01 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 14:39:58 2017 -0700

--
 sdks/python/apache_beam/runners/worker/bundle_processor.py | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a375b2e1/sdks/python/apache_beam/runners/worker/bundle_processor.py
--
diff --git a/sdks/python/apache_beam/runners/worker/bundle_processor.py 
b/sdks/python/apache_beam/runners/worker/bundle_processor.py
index b69d002..69e4ade 100644
--- a/sdks/python/apache_beam/runners/worker/bundle_processor.py
+++ b/sdks/python/apache_beam/runners/worker/bundle_processor.py
@@ -58,8 +58,8 @@ IDENTITY_DOFN_URN = 'urn:org.apache.beam:dofn:identity:0.1'
 PYTHON_ITERABLE_VIEWFN_URN = 'urn:org.apache.beam:viewfn:iterable:python:0.1'
 PYTHON_CODER_URN = 'urn:org.apache.beam:coder:python:0.1'
 # TODO(vikasrk): Fix this once runner sends appropriate python urns.
-PYTHON_DOFN_URN = 'urn:org.apache.beam:dofn:java:0.1'
-PYTHON_SOURCE_URN = 'urn:org.apache.beam:source:java:0.1'
+OLD_DATAFLOW_RUNNER_HARNESS_PARDO_URN = 'urn:beam:dofn:javasdk:0.1'
+OLD_DATAFLOW_RUNNER_HARNESS_READ_URN = 'urn:org.apache.beam:source:java:0.1'
 
 
 def side_input_tag(transform_id, tag):
@@ -358,7 +358,7 @@ def create(factory, transform_id, transform_proto, grpc_port, consumers):
       data_channel=factory.data_channel_factory.create_data_channel(grpc_port))
 
 
-@BeamTransformFactory.register_urn(PYTHON_SOURCE_URN, None)
+@BeamTransformFactory.register_urn(OLD_DATAFLOW_RUNNER_HARNESS_READ_URN, None)
 def create(factory, transform_id, transform_proto, parameter, consumers):
   # The Dataflow runner harness strips the base64 encoding.
   source = pickler.loads(base64.b64encode(parameter))
@@ -393,7 +393,7 @@ def create(factory, transform_id, transform_proto, parameter, consumers):
       consumers)
 
 
-@BeamTransformFactory.register_urn(PYTHON_DOFN_URN, None)
+@BeamTransformFactory.register_urn(OLD_DATAFLOW_RUNNER_HARNESS_PARDO_URN, None)
 def create(factory, transform_id, transform_proto, parameter, consumers):
   dofn_data = pickler.loads(parameter)
   if len(dofn_data) == 2:
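The `@BeamTransformFactory.register_urn` decorators in the hunks above follow a URN-keyed registry pattern: the runner names a transform by URN, and the harness looks up the matching creation function. A minimal standalone sketch of that pattern (the class name and the URN here are illustrative, not Beam's):

```python
class TransformFactory:
    # Registry mapping a URN string to the function that builds the transform.
    _registry = {}

    @classmethod
    def register_urn(cls, urn):
        # Decorator factory: register fn under the given URN, return fn unchanged.
        def wrapper(fn):
            cls._registry[urn] = fn
            return fn
        return wrapper

    @classmethod
    def create(cls, urn, payload):
        # Dispatch to the registered creation function for this URN.
        return cls._registry[urn](payload)


@TransformFactory.register_urn('urn:example:identity:0.1')  # hypothetical URN
def _create_identity(payload):
    return ('identity', payload)


print(TransformFactory.create('urn:example:identity:0.1', b'data'))
# ('identity', b'data')
```

Renaming a URN constant, as this commit does, only changes the registry key — the creation functions themselves are untouched.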



[1/3] beam git commit: Fix from any -> bytes transition.

2017-10-30 Thread altay
Repository: beam
Updated Branches:
  refs/heads/release-2.2.0 c2e030639 -> 6db1db7b4


Fix from any -> bytes transition.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9b063eb3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9b063eb3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9b063eb3

Branch: refs/heads/release-2.2.0
Commit: 9b063eb3b056c4f3d199c0c24c5f2857d84247fa
Parents: a375b2e
Author: Robert Bradshaw 
Authored: Wed Oct 4 17:33:07 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 14:39:58 2017 -0700

--
 sdks/python/apache_beam/runners/worker/bundle_processor.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9b063eb3/sdks/python/apache_beam/runners/worker/bundle_processor.py
--
diff --git a/sdks/python/apache_beam/runners/worker/bundle_processor.py 
b/sdks/python/apache_beam/runners/worker/bundle_processor.py
index 69e4ade..fb8befb 100644
--- a/sdks/python/apache_beam/runners/worker/bundle_processor.py
+++ b/sdks/python/apache_beam/runners/worker/bundle_processor.py
@@ -401,7 +401,7 @@ def create(factory, transform_id, transform_proto, parameter, consumers):
     serialized_fn, side_input_data = dofn_data
   else:
     # No side input data.
-    serialized_fn, side_input_data = parameter.value, []
+    serialized_fn, side_input_data = parameter, []
   return _create_pardo_operation(
       factory, transform_id, transform_proto, consumers,
       serialized_fn, side_input_data)



[jira] [Assigned] (BEAM-2929) Dataflow support for portable side input

2017-10-30 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-2929:
---

Assignee: Luke Cwik  (was: Thomas Groh)

> Dataflow support for portable side input
> 
>
> Key: BEAM-2929
> URL: https://issues.apache.org/jira/browse/BEAM-2929
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>  Labels: portability
>






[jira] [Reopened] (BEAM-2863) Add support for Side Inputs over the Fn API

2017-10-30 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reopened BEAM-2863:
-

Wrong issue resolved.

> Add support for Side Inputs over the Fn API
> ---
>
> Key: BEAM-2863
> URL: https://issues.apache.org/jira/browse/BEAM-2863
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>  Labels: portability
> Fix For: 2.3.0
>
>
> See:
> * https://s.apache.org/beam-side-inputs-1-pager
> * http://s.apache.org/beam-fn-api-state-api





[jira] [Resolved] (BEAM-2566) Java SDK harness should not depend on any runner

2017-10-30 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2566.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> Java SDK harness should not depend on any runner
> 
>
> Key: BEAM-2566
> URL: https://issues.apache.org/jira/browse/BEAM-2566
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Luke Cwik
>  Labels: portability
> Fix For: 2.3.0
>
>
> Right now there is a dependency on the Dataflow runner. I believe this is 
> legacy due to using {{CloudObject}} temporarily but I do not claim to 
> understand the full nature of the dependency.





[jira] [Resolved] (BEAM-2863) Add support for Side Inputs over the Fn API

2017-10-30 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2863.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> Add support for Side Inputs over the Fn API
> ---
>
> Key: BEAM-2863
> URL: https://issues.apache.org/jira/browse/BEAM-2863
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>  Labels: portability
> Fix For: 2.3.0
>
>
> See:
> * https://s.apache.org/beam-side-inputs-1-pager
> * http://s.apache.org/beam-fn-api-state-api





[jira] [Commented] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-10-30 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225730#comment-16225730
 ] 

Reuven Lax commented on BEAM-2979:
--

Do we need this for 2.2.0?

> Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and 
> KafkaIO.UnboundedKafkaReader.advance()
> -
>
> Key: BEAM-2979
> URL: https://issues.apache.org/jira/browse/BEAM-2979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wesley Tanaka
>Assignee: Raghu Angadi
>Priority: Blocker
> Fix For: 2.2.0
>
>
> getWatermark() looks like this:
> {noformat}
> @Override
> public Instant getWatermark() {
>   if (curRecord == null) {
>     LOG.debug("{}: getWatermark() : no records have been read yet.", name);
>     return initialWatermark;
>   }
>   return source.spec.getWatermarkFn() != null
>       ? source.spec.getWatermarkFn().apply(curRecord) : curTimestamp;
> }
> {noformat}
> advance() has code in it that looks like this:
> {noformat}
>   curRecord = null; // user coders below might throw.
>   // apply user deserializers.
>   // TODO: write records that can't be deserialized to a "dead-letter" additional output.
>   KafkaRecord<K, V> record = new KafkaRecord<K, V>(
>       rawRecord.topic(),
>       rawRecord.partition(),
>       rawRecord.offset(),
>       consumerSpEL.getRecordTimestamp(rawRecord),
>       keyDeserializerInstance.deserialize(rawRecord.topic(), rawRecord.key()),
>       valueDeserializerInstance.deserialize(rawRecord.topic(), rawRecord.value()));
>   curTimestamp = (source.spec.getTimestampFn() == null)
>       ? Instant.now() : source.spec.getTimestampFn().apply(record);
>   curRecord = record;
> {noformat}
> There's a race condition between these two blocks of code which is exposed at the very least in the FlinkRunner, which calls getWatermark() periodically from a timer.
> The symptom of the race condition is a stack trace that looks like this (SDK 2.0.0):
> {noformat}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:910)
>   at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
>   at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
>   at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: TimerException{java.lang.NullPointerException}
>   at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:220)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:568)
>   at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:565)
>   at org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1210)
>   at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.onProcessingTime(UnboundedSourceWrapper.java:431)
>   at 
> 
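The check-then-use pattern quoted in the issue description can be reproduced outside Beam. Below is a minimal Python sketch of the race and of the usual remedy of reading the shared field exactly once into a local; the names are illustrative, and this is not the actual KafkaIO patch:

```python
class Reader:
    def __init__(self):
        # Shared between the reader thread (advance) and the timer thread
        # (getWatermark) in the Flink runner.
        self.cur_record = None

    def advance(self, record):
        self.cur_record = None  # user decoders below might throw
        # ... deserialize the raw record here ...
        self.cur_record = record

    def get_watermark_racy(self, watermark_fn):
        if self.cur_record is None:
            return 'initial-watermark'
        # advance() may reset cur_record to None between the check above and
        # the use below, handing watermark_fn a null -> the NPE in the trace.
        return watermark_fn(self.cur_record)

    def get_watermark_safe(self, watermark_fn):
        record = self.cur_record  # read the shared field exactly once
        if record is None:
            return 'initial-watermark'
        return watermark_fn(record)
```

With the single read, the watermark function always sees the same non-null snapshot that passed the null check, regardless of what `advance()` does concurrently.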

[jira] [Updated] (BEAM-3116) Dataflow templates broken whenever any worker pool metadata property is set

2017-10-30 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3116:
--
Labels: portability  (was: )

> Dataflow templates broken whenever any worker pool metadata property is set
> ---
>
> Key: BEAM-3116
> URL: https://issues.apache.org/jira/browse/BEAM-3116
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
>  Labels: portability
> Fix For: 2.3.0
>
>






[jira] [Commented] (BEAM-2994) Refactor TikaIO

2017-10-30 Thread Sergey Beryozkin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225703#comment-16225703
 ] 

Sergey Beryozkin commented on BEAM-2994:


Thanks for merging this PR

> Refactor TikaIO
> ---
>
> Key: BEAM-2994
> URL: https://issues.apache.org/jira/browse/BEAM-2994
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Affects Versions: 2.2.0
>Reporter: Sergey Beryozkin
>Assignee: Sergey Beryozkin
> Fix For: 2.3.0
>
>
> TikaIO is currently implemented as a BoundedSource and asynchronous 
> BoundedReader returning individual document's text chunks as Strings, 
> eventually passed unordered (and not linked to the original documents) to the 
> pipeline functions.
> It was decided in the recent beam-dev thread that initially TikaIO should 
> support the cases where only a single composite bean per file, capturing the 
> file content, location (or name) and metadata, should flow to the pipeline, 
> and thus avoiding the need to implement TikaIO as a BoundedSource/Reader.
> Enhancing  TikaIO to support the streaming of the content into the pipelines 
> may be considered in the next phase, based on the specific use-cases... 





[jira] [Commented] (BEAM-3107) Python Fnapi based workloads failing

2017-10-30 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225672#comment-16225672
 ] 

Valentyn Tymofieiev commented on BEAM-3107:
---

Filed https://issues.apache.org/jira/browse/BEAM-3120 to investigate

> Python Fnapi based workloads failing
> 
>
> Key: BEAM-3107
> URL: https://issues.apache.org/jira/browse/BEAM-3107
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>  Labels: portability
> Fix For: 2.2.0
>
>
> Python post commits are failing because the runner harness is not compatible 
> with the sdk harness.
> We need a new runner harness compatible with: 
> https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863





[jira] [Created] (BEAM-3120) Jenkins postcommit test suite triggered for a release branch uses master branch.

2017-10-30 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-3120:
-

 Summary: Jenkins postcommit test suite triggered for a release 
branch uses master branch.
 Key: BEAM-3120
 URL: https://issues.apache.org/jira/browse/BEAM-3120
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Valentyn Tymofieiev
Assignee: Jason Kuster


From https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/3452/ we can see the build was triggered on Revision: 576d22a67ffeb29adf26f8adc09aaa6078099cf6 origin/release-2.2.0. However, looking at test logs we can see that the 2.3.0.dev version of the Python SDK is being installed and tested:

# Tox runs unit tests in a virtual environment
${LOCAL_PATH}/tox -e ALL -c sdks/python/tox.ini
GLOB sdist-make: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/setup.py
docs create: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/target/.tox/docs
docs installdeps: nose==1.3.7, grpcio-tools==1.3.5, Sphinx==1.5.5, sphinx_rtd_theme==0.2.4
docs inst: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/target/.tox/dist/apache-beam-2.3.0.dev.zip
docs installed: alabaster==0.7.10,apache-beam==2.3.0.dev0,avro==1.8.2,Babel==2.5.1,certifi==2017.7.27.1,chardet==3.0.4,crcmod==1.7,dill==0.2.6,docutils==0.14,enum34==1.1.6,funcsigs==1.0.2,futures==3.1.1,grpcio==1.7.0,grpcio-tools==1.3.5,httplib2==0.9.2,idna==2.6,imagesize==0.7.1,Jinja2==2.9.6,MarkupSafe==1.0,mock==2.0.0,nose==1.3.7,oauth2client==3.0.0,pbr==3.1.1,protobuf==3.3.0,pyasn1==0.3.7,pyasn1-modules==0.1.5,Pygments==2.2.0,pytz==2017.3,PyYAML==3.12,requests==2.18.4,rsa==3.4.2,six==1.10.0,snowballstemmer==1.2.1,Sphinx==1.5.5,sphinx-rtd-theme==0.2.4,typing==3.6.2,urllib3==1.22

Looking further.
cc [~altay]





Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #5136

2017-10-30 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #505

2017-10-30 Thread Apache Jenkins Server
See 


Changes:

[iemejia] [BEAM-3111] Upgrade Elasticsearch to 5.6.3 and clean pom

[iemejia] [BEAM-3112] Improve error messages in ElasticsearchIO test utils

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam6 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)
Commit message: "This closes #4051"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c
 > git rev-list 5fb30ec8265c841cd8c4e6ae16b43be1f171eabb # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1748115182508259856.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins82666732461114425.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7783035849946361920.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7465928593733378747.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
Requirement already satisfied: avro<2.0.0,>=1.8.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: crcmod<2.0,>=1.7 in 
/usr/lib/python2.7/dist-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: dill==0.2.6 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: grpcio<2.0,>=1.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark #3387

2017-10-30 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam8 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)
java.lang.NoClassDefFoundError: Could not initialize class 
jenkins.model.Jenkins$MasterComputer
at 
org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:896)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:870)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:829)
at hudson.remoting.UserRequest.perform(UserRequest.java:181)
at hudson.remoting.UserRequest.perform(UserRequest.java:52)
at hudson.remoting.Request$2.run(Request.java:336)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
at ..remote call to beam8(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1554)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:281)
at hudson.remoting.Channel.call(Channel.java:839)
Caused: java.io.IOException: Remote call on beam8 failed
at hudson.remoting.Channel.call(Channel.java:847)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
at com.sun.proxy.$Proxy108.withRepository(Unknown Source)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl.withRepository(RemoteGitImpl.java:235)
at hudson.plugins.git.GitSCM.printCommitMessageToLog(GitSCM.java:1195)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1159)
at hudson.scm.SCM.checkout(SCM.java:495)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1212)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:566)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:491)
at hudson.model.Run.execute(Run.java:1737)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:419)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex #2704

2017-10-30 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam8 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)
java.lang.NoClassDefFoundError: Could not initialize class 
jenkins.model.Jenkins$MasterComputer
at 
org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:896)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:870)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:829)
at hudson.remoting.UserRequest.perform(UserRequest.java:181)
at hudson.remoting.UserRequest.perform(UserRequest.java:52)
at hudson.remoting.Request$2.run(Request.java:336)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
at ..remote call to beam8(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1554)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:281)
at hudson.remoting.Channel.call(Channel.java:839)
Caused: java.io.IOException: Remote call on beam8 failed
at hudson.remoting.Channel.call(Channel.java:847)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
at com.sun.proxy.$Proxy108.withRepository(Unknown Source)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl.withRepository(RemoteGitImpl.java:235)
at hudson.plugins.git.GitSCM.printCommitMessageToLog(GitSCM.java:1195)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1159)
at hudson.scm.SCM.checkout(SCM.java:495)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1212)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:566)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:491)
at hudson.model.Run.execute(Run.java:1737)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:419)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)

[jira] [Closed] (BEAM-307) Upgrade/Test to Kafka 0.10

2017-10-30 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin closed BEAM-307.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Upgrade/Test to Kafka 0.10
> --
>
> Key: BEAM-307
> URL: https://issues.apache.org/jira/browse/BEAM-307
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Xu Mingmin
>  Labels: backward-incompatible
> Fix For: Not applicable
>
>
> I'm going to test at least that KafkaIO works fine with Kafka 0.10 (I'm 
> preparing new complete samples around that).





Build failed in Jenkins: beam_PostCommit_Python_Verify #3452

2017-10-30 Thread Apache Jenkins Server
See 


--
[...truncated 932.54 KB...]
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Resolved] (BEAM-3109) Add an element batching transform

2017-10-30 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-3109.
---
Resolution: Fixed

> Add an element batching transform
> -
>
> Key: BEAM-3109
> URL: https://issues.apache.org/jira/browse/BEAM-3109
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
> Fix For: 2.2.0
>
>
> Merge https://github.com/apache/beam/pull/3971 to the release branch





[jira] [Commented] (BEAM-3109) Add an element batching transform

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225369#comment-16225369
 ] 

ASF GitHub Bot commented on BEAM-3109:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4054


> Add an element batching transform
> -
>
> Key: BEAM-3109
> URL: https://issues.apache.org/jira/browse/BEAM-3109
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
> Fix For: 2.2.0
>
>
> Merge https://github.com/apache/beam/pull/3971 to the release branch





[GitHub] beam pull request #4054: [BEAM-3109] Add an element batching transform.

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4054


---


[1/2] beam git commit: Add an element batching transform.

2017-10-30 Thread altay
Repository: beam
Updated Branches:
  refs/heads/release-2.2.0 576d22a67 -> c2e030639


Add an element batching transform.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e474a22d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e474a22d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e474a22d

Branch: refs/heads/release-2.2.0
Commit: e474a22d2d23b8978a9d64934e94d0e357d14153
Parents: 576d22a
Author: Robert Bradshaw 
Authored: Mon Oct 9 16:46:19 2017 -0700
Committer: Ahmet Altay 
Committed: Fri Oct 27 14:19:46 2017 -0700

--
 sdks/python/apache_beam/transforms/util.py  | 260 +++
 sdks/python/apache_beam/transforms/util_test.py | 108 
 2 files changed, 368 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e474a22d/sdks/python/apache_beam/transforms/util.py
--
diff --git a/sdks/python/apache_beam/transforms/util.py 
b/sdks/python/apache_beam/transforms/util.py
index 81b8c22..bf59795 100644
--- a/sdks/python/apache_beam/transforms/util.py
+++ b/sdks/python/apache_beam/transforms/util.py
@@ -20,14 +20,25 @@
 
 from __future__ import absolute_import
 
+import collections
+import contextlib
+import time
+
+from apache_beam import typehints
+from apache_beam.metrics import Metrics
+from apache_beam.transforms import window
 from apache_beam.transforms.core import CombinePerKey
+from apache_beam.transforms.core import DoFn
 from apache_beam.transforms.core import Flatten
 from apache_beam.transforms.core import GroupByKey
 from apache_beam.transforms.core import Map
+from apache_beam.transforms.core import ParDo
 from apache_beam.transforms.ptransform import PTransform
 from apache_beam.transforms.ptransform import ptransform_fn
+from apache_beam.utils import windowed_value
 
 __all__ = [
+'BatchElements',
 'CoGroupByKey',
 'Keys',
 'KvSwap',
@@ -36,6 +47,9 @@ __all__ = [
 ]
 
 
+T = typehints.TypeVariable('T')
+
+
 class CoGroupByKey(PTransform):
   """Groups results across several PCollections by key.
 
@@ -161,3 +175,249 @@ def RemoveDuplicates(pcoll):  # pylint: disable=invalid-name
           | 'ToPairs' >> Map(lambda v: (v, None))
           | 'Group' >> CombinePerKey(lambda vs: None)
           | 'RemoveDuplicates' >> Keys())
+
+
+class _BatchSizeEstimator(object):
+  """Estimates the best size for batches given historical timing.
+  """
+
+  _MAX_DATA_POINTS = 100
+  _MAX_GROWTH_FACTOR = 2
+
+  def __init__(self,
+               min_batch_size=1,
+               max_batch_size=1000,
+               target_batch_overhead=.1,
+               target_batch_duration_secs=1,
+               clock=time.time):
+    if min_batch_size > max_batch_size:
+      raise ValueError("Minimum (%s) must not be greater than maximum (%s)" % (
+          min_batch_size, max_batch_size))
+    if target_batch_overhead and not 0 < target_batch_overhead <= 1:
+      raise ValueError("target_batch_overhead (%s) must be between 0 and 1" % (
+          target_batch_overhead))
+    if target_batch_duration_secs and target_batch_duration_secs <= 0:
+      raise ValueError("target_batch_duration_secs (%s) must be positive" % (
+          target_batch_duration_secs))
+    if max(0, target_batch_overhead, target_batch_duration_secs) == 0:
+      raise ValueError("At least one of target_batch_overhead or "
+                       "target_batch_duration_secs must be positive.")
+    self._min_batch_size = min_batch_size
+    self._max_batch_size = max_batch_size
+    self._target_batch_overhead = target_batch_overhead
+    self._target_batch_duration_secs = target_batch_duration_secs
+    self._clock = clock
+    self._data = []
+    self._ignore_next_timing = False
+    self._size_distribution = Metrics.distribution(
+        'BatchElements', 'batch_size')
+    self._time_distribution = Metrics.distribution(
+        'BatchElements', 'msec_per_batch')
+    # Beam distributions only accept integer values, so we use this to
+    # accumulate under-reported values until they add up to whole milliseconds.
+    # (Milliseconds are chosen because that's conventionally used elsewhere in
+    # profiling-style counters.)
+    self._remainder_msecs = 0
+
+  def ignore_next_timing(self):
+    """Call to indicate the next timing should be ignored.
+
+    For example, the first emit of a ParDo operation is known to be anomalous
+    due to setup that may occur.
+    """
+    self._ignore_next_timing = True
+
+  @contextlib.contextmanager
+  def record_time(self, batch_size):
+    start = self._clock()
+    yield
+    elapsed = self._clock() - start
+    elapsed_msec = 1e3 * elapsed + self._remainder_msecs
+    self._size_distribution.update(batch_size)
+
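
The _remainder_msecs bookkeeping in record_time above (carrying fractional
milliseconds forward until they add up to a whole one, since Beam
distributions only accept integers) can be illustrated with a minimal,
self-contained sketch. The names here (BatchTimer, reported) are hypothetical
and not part of the Beam implementation; the fake clock makes the demo
deterministic:

```python
import contextlib

class BatchTimer(object):
  """Hypothetical sketch of the remainder-msec accumulation idea:
  only whole milliseconds are reported per batch, and the fractional
  part is carried into the next batch so nothing is lost overall."""

  def __init__(self, clock):
    self._clock = clock          # injectable clock, as in the Beam code
    self._remainder_msecs = 0.0
    self.reported = []           # whole-msec values reported so far

  @contextlib.contextmanager
  def record_time(self):
    start = self._clock()
    yield
    elapsed_msec = 1e3 * (self._clock() - start) + self._remainder_msecs
    whole = int(elapsed_msec)    # report only the whole-millisecond part
    self._remainder_msecs = elapsed_msec - whole
    self.reported.append(whole)

# Deterministic demo: a fake clock advancing 2**-10 s (~0.977 ms) per call,
# chosen as an exact binary fraction so the arithmetic is reproducible.
ticks = iter(n * 2.0 ** -10 for n in range(100))
timer = BatchTimer(clock=lambda: next(ticks))
for _ in range(5):
  with timer.record_time():
    pass                         # the "batch" being timed

# No single batch lasts a full millisecond, but the carried remainder
# ensures whole milliseconds are eventually reported: [0, 1, 1, 1, 1].
print(timer.reported)
```

The sum of the reported values plus the outstanding remainder always equals
the true elapsed milliseconds, which is the point of the accumulation.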

[2/2] beam git commit: This closes #4054

2017-10-30 Thread altay
This closes #4054


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c2e03063
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c2e03063
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c2e03063

Branch: refs/heads/release-2.2.0
Commit: c2e030639cf75bc98ab49a50c10759561fdbaa03
Parents: 576d22a e474a22
Author: Ahmet Altay 
Authored: Mon Oct 30 10:25:54 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Oct 30 10:25:54 2017 -0700

--
 sdks/python/apache_beam/transforms/util.py  | 260 +++
 sdks/python/apache_beam/transforms/util_test.py | 108 
 2 files changed, 368 insertions(+)
--




[jira] [Assigned] (BEAM-307) Upgrade/Test to Kafka 0.10

2017-10-30 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi reassigned BEAM-307:
-

Assignee: Xu Mingmin  (was: Jean-Baptiste Onofré)

> Upgrade/Test to Kafka 0.10
> --
>
> Key: BEAM-307
> URL: https://issues.apache.org/jira/browse/BEAM-307
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Xu Mingmin
>  Labels: backward-incompatible
>
> I'm going to test at least that KafkaIO works fine with Kafka 0.10 (I'm 
> preparing new complete samples around that).





[jira] [Commented] (BEAM-3107) Python Fnapi based workloads failing

2017-10-30 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225329#comment-16225329
 ] 

Valentyn Tymofieiev commented on BEAM-3107:
---

It seems that postcommit didn't actually run on the branch, since the SDK 
version being used is 2.3.0.dev.

> Python Fnapi based workloads failing
> 
>
> Key: BEAM-3107
> URL: https://issues.apache.org/jira/browse/BEAM-3107
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Affects Versions: 2.2.0
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>  Labels: portability
> Fix For: 2.2.0
>
>
> Python post commits are failing because the runner harness is not compatible 
> with the sdk harness.
> We need a new runner harness compatible with: 
> https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863





Build failed in Jenkins: beam_PostCommit_Python_Verify #3451

2017-10-30 Thread Apache Jenkins Server
See 


Changes:

[iemejia] [BEAM-3111] Upgrade Elasticsearch to 5.6.3 and clean pom

[iemejia] [BEAM-3112] Improve error messages in ElasticsearchIO test utils

--
[...truncated 930.03 KB...]
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py 

Build failed in Jenkins: beam_PostCommit_Python_Verify #3450

2017-10-30 Thread Apache Jenkins Server
See 


--
[...truncated 931.77 KB...]
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py -> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py -> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py -> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Resolved] (BEAM-3112) improve logs in ElasticsearchIO test utils

2017-10-30 Thread Etienne Chauchot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Etienne Chauchot resolved BEAM-3112.

   Resolution: Fixed
Fix Version/s: 2.3.0

> improve logs in ElasticsearchIO test utils
> --
>
> Key: BEAM-3112
> URL: https://issues.apache.org/jira/browse/BEAM-3112
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark #3386

2017-10-30 Thread Apache Jenkins Server
See 


--
Started by GitHub push by asfgit
[EnvInject] - Loading node environment variables.
Building remotely on beam8 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/* +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)
java.lang.NoClassDefFoundError: Could not initialize class jenkins.model.Jenkins$MasterComputer
    at org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
    at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:896)
    at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:870)
    at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:829)
    at hudson.remoting.UserRequest.perform(UserRequest.java:181)
    at hudson.remoting.UserRequest.perform(UserRequest.java:52)
    at hudson.remoting.Request$2.run(Request.java:336)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
    at ..remote call to beam8(Native Method)
    at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1554)
    at hudson.remoting.UserResponse.retrieve(UserRequest.java:281)
    at hudson.remoting.Channel.call(Channel.java:839)
Caused: java.io.IOException: Remote call on beam8 failed
    at hudson.remoting.Channel.call(Channel.java:847)
    at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
    at com.sun.proxy.$Proxy108.withRepository(Unknown Source)
    at org.jenkinsci.plugins.gitclient.RemoteGitImpl.withRepository(RemoteGitImpl.java:235)
    at hudson.plugins.git.GitSCM.printCommitMessageToLog(GitSCM.java:1195)
    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1159)
    at hudson.scm.SCM.checkout(SCM.java:495)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1212)
    at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:566)
    at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:491)
    at hudson.model.Run.execute(Run.java:1737)
    at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:419)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/* +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c (origin/master)
java.lang.NoClassDefFoundError: Could not initialize class jenkins.model.Jenkins$MasterComputer
    at org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
    at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
at 

[jira] [Commented] (BEAM-3112) improve logs in ElasticsearchIO test utils

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225163#comment-16225163 ] 

ASF GitHub Bot commented on BEAM-3112:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4051


> improve logs in ElasticsearchIO test utils
> --
>
> Key: BEAM-3112
> URL: https://issues.apache.org/jira/browse/BEAM-3112
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4051: [BEAM-3112] Improve error messages in Elasticsearch...

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4051


---


[2/2] beam git commit: This closes #4051

2017-10-30 Thread iemejia
This closes #4051


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/16b9d584
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/16b9d584
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/16b9d584

Branch: refs/heads/master
Commit: 16b9d584ccb19b7ffaf10677e7c246f0f9647e7c
Parents: 6b91eed 3807756
Author: Ismaël Mejía 
Authored: Mon Oct 30 16:36:35 2017 +0100
Committer: Ismaël Mejía 
Committed: Mon Oct 30 16:36:35 2017 +0100

--
 .../elasticsearch/ElasticSearchIOTestUtils.java |  8 +--
 .../sdk/io/elasticsearch/ElasticsearchIO.java   | 70 +++-
 2 files changed, 39 insertions(+), 39 deletions(-)
--




[1/2] beam git commit: [BEAM-3112] Improve error messages in ElasticsearchIO test utils

2017-10-30 Thread iemejia
Repository: beam
Updated Branches:
  refs/heads/master 6b91eed7e -> 16b9d584c


[BEAM-3112] Improve error messages in ElasticsearchIO test utils


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/38077564
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/38077564
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/38077564

Branch: refs/heads/master
Commit: 38077564496f2f7c2accda42a6c0f45f542ac694
Parents: 6b91eed
Author: Etienne Chauchot 
Authored: Mon Oct 16 14:33:38 2017 +0200
Committer: Ismaël Mejía 
Committed: Mon Oct 30 16:36:01 2017 +0100

--
 .../elasticsearch/ElasticSearchIOTestUtils.java |  8 +--
 .../sdk/io/elasticsearch/ElasticsearchIO.java   | 70 +++-
 2 files changed, 39 insertions(+), 39 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/38077564/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticSearchIOTestUtils.java
--
diff --git 
a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticSearchIOTestUtils.java
 
b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticSearchIOTestUtils.java
index 142789b..bbceb8d 100644
--- 
a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticSearchIOTestUtils.java
+++ 
b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticSearchIOTestUtils.java
@@ -73,12 +73,8 @@ class ElasticSearchIOTestUtils {
         new NStringEntity(bulkRequest.toString(), ContentType.APPLICATION_JSON);
     Response response = restClient.performRequest("POST", endPoint,
         Collections.singletonMap("refresh", "true"), requestBody);
-    JsonNode searchResult = ElasticsearchIO.parseResponse(response);
-    boolean errors = searchResult.path("errors").asBoolean();
-    if (errors){
-      throw new IOException(String.format("Failed to insert test documents in index %s",
-          connectionConfiguration.getIndex()));
-    }
+    ElasticsearchIO
+        .checkForErrors(response, ElasticsearchIO.getBackendVersion(connectionConfiguration));
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/beam/blob/38077564/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
--
diff --git 
a/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 
b/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
index 5eebe00..c0d0819 100644
--- 
a/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
+++ 
b/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
@@ -149,6 +149,41 @@ public class ElasticsearchIO {
     return mapper.readValue(response.getEntity().getContent(), JsonNode.class);
   }
 
+  static void checkForErrors(Response response, int backendVersion) throws IOException {
+    JsonNode searchResult = parseResponse(response);
+    boolean errors = searchResult.path("errors").asBoolean();
+    if (errors) {
+      StringBuilder errorMessages =
+          new StringBuilder(
+              "Error writing to Elasticsearch, some elements could not be inserted:");
+      JsonNode items = searchResult.path("items");
+      //some items present in bulk might have errors, concatenate error messages
+      for (JsonNode item : items) {
+        String errorRootName = "";
+        if (backendVersion == 2) {
+          errorRootName = "create";
+        } else if (backendVersion == 5) {
+          errorRootName = "index";
+        }
+        JsonNode errorRoot = item.path(errorRootName);
+        JsonNode error = errorRoot.get("error");
+        if (error != null) {
+          String type = error.path("type").asText();
+          String reason = error.path("reason").asText();
+          String docId = errorRoot.path("_id").asText();
+          errorMessages.append(String.format("%nDocument id %s: %s (%s)", docId, reason, type));
+          JsonNode causedBy = error.get("caused_by");
+          if (causedBy != null) {
+            String cbReason = causedBy.path("reason").asText();
+            String cbType = causedBy.path("type").asText();
+            errorMessages.append(String.format("%nCaused by: %s (%s)", cbReason, cbType));
+          }
+        }
+      }
+      throw new IOException(errorMessages.toString());
+    }
+  }
+
   

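The error-collection logic that the checkForErrors diff above adds can be sketched outside Java as well. The following is a minimal Python sketch (not part of the commit; the field names `errors`, `items`, `error`, `caused_by`, `_id` follow the Elasticsearch bulk API response format, and the version-to-root mapping mirrors the Java code):

```python
import json


def check_for_errors(bulk_response_body: str, backend_version: int) -> None:
    """Mimic ElasticsearchIO.checkForErrors: raise an error listing every failed item."""
    result = json.loads(bulk_response_body)
    # The bulk response sets a top-level "errors" flag when any item failed.
    if not result.get("errors", False):
        return
    messages = ["Error writing to Elasticsearch, some elements could not be inserted:"]
    # ES 2.x reports per-item status under "create", ES 5.x under "index".
    root = "create" if backend_version == 2 else "index"
    for item in result.get("items", []):
        error_root = item.get(root, {})
        error = error_root.get("error")
        if error:
            messages.append(
                "Document id %s: %s (%s)"
                % (error_root.get("_id"), error.get("reason"), error.get("type")))
            caused_by = error.get("caused_by")
            if caused_by:
                messages.append(
                    "Caused by: %s (%s)"
                    % (caused_by.get("reason"), caused_by.get("type")))
    raise IOError("\n".join(messages))
```

Concatenating every per-item failure (instead of raising on the first one) is the point of the change: a single bulk request can partially succeed, and the aggregated message shows exactly which documents were rejected and why.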
[jira] [Resolved] (BEAM-3111) Upgrade ElasticsearchIO elastic dependences to 5.6.3

2017-10-30 Thread Etienne Chauchot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Etienne Chauchot resolved BEAM-3111.

   Resolution: Fixed
Fix Version/s: 2.3.0

> Upgrade ElasticsearchIO elastic dependences to 5.6.3
> 
>
> Key: BEAM-3111
> URL: https://issues.apache.org/jira/browse/BEAM-3111
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
> Fix For: 2.3.0
>
>
> Now that ES 6.0.RC1 is out, it is time to upgrade deps to the latest 5.x



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark #3385

2017-10-30 Thread Apache Jenkins Server
See 


--
Started by GitHub push by asfgit
[EnvInject] - Loading node environment variables.
Building remotely on beam8 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/* +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 6b91eed7e52bb86b1d90aa47b754ee7f3dff3ef6 (origin/master)
java.lang.NoClassDefFoundError: Could not initialize class jenkins.model.Jenkins$MasterComputer
    at org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
    at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:896)
    at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:870)
    at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:829)
    at hudson.remoting.UserRequest.perform(UserRequest.java:181)
    at hudson.remoting.UserRequest.perform(UserRequest.java:52)
    at hudson.remoting.Request$2.run(Request.java:336)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
    at ..remote call to beam8(Native Method)
    at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1554)
    at hudson.remoting.UserResponse.retrieve(UserRequest.java:281)
    at hudson.remoting.Channel.call(Channel.java:839)
Caused: java.io.IOException: Remote call on beam8 failed
    at hudson.remoting.Channel.call(Channel.java:847)
    at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
    at com.sun.proxy.$Proxy108.withRepository(Unknown Source)
    at org.jenkinsci.plugins.gitclient.RemoteGitImpl.withRepository(RemoteGitImpl.java:235)
    at hudson.plugins.git.GitSCM.printCommitMessageToLog(GitSCM.java:1195)
    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1159)
    at hudson.scm.SCM.checkout(SCM.java:495)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1212)
    at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:566)
    at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:491)
    at hudson.model.Run.execute(Run.java:1737)
    at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:543)
    at hudson.model.ResourceController.execute(ResourceController.java:97)
    at hudson.model.Executor.run(Executor.java:419)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/* +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 6b91eed7e52bb86b1d90aa47b754ee7f3dff3ef6 (origin/master)
java.lang.NoClassDefFoundError: Could not initialize class jenkins.model.Jenkins$MasterComputer
    at org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.withRepository(AbstractGitAPIImpl.java:29)
    at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.withRepository(CliGitAPIImpl.java:71)
    at sun.reflect.GeneratedMethodAccessor73.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
at 

[jira] [Commented] (BEAM-3111) Upgrade ElasticsearchIO elastic dependences to 5.6.3

2017-10-30 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225142#comment-16225142 ] 

ASF GitHub Bot commented on BEAM-3111:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4050


> Upgrade ElasticsearchIO elastic dependences to 5.6.3
> 
>
> Key: BEAM-3111
> URL: https://issues.apache.org/jira/browse/BEAM-3111
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> Now that ES 6.0.RC1 is out, it is time to upgrade deps to the latest 5.x



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4050: [BEAM-3111] Upgrade Elasticsearch to 5.6.3 and clea...

2017-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4050


---


[2/2] beam git commit: This closes #4050

2017-10-30 Thread iemejia
This closes #4050


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6b91eed7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6b91eed7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6b91eed7

Branch: refs/heads/master
Commit: 6b91eed7e52bb86b1d90aa47b754ee7f3dff3ef6
Parents: 5fb30ec f6b4045
Author: Ismaël Mejía 
Authored: Mon Oct 30 16:30:22 2017 +0100
Committer: Ismaël Mejía 
Committed: Mon Oct 30 16:30:22 2017 +0100

--
 sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml   | 2 +-
 .../apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java| 1 +
 .../io/elasticsearch-tests/elasticsearch-tests-common/pom.xml| 1 -
 sdks/java/io/elasticsearch-tests/pom.xml | 4 ++--
 sdks/java/io/elasticsearch/pom.xml   | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)
--




[1/2] beam git commit: [BEAM-3111] Upgrade Elasticsearch to 5.6.3 and clean pom

2017-10-30 Thread iemejia
Repository: beam
Updated Branches:
  refs/heads/master 5fb30ec82 -> 6b91eed7e


[BEAM-3111] Upgrade Elasticsearch to 5.6.3 and clean pom


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f6b40454
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f6b40454
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f6b40454

Branch: refs/heads/master
Commit: f6b40454795d20e45e6a4f53cc91d7a5b6224069
Parents: 5fb30ec
Author: Etienne Chauchot 
Authored: Tue Oct 24 16:41:25 2017 +0200
Committer: Ismaël Mejía 
Committed: Mon Oct 30 16:30:15 2017 +0100

--
 sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml   | 2 +-
 .../apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java| 1 +
 .../io/elasticsearch-tests/elasticsearch-tests-common/pom.xml| 1 -
 sdks/java/io/elasticsearch-tests/pom.xml | 4 ++--
 sdks/java/io/elasticsearch/pom.xml   | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f6b40454/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml
--
diff --git a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml 
b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml
index c7ea474..ba76316 100644
--- a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml
+++ b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/pom.xml
@@ -31,7 +31,7 @@
 Tests of ElasticsearchIO on Elasticsearch 5.x
 
 
-5.4.0
+5.6.3
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/f6b40454/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java
--
diff --git 
a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java
 
b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java
index 9a7eb07..92ad608 100644
--- 
a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java
+++ 
b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-5/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTest.java
@@ -68,6 +68,7 @@ public class ElasticsearchIOTest extends ESIntegTestCase implements Serializable
 
   @Override
   protected Settings nodeSettings(int nodeOrdinal) {
+    System.setProperty("es.set.netty.runtime.available.processors", "false");
     return Settings.builder().put(super.nodeSettings(nodeOrdinal))
         .put("http.enabled", "true")
         // had problems with some jdk, embedded ES was too slow for bulk insertion,

http://git-wip-us.apache.org/repos/asf/beam/blob/f6b40454/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/pom.xml
--
diff --git 
a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/pom.xml 
b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/pom.xml
index 6ac7fc1..b30764a 100644
--- a/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/pom.xml
+++ b/sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/pom.xml
@@ -32,7 +32,6 @@
 
 1.3.2
 2.6.2
-
5.0.0
 4.4.5
 
4.1.2
 
4.5.2

http://git-wip-us.apache.org/repos/asf/beam/blob/f6b40454/sdks/java/io/elasticsearch-tests/pom.xml
--
diff --git a/sdks/java/io/elasticsearch-tests/pom.xml 
b/sdks/java/io/elasticsearch-tests/pom.xml
index 43300f8..59ef454 100644
--- a/sdks/java/io/elasticsearch-tests/pom.xml
+++ b/sdks/java/io/elasticsearch-tests/pom.xml
@@ -35,7 +35,7 @@
 1.3.2
 4.1.0
 2.6.2
-
5.4.0
+
5.6.3
 
 
 
@@ -122,7 +122,7 @@
 
 
 org.elasticsearch.client
-rest
+elasticsearch-rest-client
 ${elasticsearch.client.rest.version}
 test
 

http://git-wip-us.apache.org/repos/asf/beam/blob/f6b40454/sdks/java/io/elasticsearch/pom.xml
--
diff --git a/sdks/java/io/elasticsearch/pom.xml 
b/sdks/java/io/elasticsearch/pom.xml
index 83d8c7e..37f3d11 100644
--- a/sdks/java/io/elasticsearch/pom.xml
+++ b/sdks/java/io/elasticsearch/pom.xml
@@ -31,7 +31,7 @@
 IO to read and write on Elasticsearch
 
 
-
5.4.0
+
5.6.3
 4.4.5
 
4.1.2
 
4.5.2
