[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=254222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254222
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 05/Jun/19 06:40
Start Date: 05/Jun/19 06:40
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254222)
Time Spent: 7h 20m  (was: 7h 10m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254207
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:56
Start Date: 05/Jun/19 05:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948924
 
 
   Run Dataflow ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254207)
Time Spent: 1h 40m  (was: 1.5h)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254209&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254209
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:56
Start Date: 05/Jun/19 05:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498949048
 
 
   Run Spark ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254209)
Time Spent: 2h  (was: 1h 50m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254208
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:56
Start Date: 05/Jun/19 05:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948565
 
 
   Run Dataflow PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254208)
Time Spent: 1h 50m  (was: 1h 40m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254206
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:55
Start Date: 05/Jun/19 05:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948815
 
 
   Run Flink ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254206)
Time Spent: 1.5h  (was: 1h 20m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254203
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:55
Start Date: 05/Jun/19 05:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948610
 
 
   Run Spark PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254203)
Time Spent: 1h 10m  (was: 1h)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254205
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:55
Start Date: 05/Jun/19 05:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948696
 
 
   Run Flink PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254205)
Time Spent: 1h 20m  (was: 1h 10m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254202&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254202
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:55
Start Date: 05/Jun/19 05:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948565
 
 
   Run Dataflow PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254202)
Time Spent: 1h  (was: 50m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254201&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254201
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:55
Start Date: 05/Jun/19 05:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948696
 
 
   Run Flink PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254201)
Time Spent: 50m  (was: 40m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254200&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254200
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:54
Start Date: 05/Jun/19 05:54
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948610
 
 
   Run Spark PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254200)
Time Spent: 40m  (was: 0.5h)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254198&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254198
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:54
Start Date: 05/Jun/19 05:54
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948503
 
 
   Run Java PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254198)
Time Spent: 20m  (was: 10m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254199
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:54
Start Date: 05/Jun/19 05:54
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8762: [BEAM-6620, 
BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762#issuecomment-498948565
 
 
   Run Data flow PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254199)
Time Spent: 0.5h  (was: 20m)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7463) Bigquery IO ITs are flaky: incorrect checksum

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7463?focusedWorklogId=254191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254191
 ]

ASF GitHub Bot logged work on BEAM-7463:


Author: ASF GitHub Bot
Created on: 05/Jun/19 05:00
Start Date: 05/Jun/19 05:00
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8751: [BEAM-7463] 
parallelize BQ IT tests
URL: https://github.com/apache/beam/pull/8751#issuecomment-498938989
 
 
   Hi @Juta, could you please comment which variables are being shared by test 
scenarios? This change should improve test parallelism in test modules that 
have multiple test cases, however I believe it does not remove side-effects in 
existing test scenarios: the tests you modify use test-case level fixtures 
(`setUp()`).  For every test method a new instance of TestCase is created, and 
setUp will be called on this instance.  
   
   From unittest docs: 
https://docs.python.org/3/library/unittest.html#class-and-module-fixtures
   
   >  A new TestCase instance is created as a unique test fixture used to 
execute each individual test method.
   
   This change would indeed take effect if we used module-level or class-level 
fixtures (e.g. `setUpClass()`): with `_multiprocesses_can_split_` module-level 
fixtures would be called multiple times. However I don't see usages of those 
types of fixtures in integration tests, and we should not use them since they 
can create side-effects. So it is not clear to me which variables are shared in 
the tests, that will not be shared with this change. If you think I am missing 
something, could you post a simplified code-snippet that demonstrates a 
potential race/side effect in existing tests? Thank you.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254191)
Time Spent: 40m  (was: 0.5h)

> Bigquery IO ITs are flaky: incorrect checksum
> -
>
> Key: BEAM-7463
> URL: https://issues.apache.org/jira/browse/BEAM-7463
> Project: Beam
>  Issue Type: Bug
>  Components: io-python-gcp
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {noformat}
> 15:03:38 FAIL: test_big_query_new_types 
> (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT)
> 15:03:38 
> --
> 15:03:38 Traceback (most recent call last):
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py",
>  line 211, in test_big_query_new_types
> 15:03:38 big_query_query_to_table_pipeline.run_bq_pipeline(options)
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py",
>  line 82, in run_bq_pipeline
> 15:03:38 result = p.run()
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py",
>  line 107, in run
> 15:03:38 else test_runner_api))
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 15:03:38 self._options).run(False)
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 15:03:38 return self.runner.run_pipeline(self, self._options)
> 15:03:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
>  line 51, in run_pipeline
> 15:03:38 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 15:03:38 AssertionError: 
> 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and 
> Expected checksum is 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214)
> 15:03:38  but: Expected checksum is 
> 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is 
> da39a3ee5e6b4b0d3255bfef95601890afd80709
> {noformat}
> [~Juta] could this be caused by changes to Bigquery matcher? 
> https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134
>  
> cc: [~pabloem] [~chamikara] [~apilloud]
> A recent postcommit run has BQ failures in other tests as well: 
> https://builds.apach

[jira] [Commented] (BEAM-7493) beam-sdks-testing-nexmark produces corrupt pom.xml

2019-06-04 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856347#comment-16856347
 ] 

Kenneth Knowles commented on BEAM-7493:
---

Do you have time to look at this?

> beam-sdks-testing-nexmark produces corrupt pom.xml
> --
>
> Key: BEAM-7493
> URL: https://issues.apache.org/jira/browse/BEAM-7493
> Project: Beam
>  Issue Type: Bug
>  Components: examples-nexmark
>Affects Versions: 2.13.0
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Blocker
>
> Steps to reproduce:
> {code}
> ./gradlew -Prelease -Ppublishing 
> :sdks:java:testing:nexmark:publishMavenJavaPublicationTotestPublicationLocalRepository
> {code}
> and if you inspect the pom you will see
> {code}
>
>   org.apache.beam
>   beam-sdks-java-testing-test-utils
> ...
> 
> It appears to be constructed from the directories, but this (and also 
> load-tests) in the {{testing}} directory have their archive base name 
> overridden: 
> https://github.com/apache/beam/blob/master/sdks/java/testing/test-utils/build.gradle#L20
> So after publication, this will result in a dependency not found error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7493) beam-sdks-testing-nexmark produces corrupt pom.xml

2019-06-04 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-7493:
-

 Summary: beam-sdks-testing-nexmark produces corrupt pom.xml
 Key: BEAM-7493
 URL: https://issues.apache.org/jira/browse/BEAM-7493
 Project: Beam
  Issue Type: Bug
  Components: examples-nexmark
Affects Versions: 2.13.0
Reporter: Kenneth Knowles
Assignee: Michael Luckey


Steps to reproduce:

{code}
./gradlew -Prelease -Ppublishing 
:sdks:java:testing:nexmark:publishMavenJavaPublicationTotestPublicationLocalRepository
{code}

and if you inspect the pom you will see

{code}
   
  org.apache.beam
  beam-sdks-java-testing-test-utils
...


It appears to be constructed from the directories, but this (and also 
load-tests) in the {{testing}} directory have their archive base name 
overridden: 
https://github.com/apache/beam/blob/master/sdks/java/testing/test-utils/build.gradle#L20

So after publication, this will result in a dependency not found error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7493) beam-sdks-testing-nexmark produces corrupt pom.xml

2019-06-04 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-7493:
--
Description: 
Steps to reproduce:

{code}
./gradlew -Prelease -Ppublishing 
:sdks:java:testing:nexmark:publishMavenJavaPublicationTotestPublicationLocalRepository
{code}

and if you inspect the pom you will see

{code}

   
  org.apache.beam
  beam-sdks-java-testing-test-utils
...

{code}

It appears to be constructed from the directories, but this (and also 
load-tests) in the {{testing}} directory have their archive base name 
overridden: 
https://github.com/apache/beam/blob/master/sdks/java/testing/test-utils/build.gradle#L20

So after publication, this will result in a dependency not found error.

  was:
Steps to reproduce:

{code}
./gradlew -Prelease -Ppublishing 
:sdks:java:testing:nexmark:publishMavenJavaPublicationTotestPublicationLocalRepository
{code}

and if you inspect the pom you will see

{code}
   
  org.apache.beam
  beam-sdks-java-testing-test-utils
...


It appears to be constructed from the directories, but this (and also 
load-tests) in the {{testing}} directory have their archive base name 
overridden: 
https://github.com/apache/beam/blob/master/sdks/java/testing/test-utils/build.gradle#L20

So after publication, this will result in a dependency not found error.


> beam-sdks-testing-nexmark produces corrupt pom.xml
> --
>
> Key: BEAM-7493
> URL: https://issues.apache.org/jira/browse/BEAM-7493
> Project: Beam
>  Issue Type: Bug
>  Components: examples-nexmark
>Affects Versions: 2.13.0
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Blocker
>
> Steps to reproduce:
> {code}
> ./gradlew -Prelease -Ppublishing 
> :sdks:java:testing:nexmark:publishMavenJavaPublicationTotestPublicationLocalRepository
> {code}
> and if you inspect the pom you will see
> {code}
>
>   org.apache.beam
>   beam-sdks-java-testing-test-utils
> ...
> {code}
> It appears to be constructed from the directories, but this (and also 
> load-tests) in the {{testing}} directory have their archive base name 
> overridden: 
> https://github.com/apache/beam/blob/master/sdks/java/testing/test-utils/build.gradle#L20
> So after publication, this will result in a dependency not found error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-529) Check immutability violations in DirectPipelineRunner

2019-06-04 Thread Innocent (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856338#comment-16856338
 ] 

Innocent commented on BEAM-529:
---

[~myffi...@gmail.com], I am trying to get as much as context on this issue as I 
can. I am curious about what your proposal was. I am sorry I do not always 
follow all discussions.

> Check immutability violations in DirectPipelineRunner
> -
>
> Key: BEAM-529
> URL: https://issues.apache.org/jira/browse/BEAM-529
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Priority: Minor
>  Labels: newbie, starter
>
> Users are going to mutate inputs and outputs of DoFn inappropriately. We 
> should help their tests fail to catch such mistakes. (Similar to the 
> DirectPipelineRunner in Java SDK)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7490) TfIdfTest should not be marked ValidatesRunner

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7490?focusedWorklogId=254127&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254127
 ]

ASF GitHub Bot logged work on BEAM-7490:


Author: ASF GitHub Bot
Created on: 05/Jun/19 02:14
Start Date: 05/Jun/19 02:14
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #8761: [BEAM-7490] 
TfIdfTest should be marked ValidatesRunner
URL: https://github.com/apache/beam/pull/8761#issuecomment-498912060
 
 
   Oh, yes, good catch. The commit also has the wrong description.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254127)
Time Spent: 0.5h  (was: 20m)

> TfIdfTest should not be marked ValidatesRunner
> --
>
> Key: BEAM-7490
> URL: https://issues.apache.org/jira/browse/BEAM-7490
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-529) Check immutability violations in DirectPipelineRunner

2019-06-04 Thread Yifan Mai (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856227#comment-16856227
 ] 

Yifan Mai commented on BEAM-529:


No, not planning to work on this in the near future. As discussed elsewhere, I 
am not confident that my proposal will actually work.

> Check immutability violations in DirectPipelineRunner
> -
>
> Key: BEAM-529
> URL: https://issues.apache.org/jira/browse/BEAM-529
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Priority: Minor
>  Labels: newbie, starter
>
> Users are going to mutate inputs and outputs of DoFn inappropriately. We 
> should help their tests fail to catch such mistakes. (Similar to the 
> DirectPipelineRunner in Java SDK)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7492) Add Spark runner to Go SDK

2019-06-04 Thread Kyle Weaver (JIRA)
Kyle Weaver created BEAM-7492:
-

 Summary: Add Spark runner to Go SDK
 Key: BEAM-7492
 URL: https://issues.apache.org/jira/browse/BEAM-7492
 Project: Beam
  Issue Type: New Feature
  Components: sdk-go
Reporter: Kyle Weaver
Assignee: Kyle Weaver


I imagine this should be as easy as copying [1] and s/flink/spark.

[1] 
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/flink/flink.go]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7491) Add Go SDK post commit test on the Spark runner

2019-06-04 Thread Kyle Weaver (JIRA)
Kyle Weaver created BEAM-7491:
-

 Summary: Add Go SDK post commit test on the Spark runner
 Key: BEAM-7491
 URL: https://issues.apache.org/jira/browse/BEAM-7491
 Project: Beam
  Issue Type: Test
  Components: sdk-go
Reporter: Kyle Weaver
Assignee: Kyle Weaver


We already did this for Flink (BEAM-6959), so should be pretty trivial to add 
Spark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=254113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254113
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 05/Jun/19 00:50
Start Date: 05/Jun/19 00:50
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-498896124
 
 
   Run Portable_Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254113)
Time Spent: 7h  (was: 6h 50m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=254114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254114
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 05/Jun/19 00:50
Start Date: 05/Jun/19 00:50
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-498896188
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254114)
Time Spent: 7h 10m  (was: 7h)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=254112&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254112
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 05/Jun/19 00:49
Start Date: 05/Jun/19 00:49
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8752: [BEAM-7351] increment 
ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498895993
 
 
   Thanks @tvalentyn @Juta 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254112)
Time Spent: 2h 10m  (was: 2h)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=254111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254111
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 05/Jun/19 00:49
Start Date: 05/Jun/19 00:49
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8752: [BEAM-7351] 
increment ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254111)
Time Spent: 2h  (was: 1h 50m)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http

[jira] [Work logged] (BEAM-6620) Do not relocate guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?focusedWorklogId=254103&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254103
 ]

ASF GitHub Bot logged work on BEAM-6620:


Author: ASF GitHub Bot
Created on: 04/Jun/19 23:52
Start Date: 04/Jun/19 23:52
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8762: [WIP] 
[BEAM-6620, BEAM-3606] Stop shading by default.
URL: https://github.com/apache/beam/pull/8762
 
 
   This stops shading guava by default in all modules that don't have a custom 
shading configuration which is all modules except for:
   * sdks/java/core
   * sdks/java/extensions/kryo
   * sdks/java/extensions/sql
   * sdks/java/extensions/sql/jdbc
   * sdks/java/harness
   * runners/spark/job-server
   * runners/direct-java
   * runners/samza/job-server
   * runners/google-cloud-dataflow-java/worker
   * runners/google-cloud-dataflow-java/worker/legacy-worker
   
   This migrates all modules except for the ones listed above so that:
   * publishing is based off of the unshaded jar and the unshaded test jar.
   * all intra module dependences use the compile (default) or testCompile 
configuration.
   * all dependencies within the module to use 
compile/testCompile/testRuntimeClasspath instead of 
shadow/shadowTest/shadowTestRuntimeClasspath
   
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=254098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254098
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 23:17
Start Date: 04/Jun/19 23:17
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8752: [BEAM-7351] 
increment ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498877913
 
 
   Kicked off another run of postcommits.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254098)
Time Spent: 1h 50m  (was: 1h 40m)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 go

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=254097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254097
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 23:17
Start Date: 04/Jun/19 23:17
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8752: [BEAM-7351] 
increment ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498877806
 
 
   We might not fix all of the tests in other PRs, but we may have more luck 
with another re-run. But as I said, the tests affected by this PR passed last 
time I looked into the logs, so I think it's safe to merge.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254097)
Time Spent: 1h 40m  (was: 1.5h)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=254096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254096
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 23:17
Start Date: 04/Jun/19 23:17
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8752: [BEAM-7351] 
increment ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498877821
 
 
   Run Python PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254096)
Time Spent: 1.5h  (was: 1h 20m)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.r

[jira] [Commented] (BEAM-7395) Check immutability violations in Dataflow Runner (as an option)

2019-06-04 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856186#comment-16856186
 ] 

Ahmet Altay commented on BEAM-7395:
---

Hi [~evindj] if you are planning to work on Beam's direct runner, that issue is 
tracked here: https://issues.apache.org/jira/browse/BEAM-529

There is no DirectPipelineRunner because that name was changed on previous 
refactorings. Code is here: 
https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/direct

The idea here is that, elements passed to the process method should not be 
mutated by process(). Direct runner can enforce this by serializing elements 
before process call and checking after the call. Doing this in distributed 
runners like Dataflow will have a significant cost, but we can potentially do 
it based on sampling (e.g. check on 1 every N elements.)

> Check immutability violations in Dataflow Runner (as an option)
> ---
>
> Key: BEAM-7395
> URL: https://issues.apache.org/jira/browse/BEAM-7395
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Innocent
>Priority: Minor
>  Labels: newbie, starter
>
> Users are going to mutate inputs and outputs of DoFn inappropriately. We 
> should help their tests fail to catch such mistakes. (Similar to the 
> DirectPipelineRunner in Java SDK) This should be offered as an option and 
> worked based on sampling because of the cost of these checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-529) Check immutability violations in DirectPipelineRunner

2019-06-04 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856185#comment-16856185
 ] 

Ahmet Altay commented on BEAM-529:
--

[~myffi...@gmail.com] are you planning to work on this?

> Check immutability violations in DirectPipelineRunner
> -
>
> Key: BEAM-529
> URL: https://issues.apache.org/jira/browse/BEAM-529
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Priority: Minor
>  Labels: newbie, starter
>
> Users are going to mutate inputs and outputs of DoFn inappropriately. We 
> should help their tests fail to catch such mistakes. (Similar to the 
> DirectPipelineRunner in Java SDK)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7357) Kinesis IO.write throws LimitExceededException

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7357?focusedWorklogId=254087&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254087
 ]

ASF GitHub Bot logged work on BEAM-7357:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:57
Start Date: 04/Jun/19 22:57
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #8758: [BEAM-7357] 
Remove stream existence check
URL: https://github.com/apache/beam/pull/8758
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254087)
Time Spent: 50m  (was: 40m)

> Kinesis IO.write throws LimitExceededException
> --
>
> Key: BEAM-7357
> URL: https://issues.apache.org/jira/browse/BEAM-7357
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kinesis
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Alexey Romanenko
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I used Kinesis IO to write to kinesis. I get very quickly many exceptions 
> like:
> [shard_map.cc:150] Shard map update for stream "***" failed. Code: 
> LimitExceededException Message: Rate exceeded for stream *** under account 
> ***; retrying in ..
> Also, I see many exceptions like:
> Caused by: java.lang.IllegalArgumentException: Stream ** does not exist at 
> org.apache.beam.vendor.guava.v20_0.com.google.common.base.Preconditions.checkArgument(Preconditions.java:191)
>  at 
> org.apache.beam.sdk.io.kinesis.KinesisIO$Write$KinesisWriterFn.setup(KinesisIO.java:515)
> I'm sure this stream exists because I can see some data from my pipeline that 
> was successfully ingested to it.
>  
> Here is my code:
>  
>  
> {code:java}
> .apply(KinesisIO.write()
>        .withStreamName("**")
>        .withPartitioner(new KinesisPartitioner() {
>                        @Override
>                         public String getPartitionKey(byte[] value) {
>                                         return UUID.randomUUID().toString()
>                          }
>                         @Override
>                         public String getExplicitHashKey(byte[] value) {
>                                         return null;
>                         }
>        })
>.withAWSClientsProvider("**","***",Regions.US_EAST_1));{code}
>  
> I tried to not use the Kinesis IO. and everything works well, I can't figure 
> out what went wrong.
> I tried using the same API as the library did.
>  
> {code:java}
> .apply(
>  ParDo.of(new DoFn() {
>  private transient IKinesisProducer inlineProducer;
>  @Setup
>  public void setup(){
>  KinesisProducerConfiguration config =   
> KinesisProducerConfiguration.fromProperties(new Properties());
>  config.setRegion(Regions.US_EAST_1.getName());
>  config.setCredentialsProvider(new AWSStaticCredentialsProvider(new 
> BasicAWSCredentials("***", "***")));
>  inlineProducer = new KinesisProducer(config);
>  }
>  @ProcessElement
>  public void processElement(ProcessContext c) throws Exception {
> ByteBuffer data = ByteBuffer.wrap(c.element());
> String partitionKey =UUID.randomUUID().toString();
> ListenableFuture f =
> getProducer().addUserRecord("***", partitionKey, data);
>Futures.addCallback(f, new UserRecordResultFutureCallback());
> }
>  class UserRecordResultFutureCallback implements 
> FutureCallback {
>  @Override
>  public void onFailure(Throwable cause) {
>throw new RuntimeException("failed produce:"+cause);
>  }
>  @Override
>  public void onSuccess(UserRecordResult result) {
>  }
>  }
>  })
>  );
>  
> {code}
>  
> Any idea what I did wrong? or what the error in the KinesisIO?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5148) Implement MongoDB IO for Python SDK

2019-06-04 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856179#comment-16856179
 ] 

Ahmet Altay commented on BEAM-5148:
---

Hi [~GeoloeG_IsT], we would like to complete this implementation in Beam. Would 
it be fine if [~yichi] works on this? Are you still planning to work on this in 
Beam? We could base the implementation in the beam extended repo you shared 
already.

> Implement MongoDB IO for Python SDK
> ---
>
> Key: BEAM-5148
> URL: https://issues.apache.org/jira/browse/BEAM-5148
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 3.0.0
>Reporter: Pascal Gula
>Assignee: Pascal Gula
>Priority: Major
> Fix For: Not applicable
>
>
> Currently Java SDK has MongoDB support but Python SDK does not. With current 
> portability efforts other runners may soon be able to use Python SDK. Having 
> mongoDB support will allow these runners to execute large scale jobs using it.
> Since we need this IO components @ Peat, we started working on a PyPi package 
> available at this repository: [https://github.com/PEAT-AI/beam-extended]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7388) Reify PTransform for Python SDK

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7388?focusedWorklogId=254076&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254076
 ]

ASF GitHub Bot logged work on BEAM-7388:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:37
Start Date: 04/Jun/19 22:37
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8717: [BEAM-7388] Reify 
PTransform for Python SDK
URL: https://github.com/apache/beam/pull/8717#issuecomment-498868730
 
 
   Thanks Tanay! These are great. Sorru about the delay.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254076)
Time Spent: 50m  (was: 40m)

> Reify PTransform for Python SDK
> ---
>
> Key: BEAM-7388
> URL: https://issues.apache.org/jira/browse/BEAM-7388
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.12.0
>Reporter: Tanay Tummalapalli
>Assignee: Tanay Tummalapalli
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The Java SDK has a nice [Reify 
> PTransform|https://beam.apache.org/releases/javadoc/2.4.0/org/apache/beam/sdk/transforms/Reify.html]
>  that can be used to add timestamp, window info to elements of a PCollection. 
> This makes adding timestamp, window info reusable and easy instead of 
> defining DoFns every time. 
> Also, this can make a pipeline look really neat. Eg: [This 
> PR|https://github.com/apache/beam/pull/8589].
> This can be added to the util module of the Python SDK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7388) Reify PTransform for Python SDK

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7388?focusedWorklogId=254077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254077
 ]

ASF GitHub Bot logged work on BEAM-7388:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:37
Start Date: 04/Jun/19 22:37
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8717: [BEAM-7388] 
Reify PTransform for Python SDK
URL: https://github.com/apache/beam/pull/8717
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254077)
Time Spent: 1h  (was: 50m)

> Reify PTransform for Python SDK
> ---
>
> Key: BEAM-7388
> URL: https://issues.apache.org/jira/browse/BEAM-7388
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.12.0
>Reporter: Tanay Tummalapalli
>Assignee: Tanay Tummalapalli
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Java SDK has a nice [Reify 
> PTransform|https://beam.apache.org/releases/javadoc/2.4.0/org/apache/beam/sdk/transforms/Reify.html]
>  that can be used to add timestamp, window info to elements of a PCollection. 
> This makes adding timestamp, window info reusable and easy instead of 
> defining DoFns every time. 
> Also, this can make a pipeline look really neat. Eg: [This 
> PR|https://github.com/apache/beam/pull/8589].
> This can be added to the util module of the Python SDK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7490) TfIdfTest should not be marked ValidatesRunner

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7490?focusedWorklogId=254074&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254074
 ]

ASF GitHub Bot logged work on BEAM-7490:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:34
Start Date: 04/Jun/19 22:34
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #8761: [BEAM-7490] 
TfIdfTest should be marked ValidatesRunner
URL: https://github.com/apache/beam/pull/8761#issuecomment-498868212
 
 
   run java postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254074)
Time Spent: 20m  (was: 10m)

> TfIdfTest should not be marked ValidatesRunner
> --
>
> Key: BEAM-7490
> URL: https://issues.apache.org/jira/browse/BEAM-7490
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7490) TfIdfTest should not be marked ValidatesRunner

2019-06-04 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-7490:
--
Status: Open  (was: Triage Needed)

> TfIdfTest should not be marked ValidatesRunner
> --
>
> Key: BEAM-7490
> URL: https://issues.apache.org/jira/browse/BEAM-7490
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7490) TfIdfTest should not be marked ValidatesRunner

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7490?focusedWorklogId=254068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254068
 ]

ASF GitHub Bot logged work on BEAM-7490:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:29
Start Date: 04/Jun/19 22:29
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #8761: 
[BEAM-7490] TfIdfTest should be marked ValidatesRunner
URL: https://github.com/apache/beam/pull/8761
 
 
   This is a test of an example, not a model compliance test. It does not need 
any annotation at all, in fact.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.a

[jira] [Created] (BEAM-7490) TfIdfTest should not be marked ValidatesRunner

2019-06-04 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-7490:
-

 Summary: TfIdfTest should not be marked ValidatesRunner
 Key: BEAM-7490
 URL: https://issues.apache.org/jira/browse/BEAM-7490
 Project: Beam
  Issue Type: Bug
  Components: examples-java
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254061
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:17
Start Date: 04/Jun/19 22:17
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8760: [BEAM-3606] 
Remove unused guava_testlib from dependency sets.
URL: https://github.com/apache/beam/pull/8760
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254061)
Time Spent: 2h  (was: 1h 50m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-7175) Add Spark Portable ValidatesRunner Batch postcommit test

2019-06-04 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-7175.

   Resolution: Fixed
Fix Version/s: 2.14.0

> Add Spark Portable ValidatesRunner Batch postcommit test
> 
>
> Key: BEAM-7175
> URL: https://issues.apache.org/jira/browse/BEAM-7175
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> when it's ready



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7175) Add Spark Portable ValidatesRunner Batch postcommit test

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7175?focusedWorklogId=254055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254055
 ]

ASF GitHub Bot logged work on BEAM-7175:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:12
Start Date: 04/Jun/19 22:12
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #8432: [BEAM-7175] add Spark 
validatesPortableRunner batch test postcommit
URL: https://github.com/apache/beam/pull/8432#issuecomment-498862624
 
 
   Merged manually to change the commit message and squash. Great job @ibzib !
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254055)
Time Spent: 9.5h  (was: 9h 20m)

> Add Spark Portable ValidatesRunner Batch postcommit test
> 
>
> Key: BEAM-7175
> URL: https://issues.apache.org/jira/browse/BEAM-7175
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> when it's ready



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7175) Add Spark Portable ValidatesRunner Batch postcommit test

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7175?focusedWorklogId=254056&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254056
 ]

ASF GitHub Bot logged work on BEAM-7175:


Author: ASF GitHub Bot
Created on: 04/Jun/19 22:12
Start Date: 04/Jun/19 22:12
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #8432: [BEAM-7175] 
add Spark validatesPortableRunner batch test postcommit
URL: https://github.com/apache/beam/pull/8432
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254056)
Time Spent: 9h 40m  (was: 9.5h)

> Add Spark Portable ValidatesRunner Batch postcommit test
> 
>
> Key: BEAM-7175
> URL: https://issues.apache.org/jira/browse/BEAM-7175
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> when it's ready



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6620) Do not relocate guava

2019-06-04 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-6620:
---

Assignee: Luke Cwik  (was: Kenneth Knowles)

> Do not relocate guava
> -
>
> Key: BEAM-6620
> URL: https://issues.apache.org/jira/browse/BEAM-6620
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Affects Versions: 2.11.0
>Reporter: Ismaël Mejía
>Assignee: Luke Cwik
>Priority: Critical
>
> Once guava use is vendored we have to remove the automatic relocation of 
> guava so modules who use guava in their API (e.g. Cassandra or Kinesis) can 
> work correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6249) Vendored gRPC doesn't seem to work with dataflow

2019-06-04 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-6249.
-
   Resolution: Fixed
 Assignee: Kenneth Knowles  (was: Tyler Akidau)
Fix Version/s: 2.10.0

> Vendored gRPC doesn't seem to work with dataflow
> 
>
> Key: BEAM-6249
> URL: https://issues.apache.org/jira/browse/BEAM-6249
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.9.0
>Reporter: Steve Niemitz
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I attempted to migrate an existing pipeline (that worked in 2.8.0) to 2.9.0.  
> This pipeline is using the experimental streaming engine 
> (–experiments=enable_streaming_engine).
> The pipeline fails to start with these logs:
> {code:java}
> D  Unable to load the library 
> 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64', trying 
> other loading mechanism. 
> D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64 cannot be 
> loaded from java.libary.path, now trying export to -Dio.netty.native.workdir: 
> /tmp 
> D  Unable to load the library 
> '/tmp/liborg_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_646918605450681921540.so',
>  trying other loading mechanism. 
> D  Unable to load the library 'netty_tcnative_linux_x86_64', trying next 
> name... 
> D  Unable to load the library 
> 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64_fedora', 
> trying other loading mechanism. 
> D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_linux_x86_64_fedora 
> cannot be loaded from java.libary.path, now trying export to 
> -Dio.netty.native.workdir: /tmp 
> D  Unable to load the library 'netty_tcnative_linux_x86_64_fedora', trying 
> next name... 
> D  Unable to load the library 
> 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_x86_64', trying other 
> loading mechanism. 
> D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative_x86_64 cannot be loaded 
> from java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp 
> D  Unable to load the library 'netty_tcnative_x86_64', trying next name... 
> D  Unable to load the library 
> 'org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative', trying other loading 
> mechanism. 
> D  org_apache_beam_vendor_grpc_v1_13_1_netty_tcnative cannot be loaded from 
> java.libary.path, now trying export to -Dio.netty.native.workdir: /tmp 
> D  Unable to load the library 'netty_tcnative', trying next name... 
> D  Failed to load netty-tcnative; OpenSslEngine will be unavailable, unless 
> the application has already loaded the symbols by some other means. See 
> http://netty.io/wiki/forked-tomcat-native.html for more information. 
> D  Failed to initialize netty-tcnative; OpenSslEngine will be unavailable. 
> See http://netty.io/wiki/forked-tomcat-native.html for more information. 
> I  netty-tcnative unavailable (this may be normal) 
> I  Conscrypt not found (this may be normal) 
> I  Jetty ALPN unavailable (this may be normal) 
> E  Uncaught exception in main thread. Exiting with status code 1. 
> W  Please use a logger instead of System.out or System.err.
> Please switch to using org.slf4j.Logger.
> See: https://cloud.google.com/dataflow/pipelines/logging 
> E  Uncaught exception in main thread. Exiting with status code 1. 
> E  java.lang.IllegalStateException: Could not find TLS ALPN provider; no 
> working netty-tcnative, Conscrypt, or Jetty NPN/ALPN available 
> E at 
> org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.defaultSslProvider(GrpcSslContexts.java:256)
>  
> E at 
> org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.configure(GrpcSslContexts.java:171)
>  
> E at 
> org.apache.beam.vendor.grpc.v1_13_1.io.grpc.netty.GrpcSslContexts.forClient(GrpcSslContexts.java:120)
>  
> E at 
> org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.remoteChannel(GrpcWindmillServer.java:343)
>  
> E at 
> org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer.initializeWindmillService(GrpcWindmillServer.java:312)
>  
> {code}
>  
> The interesting part is in the netty load failure, the stack trace is:
> {code:java}
> exception: "java.lang.UnsatisfiedLinkError at 
> org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:276)
>  at 
> org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:233)
>  at 
> org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:187)
>  at 
> org.apache.beam.vendor.grpc.v1_13_1.io.netty.util.internal.NativeLibraryLoader.loadF

[jira] [Commented] (BEAM-5826) Vendor slf4j API

2019-06-04 Thread Luke Cwik (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856155#comment-16856155
 ] 

Luke Cwik commented on BEAM-5826:
-

I concur with Thomas, the non shaded SLF4J api is meant to be used by users so 
that we can capture the logs in the SDK harness.

> Vendor slf4j API
> 
>
> Key: BEAM-5826
> URL: https://issues.apache.org/jira/browse/BEAM-5826
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-harness
>Reporter: Kenneth Knowles
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5826) Vendor slf4j API

2019-06-04 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-5826.
-
   Resolution: Won't Do
Fix Version/s: Not applicable

> Vendor slf4j API
> 
>
> Key: BEAM-5826
> URL: https://issues.apache.org/jira/browse/BEAM-5826
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-harness
>Reporter: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7175) Add Spark Portable ValidatesRunner Batch postcommit test

2019-06-04 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7175:
---
Summary: Add Spark Portable ValidatesRunner Batch postcommit test  (was: 
Run Spark validatesPortableRunner test as a postcommit)

> Add Spark Portable ValidatesRunner Batch postcommit test
> 
>
> Key: BEAM-7175
> URL: https://issues.apache.org/jira/browse/BEAM-7175
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> when it's ready



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=254033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254033
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 04/Jun/19 21:17
Start Date: 04/Jun/19 21:17
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-498847354
 
 
   @mxm Sure. It would be great if you can look into it. I also wonder why 
#8693 breaks it since the change #8693 introduced looks pretty trivial.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254033)
Time Spent: 13h 20m  (was: 13h 10m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254027
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:58
Start Date: 04/Jun/19 20:58
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#discussion_r290494236
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
 ##
 @@ -77,31 +80,58 @@ private static void validateResolvingIds(
 allResourceIds.add(dir2);
 
 // ResourceIds in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1)
-.addEqualityGroup(file2, file2a)
-.addEqualityGroup(dir1, dir1.getCurrentDirectory())
-.addEqualityGroup(dir2, dir2a, dir2.getCurrentDirectory())
-.addEqualityGroup(baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1),
+Arrays.asList(file2, file2a),
+Arrays.asList(dir1, dir1.getCurrentDirectory()),
+Arrays.asList(dir2, dir2a, dir2.getCurrentDirectory()),
+Arrays.asList(
+baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory(;
 
 // ResourceId toString() in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1.toString())
-.addEqualityGroup(file2.toString(), file2a.toString())
-.addEqualityGroup(dir1.toString(), 
dir1.getCurrentDirectory().toString())
-.addEqualityGroup(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString())
-.addEqualityGroup(
-baseDirectory.toString(),
-file1.getCurrentDirectory().toString(),
-file2.getCurrentDirectory().toString())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1.toString()),
+Arrays.asList(file2.toString(), file2a.toString()),
+Arrays.asList(dir1.toString(), 
dir1.getCurrentDirectory().toString()),
+Arrays.asList(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString()),
+Arrays.asList(
+baseDirectory.toString(),
+file1.getCurrentDirectory().toString(),
+file2.getCurrentDirectory().toString(;
 
 // TODO: test resolving strings that need to be escaped.
 //   Possible spec: https://tools.ietf.org/html/rfc3986#section-2
 //   May need options to be filesystem-independent, e.g., if filesystems 
ban certain chars.
   }
 
+  /**
+   * Asserts that all elements in each group are equal to each other but not 
equal to any other
+   * element in another group.
+   */
+  private static  void assertEqualityGroups(List> equalityGroups) {
+for (int i = 0; i < equalityGroups.size(); ++i) {
+  List current = equalityGroups.get(i);
+  for (int j = 0; j < current.size(); ++j) {
+for (int k = 0; k < current.size(); ++k) {
 
 Review comment:
   Ah, I didn't consider that case. Ok that makes sense now.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254027)
Time Spent: 1h 50m  (was: 1h 40m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-7176) Spark validatesPortableRunner test OOM

2019-06-04 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-7176.
---
   Resolution: Fixed
Fix Version/s: 2.14.0

> Spark validatesPortableRunner test OOM
> --
>
> Key: BEAM-7176
> URL: https://issues.apache.org/jira/browse/BEAM-7176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> First ~50 tests run okay, and then this happens: 
> {{org.apache.beam.sdk.transforms.CombineTest$WindowingTests > 
> testGlobalCombineWithDefaultsAndTriggers FAILED}}
> {{ java.lang.AssertionError at 
> [CombineTest.java:1209|http://combinetest.java:1209/]}}
> {{org.apache.beam.sdk.transforms.CombineTest$WindowingTests > 
> testCombineGloballyLambda FAILED}}
> {{ java.lang.RuntimeException at 
> [CombineTest.java:1391|http://combinetest.java:1391/]}}
> {{ Caused by: java.lang.RuntimeException at 
> [CombineTest.java:1391|http://combinetest.java:1391/]}}
> {{ Caused by: java.util.concurrent.ExecutionException at 
> [CombineTest.java:1391|http://combinetest.java:1391/]}}
> {{ Caused by: java.lang.OutOfMemoryError}}
> {{}}
> 
> Most subsequent tests OOM after that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-7412) portable spark: thread/memory leak in local mode

2019-06-04 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-7412.
---
   Resolution: Fixed
Fix Version/s: 2.14.0

> portable spark: thread/memory leak in local mode
> 
>
> Key: BEAM-7412
> URL: https://issues.apache.org/jira/browse/BEAM-7412
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.14.0
>
> Attachments: Screenshot from 2019-05-20 14-26-03.png
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When running validatesPortableRunnerBatch test on local mode, the portable 
> Spark runner creates ~18k threads before becoming unable to create any more 
> threads, at which point it crashes. This does not happen on standalone 
> cluster mode, where ephemeral executors (independent JVMs) are used and then 
> shut down, preventing whatever leakage is occurring from becoming too much of 
> a problem.
> Sample stack trace:
> {{"pool-3965-thread-3" #16059 daemon prio=5 os_prio=0 tid=0x7f9f85a81000 
> nid=0x32a0 waiting on condition [0x7f9f8b9f1000]}}
> {{ java.lang.Thread.State: TIMED_WAITING (parking)}}
> {{ at (C/C++) 0x7fa2f32c2dae (Unknown Source)}}
> {{ at (C/C++) 0x7fa2f2851351 (Unknown Source)}}
> {{ at sun.misc.Unsafe.park(Native Method)}}
> {{ - parking to wait for <0x000727bda848> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)}}
> {{ at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)}}
> {{ at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)}}
> {{ at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)}}
> {{ at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
> {{ at java.lang.Thread.run(Thread.java:748)}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-06-04 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-6959.
---
   Resolution: Fixed
Fix Version/s: 2.14.0

> Run Go SDK  Post Commit tests against the Flink Runner.
> ---
>
> Key: BEAM-6959
> URL: https://issues.apache.org/jira/browse/BEAM-6959
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, sdk-go, testing
>Reporter: Robert Burke
>Assignee: Kyle Weaver
>Priority: Minor
> Fix For: 2.14.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> See parent task BEAM-6958



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254013
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:26
Start Date: 04/Jun/19 20:26
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8760: [BEAM-3606] Remove 
unused guava_testlib from dependency sets.
URL: https://github.com/apache/beam/pull/8760#issuecomment-498829802
 
 
   R: @Ardagan 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254013)
Time Spent: 1h 40m  (was: 1.5h)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254012
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:25
Start Date: 04/Jun/19 20:25
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8760: [BEAM-3606] 
Remove unused guava_testlib from dependency sets.
URL: https://github.com/apache/beam/pull/8760
 
 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   
-

[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254010
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:22
Start Date: 04/Jun/19 20:22
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254010)
Time Spent: 1h 20m  (was: 1h 10m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=254007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254007
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:19
Start Date: 04/Jun/19 20:19
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-498827277
 
 
   Also, please fix the conflict.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254007)
Time Spent: 6h 50m  (was: 6h 40m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=254004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254004
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:15
Start Date: 04/Jun/19 20:15
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8752: [BEAM-7351] increment 
ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498825923
 
 
   hm I see what you mean. There's many broken tests. You're saying that 
there's fixes for the other tests in other PRs then?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254004)
Time Spent: 1h 20m  (was: 1h 10m)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> 

[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254002&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254002
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:13
Start Date: 04/Jun/19 20:13
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#discussion_r290477000
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
 ##
 @@ -77,31 +80,58 @@ private static void validateResolvingIds(
 allResourceIds.add(dir2);
 
 // ResourceIds in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1)
-.addEqualityGroup(file2, file2a)
-.addEqualityGroup(dir1, dir1.getCurrentDirectory())
-.addEqualityGroup(dir2, dir2a, dir2.getCurrentDirectory())
-.addEqualityGroup(baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1),
+Arrays.asList(file2, file2a),
+Arrays.asList(dir1, dir1.getCurrentDirectory()),
+Arrays.asList(dir2, dir2a, dir2.getCurrentDirectory()),
+Arrays.asList(
+baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory(;
 
 // ResourceId toString() in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1.toString())
-.addEqualityGroup(file2.toString(), file2a.toString())
-.addEqualityGroup(dir1.toString(), 
dir1.getCurrentDirectory().toString())
-.addEqualityGroup(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString())
-.addEqualityGroup(
-baseDirectory.toString(),
-file1.getCurrentDirectory().toString(),
-file2.getCurrentDirectory().toString())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1.toString()),
+Arrays.asList(file2.toString(), file2a.toString()),
+Arrays.asList(dir1.toString(), 
dir1.getCurrentDirectory().toString()),
+Arrays.asList(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString()),
+Arrays.asList(
+baseDirectory.toString(),
+file1.getCurrentDirectory().toString(),
+file2.getCurrentDirectory().toString(;
 
 // TODO: test resolving strings that need to be escaped.
 //   Possible spec: https://tools.ietf.org/html/rfc3986#section-2
 //   May need options to be filesystem-independent, e.g., if filesystems 
ban certain chars.
   }
 
+  /**
+   * Asserts that all elements in each group are equal to each other but not 
equal to any other
+   * element in another group.
+   */
+  private static  void assertEqualityGroups(List> equalityGroups) {
+for (int i = 0; i < equalityGroups.size(); ++i) {
+  List current = equalityGroups.get(i);
+  for (int j = 0; j < current.size(); ++j) {
+for (int k = 0; k < current.size(); ++k) {
 
 Review comment:
   Since equals is implemented within the class a.equals(b) is not the same 
thing as b.equals(a) if a and b are two different classes, we also test 
a.equals(a) for posterity.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254002)
Time Spent: 1h  (was: 50m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=254003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254003
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:13
Start Date: 04/Jun/19 20:13
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#discussion_r290477276
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
 ##
 @@ -77,31 +80,58 @@ private static void validateResolvingIds(
 allResourceIds.add(dir2);
 
 // ResourceIds in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1)
-.addEqualityGroup(file2, file2a)
-.addEqualityGroup(dir1, dir1.getCurrentDirectory())
-.addEqualityGroup(dir2, dir2a, dir2.getCurrentDirectory())
-.addEqualityGroup(baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1),
+Arrays.asList(file2, file2a),
+Arrays.asList(dir1, dir1.getCurrentDirectory()),
+Arrays.asList(dir2, dir2a, dir2.getCurrentDirectory()),
+Arrays.asList(
+baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory(;
 
 // ResourceId toString() in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1.toString())
-.addEqualityGroup(file2.toString(), file2a.toString())
-.addEqualityGroup(dir1.toString(), 
dir1.getCurrentDirectory().toString())
-.addEqualityGroup(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString())
-.addEqualityGroup(
-baseDirectory.toString(),
-file1.getCurrentDirectory().toString(),
-file2.getCurrentDirectory().toString())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1.toString()),
+Arrays.asList(file2.toString(), file2a.toString()),
+Arrays.asList(dir1.toString(), 
dir1.getCurrentDirectory().toString()),
+Arrays.asList(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString()),
+Arrays.asList(
+baseDirectory.toString(),
+file1.getCurrentDirectory().toString(),
+file2.getCurrentDirectory().toString(;
 
 // TODO: test resolving strings that need to be escaped.
 //   Possible spec: https://tools.ietf.org/html/rfc3986#section-2
 //   May need options to be filesystem-independent, e.g., if filesystems 
ban certain chars.
   }
 
+  /**
+   * Asserts that all elements in each group are equal to each other but not 
equal to any other
+   * element in another group.
+   */
+  private static  void assertEqualityGroups(List> equalityGroups) {
+for (int i = 0; i < equalityGroups.size(); ++i) {
+  List current = equalityGroups.get(i);
+  for (int j = 0; j < current.size(); ++j) {
+for (int k = 0; k < current.size(); ++k) {
+  assertEquals(
+  "Value at " + j + " should equal value at " + k + " in equality 
group " + i,
+  current.get(j),
+  current.get(k));
+}
+  }
+  for (int j = 0; j < equalityGroups.size(); ++j) {
 
 Review comment:
   Same reasoning about the equals as above, we ensure that they are not equals 
both ways as well.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254003)
Time Spent: 1h 10m  (was: 1h)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=254000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254000
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 20:07
Start Date: 04/Jun/19 20:07
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-498823034
 
 
   Run Python_PVR_Flink PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 254000)
Time Spent: 6h 40m  (was: 6.5h)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=253978&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253978
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 04/Jun/19 19:27
Start Date: 04/Jun/19 19:27
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-498809192
 
 
   @ihji I can take a look if you want. Curious why this breaks because #8693 
didn't break any other integration tests.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253978)
Time Spent: 13h 10m  (was: 13h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7489) High CPU and memory usage in the case of SqlTransform with complex triggers

2019-06-04 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-7489:
---
Description: 
http://mail-archives.apache.org/mod_mbox/beam-user/201906.mbox/%3ccapjhctfmimy067uzudqejf-+zud1dqb+fkl8wbfopcz79bm...@mail.gmail.com%3E



Code that expose this JIRA: https://pastebin.com/nNtc9ZaG

  
was:http://mail-archives.apache.org/mod_mbox/beam-user/201906.mbox/%3ccapjhctfmimy067uzudqejf-+zud1dqb+fkl8wbfopcz79bm...@mail.gmail.com%3E


> High CPU and memory usage in the case of SqlTransform with complex triggers
> ---
>
> Key: BEAM-7489
> URL: https://issues.apache.org/jira/browse/BEAM-7489
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> http://mail-archives.apache.org/mod_mbox/beam-user/201906.mbox/%3ccapjhctfmimy067uzudqejf-+zud1dqb+fkl8wbfopcz79bm...@mail.gmail.com%3E
> Code that expose this JIRA: https://pastebin.com/nNtc9ZaG



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7489) High CPU and memory usage in the case of SqlTransform with complex triggers

2019-06-04 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856045#comment-16856045
 ] 

Rui Wang commented on BEAM-7489:


Note that this JIRA was found by a SqlTransform query with session windowing, 
which is less tested before.

> High CPU and memory usage in the case of SqlTransform with complex triggers
> ---
>
> Key: BEAM-7489
> URL: https://issues.apache.org/jira/browse/BEAM-7489
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> http://mail-archives.apache.org/mod_mbox/beam-user/201906.mbox/%3ccapjhctfmimy067uzudqejf-+zud1dqb+fkl8wbfopcz79bm...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7489) High CPU and memory usage in the case of SqlTransform with complex triggers

2019-06-04 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-7489:
---
Description: 
http://mail-archives.apache.org/mod_mbox/beam-user/201906.mbox/%3ccapjhctfmimy067uzudqejf-+zud1dqb+fkl8wbfopcz79bm...@mail.gmail.com%3E

> High CPU and memory usage in the case of SqlTransform with complex triggers
> ---
>
> Key: BEAM-7489
> URL: https://issues.apache.org/jira/browse/BEAM-7489
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>
> http://mail-archives.apache.org/mod_mbox/beam-user/201906.mbox/%3ccapjhctfmimy067uzudqejf-+zud1dqb+fkl8wbfopcz79bm...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7489) High CPU and memory usage in the case of SqlTransform with complex triggers

2019-06-04 Thread Rui Wang (JIRA)
Rui Wang created BEAM-7489:
--

 Summary: High CPU and memory usage in the case of SqlTransform 
with complex triggers
 Key: BEAM-7489
 URL: https://issues.apache.org/jira/browse/BEAM-7489
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Reporter: Rui Wang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7348) Option to expire SDK worker environments

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7348?focusedWorklogId=253963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253963
 ]

ASF GitHub Bot logged work on BEAM-7348:


Author: ASF GitHub Bot
Created on: 04/Jun/19 19:19
Start Date: 04/Jun/19 19:19
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8714: [BEAM-7348] 
Support environment expiration
URL: https://github.com/apache/beam/pull/8714
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253963)
Time Spent: 2h 40m  (was: 2.5h)

> Option to expire SDK worker environments
> 
>
> Key: BEAM-7348
> URL: https://issues.apache.org/jira/browse/BEAM-7348
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability, portability-flink
> Fix For: 2.14.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We discovered that Python SDK workers are susceptible to memory leaks that 
> are quite hard to identify and/or fix. This becomes an issue in streaming 
> pipelines, where the workers run "forever". It would be good if the user has 
> an option to recycle the workers when there is no other practical way to 
> address (slow) resource leaks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-06-04 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856033#comment-16856033
 ] 

Kenneth Knowles commented on BEAM-6407:
---

Yea, makes sense. Did the rolled back thing get rolled forwards again? 
Unfortunately the commit messages in the revert do not specify the hash of the 
commit(s) nor PR number. I don't currently have the time/appetite for search PR 
by subject line.

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName;
> {code}
>  
> This does not occur in Dataflow-runner
> It does not occur if the ContextFul.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, and does in 2.9.0 and 2.10.0-SNAPSHOT (as of today)
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occuring in 2.9.0...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (BEAM-6407) regression: FileIO.writeDynamic() with side inputs fails in DirectRunner

2019-06-04 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reopened BEAM-6407:
---

Did it roll forward?

> regression: FileIO.writeDynamic() with side inputs fails in DirectRunner
> 
>
> Key: BEAM-6407
> URL: https://issues.apache.org/jira/browse/BEAM-6407
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.9.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Blocker
>  Labels: regression
> Fix For: 2.10.0
>
> Attachments: beam-filewriter-demo.tgz
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When FileIO.writeDynamic is used with automatic sharding and  a Contextful.Fn 
> that uses side inputs for the file naming, DirectRunner (and TestPipeline) 
> fail with: 
> {{java.lang.IllegalStateException: All PCollectionViews that are consumed 
> must be written by some WriteView PTransform: Missing [ 
> [RunnerPCollectionView]]}}
>  
> Example code:  
> {code:java}
> PCollectionView outputFileName =
>    pipeline.apply(
>       "outputDir",
>        Create.of("/tmp/testout")).apply(View.asSingleton());
> Contextful.Fn manifestNaming =
>    (element, c) ->
>       (window, pane, numShards, shardIndex, compression) -> 
>          c.sideInput(outputFileName)+shardIndex;
> pipeline.apply(FileIO.writeDynamic()
>    .by(SerializableFunctions.constant(""))
>    .withDestinationCoder(StringUtf8Coder.of())
>    .via(TextIO.sink())
>    .withTempDirectory("/tmp")
>    .withNaming(Contextful.of(
>       manifestNaming,
>       Requirements.requiresSideInputs(outputFileName;
> {code}
>  
> This does not occur in Dataflow-runner
> It does not occur if the ContextFul.Fn is not given side inputs.
> It does not occur if withNumShards(1) is set.
> It did not occur in 2.8.0, and does in 2.9.0 and 2.10.0-SNAPSHOT (as of today)
>  
> The cause appears to be due to the DirectRunner using TransformOverrides 
> re-writing FileIO sinks to use runner-determined-sharding
> ( see [DirectRunner.java line 
> 226|https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectRunner.java#L226]
>  )
>  but I do not know why this started occuring in 2.9.0...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=253956&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253956
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 19:06
Start Date: 04/Jun/19 19:06
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#discussion_r290450318
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
 ##
 @@ -77,31 +80,58 @@ private static void validateResolvingIds(
 allResourceIds.add(dir2);
 
 // ResourceIds in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1)
-.addEqualityGroup(file2, file2a)
-.addEqualityGroup(dir1, dir1.getCurrentDirectory())
-.addEqualityGroup(dir2, dir2a, dir2.getCurrentDirectory())
-.addEqualityGroup(baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1),
+Arrays.asList(file2, file2a),
+Arrays.asList(dir1, dir1.getCurrentDirectory()),
+Arrays.asList(dir2, dir2a, dir2.getCurrentDirectory()),
+Arrays.asList(
+baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory(;
 
 // ResourceId toString() in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1.toString())
-.addEqualityGroup(file2.toString(), file2a.toString())
-.addEqualityGroup(dir1.toString(), 
dir1.getCurrentDirectory().toString())
-.addEqualityGroup(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString())
-.addEqualityGroup(
-baseDirectory.toString(),
-file1.getCurrentDirectory().toString(),
-file2.getCurrentDirectory().toString())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1.toString()),
+Arrays.asList(file2.toString(), file2a.toString()),
+Arrays.asList(dir1.toString(), 
dir1.getCurrentDirectory().toString()),
+Arrays.asList(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString()),
+Arrays.asList(
+baseDirectory.toString(),
+file1.getCurrentDirectory().toString(),
+file2.getCurrentDirectory().toString(;
 
 // TODO: test resolving strings that need to be escaped.
 //   Possible spec: https://tools.ietf.org/html/rfc3986#section-2
 //   May need options to be filesystem-independent, e.g., if filesystems 
ban certain chars.
   }
 
+  /**
+   * Asserts that all elements in each group are equal to each other but not 
equal to any other
+   * element in another group.
+   */
+  private static  void assertEqualityGroups(List> equalityGroups) {
+for (int i = 0; i < equalityGroups.size(); ++i) {
+  List current = equalityGroups.get(i);
+  for (int j = 0; j < current.size(); ++j) {
+for (int k = 0; k < current.size(); ++k) {
+  assertEquals(
+  "Value at " + j + " should equal value at " + k + " in equality 
group " + i,
+  current.get(j),
+  current.get(k));
+}
+  }
+  for (int j = 0; j < equalityGroups.size(); ++j) {
 
 Review comment:
   Here too, it seems like you could just do `int j = i + 1` and avoid 
duplicate comparisons (and self comparisons)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253956)
Time Spent: 50m  (was: 40m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=253955&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253955
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 19:06
Start Date: 04/Jun/19 19:06
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#discussion_r290448829
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
 ##
 @@ -77,31 +80,58 @@ private static void validateResolvingIds(
 allResourceIds.add(dir2);
 
 // ResourceIds in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1)
-.addEqualityGroup(file2, file2a)
-.addEqualityGroup(dir1, dir1.getCurrentDirectory())
-.addEqualityGroup(dir2, dir2a, dir2.getCurrentDirectory())
-.addEqualityGroup(baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1),
+Arrays.asList(file2, file2a),
+Arrays.asList(dir1, dir1.getCurrentDirectory()),
+Arrays.asList(dir2, dir2a, dir2.getCurrentDirectory()),
+Arrays.asList(
+baseDirectory, file1.getCurrentDirectory(), 
file2.getCurrentDirectory(;
 
 // ResourceId toString() in equality groups.
-new EqualsTester()
-.addEqualityGroup(file1.toString())
-.addEqualityGroup(file2.toString(), file2a.toString())
-.addEqualityGroup(dir1.toString(), 
dir1.getCurrentDirectory().toString())
-.addEqualityGroup(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString())
-.addEqualityGroup(
-baseDirectory.toString(),
-file1.getCurrentDirectory().toString(),
-file2.getCurrentDirectory().toString())
-.testEquals();
+assertEqualityGroups(
+Arrays.asList(
+Arrays.asList(file1.toString()),
+Arrays.asList(file2.toString(), file2a.toString()),
+Arrays.asList(dir1.toString(), 
dir1.getCurrentDirectory().toString()),
+Arrays.asList(dir2.toString(), dir2a.toString(), 
dir2.getCurrentDirectory().toString()),
+Arrays.asList(
+baseDirectory.toString(),
+file1.getCurrentDirectory().toString(),
+file2.getCurrentDirectory().toString(;
 
 // TODO: test resolving strings that need to be escaped.
 //   Possible spec: https://tools.ietf.org/html/rfc3986#section-2
 //   May need options to be filesystem-independent, e.g., if filesystems 
ban certain chars.
   }
 
+  /**
+   * Asserts that all elements in each group are equal to each other but not 
equal to any other
+   * element in another group.
+   */
+  private static  void assertEqualityGroups(List> equalityGroups) {
+for (int i = 0; i < equalityGroups.size(); ++i) {
+  List current = equalityGroups.get(i);
+  for (int j = 0; j < current.size(); ++j) {
+for (int k = 0; k < current.size(); ++k) {
 
 Review comment:
   Could you do `int k = j` instead here to avoid duplicate comparisons?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253955)
Time Spent: 40m  (was: 0.5h)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7232) Python test_external_transforms fails on Spark runner

2019-06-04 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-7232:
--
Description: 
test_external_transforms should theoretically work, but it instead throws a 
cryptic error:

<_Rendezvous of RPC that terminated with:
 status = StatusCode.UNKNOWN
 details = ""
 debug_error_string = "\{"created":"@1557171218.145369401","description":"Error 
received from peer 
ipv4:127.0.0.1:48159","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"","grpc_status":2}"
 >

I could just be overlooking some obvious config issue.

 

EDIT: the previous issue has been fixed. now it seems to fail because the read 
source is unbounded

[https://github.com/apache/beam/blob/23714a335e8a6e0d91106a366e470a3a4820ae27/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ReadTranslation.java#L92]

  was:
test_external_transforms should theoretically work, but it instead throws a 
cryptic error:

<_Rendezvous of RPC that terminated with:
 status = StatusCode.UNKNOWN
 details = ""
 debug_error_string = "\{"created":"@1557171218.145369401","description":"Error 
received from peer 
ipv4:127.0.0.1:48159","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"","grpc_status":2}"
 >

I could just be overlooking some obvious config issue.

 

EDIT: the previous issue has been fixed. now the issue seems to be because the 
read source is unbounded

[https://github.com/apache/beam/blob/23714a335e8a6e0d91106a366e470a3a4820ae27/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ReadTranslation.java#L92]


> Python test_external_transforms fails on Spark runner
> -
>
> Key: BEAM-7232
> URL: https://issues.apache.org/jira/browse/BEAM-7232
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>
> test_external_transforms should theoretically work, but it instead throws a 
> cryptic error:
> <_Rendezvous of RPC that terminated with:
>  status = StatusCode.UNKNOWN
>  details = ""
>  debug_error_string = 
> "\{"created":"@1557171218.145369401","description":"Error received from peer 
> ipv4:127.0.0.1:48159","file":"src/core/lib/surface/call.cc","file_line":1041,"grpc_message":"","grpc_status":2}"
>  >
> I could just be overlooking some obvious config issue.
>  
> EDIT: the previous issue has been fixed. now it seems to fail because the 
> read source is unbounded
> [https://github.com/apache/beam/blob/23714a335e8a6e0d91106a366e470a3a4820ae27/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ReadTranslation.java#L92]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-7401) beam_PreCommit_CommunityMetrics_Cron consistently failing

2019-06-04 Thread Mikhail Gryzykhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin resolved BEAM-7401.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> beam_PreCommit_CommunityMetrics_Cron consistently failing
> -
>
> Key: BEAM-7401
> URL: https://issues.apache.org/jira/browse/BEAM-7401
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Michael Luckey
>Assignee: Mikhail Gryzykhin
>Priority: Major
> Fix For: Not applicable
>
>
> Since a while, community metrics Jenkins job [1] is broken.
> [1] 
> https://builds.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/buildTimeTrend



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7401) beam_PreCommit_CommunityMetrics_Cron consistently failing

2019-06-04 Thread Mikhail Gryzykhin (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856003#comment-16856003
 ] 

Mikhail Gryzykhin commented on BEAM-7401:
-

Missed this ticket. The issue was with docker credentials that was common 
across all our jenkins builds. Resolved by now.

Relevant ticket: https://issues.apache.org/jira/browse/BEAM-7381

> beam_PreCommit_CommunityMetrics_Cron consistently failing
> -
>
> Key: BEAM-7401
> URL: https://issues.apache.org/jira/browse/BEAM-7401
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Michael Luckey
>Assignee: Mikhail Gryzykhin
>Priority: Major
>
> Since a while, community metrics Jenkins job [1] is broken.
> [1] 
> https://builds.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/buildTimeTrend



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6284) [FLAKE][beam_PostCommit_Java_ValidatesRunner_Dataflow] TestRunner fails with result UNKNOWN on succeeded job and checks passed

2019-06-04 Thread Mikhail Gryzykhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin resolved BEAM-6284.
-
   Resolution: Fixed
Fix Version/s: 2.13.0

> [FLAKE][beam_PostCommit_Java_ValidatesRunner_Dataflow] TestRunner fails with 
> result UNKNOWN on succeeded job and checks passed
> --
>
> Key: BEAM-6284
> URL: https://issues.apache.org/jira/browse/BEAM-6284
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.13.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> _Use this form to file an issue for test failure:_
>  * 
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/testReport/junit/org.apache.beam.sdk.transforms/ViewTest/testWindowedSideInputFixedToGlobal/
> Initial investigation:
> According to logs all test-relevant checks have passed and it seem to be 
> testing framework failure.
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=253934&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253934
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 18:32
Start Date: 04/Jun/19 18:32
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-498790363
 
 
   LGTM. Thanks.
   
   Please fix-up and self-merge after tests pass.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253934)
Time Spent: 6.5h  (was: 6h 20m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=253913&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253913
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 18:26
Start Date: 04/Jun/19 18:26
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#discussion_r290433622
 
 

 ##
 File path: sdks/python/apache_beam/io/fileio.py
 ##
 @@ -170,3 +237,487 @@ def __init__(self, compression=None, 
skip_directories=True):
   def expand(self, pcoll):
 return pcoll | beam.ParDo(_ReadMatchesFn(self._compression,
  self._skip_directories))
+
+
+class FileSink(object):
+  """Specifies how to write elements to individual files in ``WriteToFiles``.
+
+  **NOTE: THIS CLASS IS EXPERIMENTAL.**
+
+  A Sink class must implement the following:
+
+   - The ``open`` method, which initializes writing to a file handler (it is 
not
+ responsible for opening the file handler itself).
+   - The ``write`` method, which writes an element to the file that was passed
+ in ``open``.
+   - The ``flush`` method, which flushes any buffered state. This is most often
+ called before closing a file (but not exclusively called in that
+ situation). The sink is not responsible for closing the file handler.
+   """
+
+  def open(self, fh):
+raise NotImplementedError
+
+  def write(self, record):
+raise NotImplementedError
+
+  def flush(self):
+raise NotImplementedError
+
+
+@beam.typehints.with_input_types(str)
+class TextSink(FileSink):
+  """A sink that encodes utf8 elements, and writes to file handlers.
+
+  **NOTE: THIS CLASS IS EXPERIMENTAL.**
+
+  This sink simply calls file_handler.write(record.encode('utf8') + '\n') on 
all
+  records that come into it.
+  """
+
+  def open(self, fh):
+self._fh = fh
+
+  def write(self, record):
+self._fh.write(record.encode('utf8'))
+self._fh.write(b'\n')
+
+  def flush(self):
+self._fh.flush()
+
+
+def prefix_naming(prefix):
+  return default_file_naming(prefix)
+
+
+_DEFAULT_FILE_NAME_TEMPLATE = (
+'{prefix}-{start}-{end}-{pane}-'
+'{shard:05d}-{total_shards:05d}'
+'{suffix}{compression}')
+
+
+def destination_prefix_naming():
+
+  def _inner(window, pane, shard_index, total_shards, compression, 
destination):
+kwargs = {'prefix': str(destination),
+  'start': '',
+  'end': '',
+  'pane': '',
+  'shard': 0,
+  'total_shards': 0,
+  'suffix': '',
+  'compression': ''}
+if total_shards is not None and shard_index is not None:
+  kwargs['shard'] = int(shard_index)
+  kwargs['total_shards'] = int(total_shards)
+
+if window != GlobalWindow():
+  kwargs['start'] = window.start.to_utc_datetime().isoformat()
+  kwargs['end'] = window.end.to_utc_datetime().isoformat()
+
+# TODO(pabloem): Add support for PaneInfo
+# If the PANE is the ONLY firing in the window, we don't add it.
+#if pane and not (pane.is_first and pane.is_last):
+#  kwargs['pane'] = pane.index
+
+if compression:
+  kwargs['compression'] = '.%s' % compression
+
+return _DEFAULT_FILE_NAME_TEMPLATE.format(**kwargs)
+
+  return _inner
+
+
+def default_file_naming(prefix, suffix=None):
+
+  def _inner(window, pane, shard_index, total_shards, compression, 
destination):
+kwargs = {'prefix': prefix,
+  'start': '',
+  'end': '',
+  'pane': '',
+  'shard': 0,
+  'total_shards': 0,
+  'suffix': '',
+  'compression': ''}
+if total_shards is not None and shard_index is not None:
+  kwargs['shard'] = int(shard_index)
+  kwargs['total_shards'] = int(total_shards)
+
+if window != GlobalWindow():
+  kwargs['start'] = window.start.to_utc_datetime().isoformat()
+  kwargs['end'] = window.end.to_utc_datetime().isoformat()
+
+# TODO(pabloem): Add support for PaneInfo
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253913)
Time Spent: 6h  (was: 5h 50m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>

[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=253914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253914
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 18:26
Start Date: 04/Jun/19 18:26
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#discussion_r290433584
 
 

 ##
 File path: sdks/python/apache_beam/io/fileio.py
 ##
 @@ -502,12 +502,13 @@ def expand(self, pcoll):
 file_results = (files_by_destination_pc
 | beam.ParDo(
 _MoveTempFilesIntoFinalDestinationFn(
-self.path, self.file_naming_fn)))
+self.path, self.file_naming_fn,
+self._temp_directory)))
 
 return file_results
 
 
-def _create_writer(base_path, file_name):
+def _create_writer(base_path, writer_key):
   # TODO(pabloem) Is this a good place to create a temporary directory?
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253914)
Time Spent: 6h 10m  (was: 6h)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=253915&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253915
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 18:26
Start Date: 04/Jun/19 18:26
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#discussion_r290433653
 
 

 ##
 File path: sdks/python/apache_beam/io/fileio.py
 ##
 @@ -23,17 +23,83 @@
 metadata records, and produces a ``PCollection`` of ``ReadableFile`` objects.
 These transforms currently do not support splitting by themselves.
 
+Writing to Files
+
+
+The transforms in this file include ``WriteToFiles``, which allows you to write
+a ``beam.PCollection`` to files, and gives you many options to customize how to
+do this.
+
+File Naming
+---
+One of the parameters received by ``WriteToFiles`` is a function specifying how
+to name the files that are written. This is a function that takes in the
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253915)
Time Spent: 6h 20m  (was: 6h 10m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2857) Create FileIO in Python

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=253912&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253912
 ]

ASF GitHub Bot logged work on BEAM-2857:


Author: ASF GitHub Bot
Created on: 04/Jun/19 18:25
Start Date: 04/Jun/19 18:25
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#discussion_r290433597
 
 

 ##
 File path: sdks/python/apache_beam/io/fileio.py
 ##
 @@ -562,16 +569,21 @@ def process(self, element):
r.pane,
destination)
 
-logging.info('Cautiously removing temporary files for destination %s',
- destination)
-self._remove_temporary_files([r.file_name for r in file_results])
+logging.info('Cautiously removing temporary files for'
+ ' destination %s and window %s', destination, w)
+writer_key = (destination, w)
+self._remove_temporary_files(writer_key)
 
-  @staticmethod
-  def _remove_temporary_files(files):
+  def _remove_temporary_files(self, writer_key):
 try:
-  filesystems.FileSystems.delete(files)
+  prefix = filesystems.FileSystems.join(
+  self.temporary_directory.get(), str(abs(hash(writer_key
+  match_result = filesystems.FileSystems.match(['%s*' % prefix])
+  orphaned_files = [m.path for m in match_result[0].metadata_list]
+
+  logging.debug('Deleting orphaned files: %s', orphaned_files)
+  filesystems.FileSystems.delete(orphaned_files)
 
 Review comment:
   Yes. All we're doing is logging the failures (in debug level) to be aware of 
them.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253912)
Time Spent: 5h 50m  (was: 5h 40m)

> Create FileIO in Python
> ---
>
> Key: BEAM-2857
> URL: https://issues.apache.org/jira/browse/BEAM-2857
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Eugene Kirpichov
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: gsoc, gsoc2019, mentor
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=253899&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253899
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 04/Jun/19 18:11
Start Date: 04/Jun/19 18:11
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-498782464
 
 
   Found that XVR_Flink PostCommit is broken because of 
https://github.com/apache/beam/pull/8693
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253899)
Time Spent: 13h  (was: 12h 50m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=253893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253893
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:58
Start Date: 04/Jun/19 17:58
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8752: [BEAM-7351] increment 
ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498777859
 
 
   I'd feel more comfortable having a passing run. I'll merge after tests pass.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253893)
Time Spent: 1h  (was: 50m)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=253894&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253894
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:58
Start Date: 04/Jun/19 17:58
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #8752: [BEAM-7351] increment 
ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498777897
 
 
   Run Python PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253894)
Time Spent: 1h 10m  (was: 1h)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.req

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=253874&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253874
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:40
Start Date: 04/Jun/19 17:40
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8752: [BEAM-7351] 
increment ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498771499
 
 
   Thanks, @Juta !
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253874)
Time Spent: 50m  (was: 40m)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DE

[jira] [Work logged] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7351?focusedWorklogId=253873&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253873
 ]

ASF GitHub Bot logged work on BEAM-7351:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:39
Start Date: 04/Jun/19 17:39
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8752: [BEAM-7351] 
increment ack deadline for flaky pubsub test
URL: https://github.com/apache/beam/pull/8752#issuecomment-498771302
 
 
   I checked that streaming wordcount tests passed in all Py2 and Py3 test 
suites.
   ```
   03:41:46 test_streaming_wordcount_it 
(apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT) ... ok
   ```
   We have other PRs in flight to address flakiness in other tests, so I think 
we can go ahead with the merge. @pabloem, could you please help with the merge? 
Thanks. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253873)
Time Spent: 40m  (was: 0.5h)

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/

[jira] [Commented] (BEAM-7351) Failure in Python streaming wordcount test: unexpected messages received on output topic.

2019-06-04 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855937#comment-16855937
 ] 

Valentyn Tymofieiev commented on BEAM-7351:
---

Thanks, [~Juta]. This explanation makes sense to me, and we do randomize the 
topic with a random uuid suffix, so a topic collision is unlikely. 

> Failure in Python streaming wordcount test: unexpected messages received on 
> output topic.
> -
>
> Key: BEAM-7351
> URL: https://issues.apache.org/jira/browse/BEAM-7351
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Saw this in a PostCommit test today. Likely a flake, but it's a strange 
> failure mode and we may need to investigate this.
> {noformat}
> 13:32:02 
> ==
> 13:32:02 FAIL: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 13:32:02 
> --
> 13:32:02 Traceback (most recent call last):
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 110, in test_streaming_wordcount_it
> 13:32:02 self.test_pipeline.get_full_options_as_args(**extra_opts))
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 101, in run
> 13:32:02 result = p.run()
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 13:32:02 return self.runner.run_pipeline(self, self._options)
> 13:32:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 68, in run_pipeline
> 13:32:02 hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 13:32:02 AssertionError: 
> 13:32:02 Expected: (Test pipeline expected terminated in state: RUNNING and 
> Expected 500 messages.)
> 13:32:02  but: Expected 500 messages. Got 508 messages. Diffs (item, 
> count):
> 13:32:02   Expected but not in actual: []
> 13:32:02   Unexpected: [(u'476: 1', 1), (u'416: 1', 1), (u'245: 1', 1), 
> (u'478: 1', 1), (u'58: 1', 1), (u'364: 1', 1), (u'77: 1', 1), (u'283: 1', 1)]
> 13:32:02 
> 13:32:02  >> begin captured logging << 
> 
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET 
> /computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
>  HTTP/1.1" 200 176
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://169.254.169.254
> 13:32:02 google.auth.transport._http_client: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/project/project-id
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
> 13:32:02 urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): 
> metadata.google.internal:80
> 13:32:02 urllib3.connectionpool: DEBUG: http://metadata.google.internal:80 
> "GET /computeMetadata/v1/instance/service-accounts/default/?recursive=true 
> HTTP/1.1" 200 144
> 13:32:02 google.auth.transport.requests: DEBUG: Making request: GET 
> http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/844138762903-comp...@developer.gserviceaccount.com/token
> 13:32:02 

[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=253861&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253861
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:31
Start Date: 04/Jun/19 17:31
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8759: [BEAM-3606] Remove 
guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#issuecomment-498768132
 
 
   R: @youngoli @Ardagan 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253861)
Time Spent: 0.5h  (was: 20m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=253860&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253860
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:31
Start Date: 04/Jun/19 17:31
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8759: [BEAM-3606] Remove 
guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759#issuecomment-498768132
 
 
   R: @youngoli 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253860)
Time Spent: 20m  (was: 10m)

> Improve our relationship, or lack thereof, with Guava
> -
>
> Key: BEAM-3606
> URL: https://issues.apache.org/jira/browse/BEAM-3606
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is an umbrella task for things that might move us off Guava, such as 
> replicating our own little utilities or moving to Java 8 features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3606) Improve our relationship, or lack thereof, with Guava

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3606?focusedWorklogId=253858&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253858
 ]

ASF GitHub Bot logged work on BEAM-3606:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:27
Start Date: 04/Jun/19 17:27
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8759: [BEAM-3606] 
Remove guava-testlib from sdks/java/core main dependency set
URL: https://github.com/apache/beam/pull/8759
 
 
   This replaces the usage of an EqualsTester with our own implementation.
   
   The eventual goal is to stop shading guava as a default and then eventually 
to stop shading modules by default.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.or

[jira] [Updated] (BEAM-7259) Add monthly aggregated view for "Stability critical jobs status - Greenness per week"

2019-06-04 Thread Daniel Oliveira (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-7259:
--
Status: Open  (was: Triage Needed)

> Add monthly aggregated view for "Stability critical jobs status - Greenness 
> per week"
> -
>
> Key: BEAM-7259
> URL: https://issues.apache.org/jira/browse/BEAM-7259
> Project: Beam
>  Issue Type: New Feature
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Mikhail Gryzykhin
>Priority: Minor
>
> Hey Mikhail, could you create a view of the Greenness metric in Grafana 
> ([Stability Critical Jobs 
> Status|http://104.154.241.245/d/McTAiu0ik/stability-critical-jobs-status?orgId=1])
>  that's aggregated per month? Specifically, as an average greenness for an 
> entire calendar month (ex. April 1 - April 30).
> I need to track this metric and seeing the weekly greenness makes it hard to 
> average over a month, especially because different weeks may have different 
> numbers of runs, and they don't line up perfectly over a calendar month. I 
> already tried changing the timespan in the upper right, but when I set it to 
> a month range the graph's points are still one point per week instead of a 
> monthly average.
> Also just to be clear, this shouldn't replace the existing graph, just be a 
> new one placed wherever makes the most sense.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-7259) Add monthly aggregated view for "Stability critical jobs status - Greenness per week"

2019-06-04 Thread Daniel Oliveira (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira resolved BEAM-7259.
---
   Resolution: Implemented
Fix Version/s: Not applicable

> Add monthly aggregated view for "Stability critical jobs status - Greenness 
> per week"
> -
>
> Key: BEAM-7259
> URL: https://issues.apache.org/jira/browse/BEAM-7259
> Project: Beam
>  Issue Type: New Feature
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Mikhail Gryzykhin
>Priority: Minor
> Fix For: Not applicable
>
>
> Hey Mikhail, could you create a view of the Greenness metric in Grafana 
> ([Stability Critical Jobs 
> Status|http://104.154.241.245/d/McTAiu0ik/stability-critical-jobs-status?orgId=1])
>  that's aggregated per month? Specifically, as an average greenness for an 
> entire calendar month (ex. April 1 - April 30).
> I need to track this metric and seeing the weekly greenness makes it hard to 
> average over a month, especially because different weeks may have different 
> numbers of runs, and they don't line up perfectly over a calendar month. I 
> already tried changing the timespan in the upper right, but when I set it to 
> a month range the graph's points are still one point per week instead of a 
> monthly average.
> Also just to be clear, this shouldn't replace the existing graph, just be a 
> new one placed wherever makes the most sense.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7259) Add monthly aggregated view for "Stability critical jobs status - Greenness per week"

2019-06-04 Thread Daniel Oliveira (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855920#comment-16855920
 ] 

Daniel Oliveira commented on BEAM-7259:
---

This was added: 

http://104.154.241.245/d/McTAiu0ik/stability-critical-jobs-status?orgId=1
https://github.com/apache/beam/pull/8630

Forgot to update this bug. I'll mark it closed.

> Add monthly aggregated view for "Stability critical jobs status - Greenness 
> per week"
> -
>
> Key: BEAM-7259
> URL: https://issues.apache.org/jira/browse/BEAM-7259
> Project: Beam
>  Issue Type: New Feature
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Mikhail Gryzykhin
>Priority: Minor
>
> Hey Mikhail, could you create a view of the Greenness metric in Grafana 
> ([Stability Critical Jobs 
> Status|http://104.154.241.245/d/McTAiu0ik/stability-critical-jobs-status?orgId=1])
>  that's aggregated per month? Specifically, as an average greenness for an 
> entire calendar month (ex. April 1 - April 30).
> I need to track this metric and seeing the weekly greenness makes it hard to 
> average over a month, especially because different weeks may have different 
> numbers of runs, and they don't line up perfectly over a calendar month. I 
> already tried changing the timespan in the upper right, but when I set it to 
> a month range the graph's points are still one point per week instead of a 
> monthly average.
> Also just to be clear, this shouldn't replace the existing graph, just be a 
> new one placed wherever makes the most sense.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7488) Grafana request: Add monthly aggregated view for "Precommit Test Latency - Time In Queue"

2019-06-04 Thread Daniel Oliveira (JIRA)
Daniel Oliveira created BEAM-7488:
-

 Summary: Grafana request: Add monthly aggregated view for 
"Precommit Test Latency - Time In Queue"
 Key: BEAM-7488
 URL: https://issues.apache.org/jira/browse/BEAM-7488
 Project: Beam
  Issue Type: New Feature
  Components: project-management, testing
Reporter: Daniel Oliveira
Assignee: Mikhail Gryzykhin


Similar to [BEAM-7259|https://issues.apache.org/jira/browse/BEAM-7259], was 
hoping to get this metric in a view that's aggregated per month. In this case 
averaging all the queue times seems like it would work well. Even better if you 
could average only the 95th percentile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7357) Kinesis IO.write throws LimitExceededException

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7357?focusedWorklogId=253855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253855
 ]

ASF GitHub Bot logged work on BEAM-7357:


Author: ASF GitHub Bot
Created on: 04/Jun/19 17:16
Start Date: 04/Jun/19 17:16
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on pull request #8758: 
[BEAM-7357] Remove stream existence check
URL: https://github.com/apache/beam/pull/8758
 
 
   AWS Kinesis has a limit of 10 transactions of `DescribeStream` (which is 
used to check a stream) call per second per account. It will fail in case if we 
run a pipeline with many (>10) workers. So, we move the responsibility of this 
checking to AWS Kinesis.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompleted

[jira] [Work logged] (BEAM-7141) Expose kv and window parameters for on_timer

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7141?focusedWorklogId=253837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253837
 ]

ASF GitHub Bot logged work on BEAM-7141:


Author: ASF GitHub Bot
Created on: 04/Jun/19 16:27
Start Date: 04/Jun/19 16:27
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #8739: BEAM-7141: add 
key value timer callback
URL: https://github.com/apache/beam/pull/8739#discussion_r290385191
 
 

 ##
 File path: sdks/python/apache_beam/runners/common.py
 ##
 @@ -183,6 +184,8 @@ def __init__(self, obj_to_invoke, method_name):
 self.timestamp_arg_name = kw
   elif v == core.DoFn.WindowParam:
 self.window_arg_name = kw
+  elif v == core.DoFn.KeyParam:
 
 Review comment:
   You have a valid a point. There may not always be a key that can be passed 
to process() method. Although sometimes there might be.
   
   Now that we have a KeyParam, a user might right a process method using that 
e.g. (process(mykey=KeyParam). What would be the reasonable thing to do in this 
case:
   
   1. We can do nothing and mykey will literally have the value KeyParam.
   2. We can fail with an error saying that KeyParam is not a valid parameter 
for process method.
   3. We can try to pass the key (e.g. k, v = element) and set mykey = k. And 
if that fails (i.e. element is not a K,V) we can fail.
   
   I was leaning towards the third option. I think second option would also be 
fine and the first option will be confusing.
   
   What do you think?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253837)
Time Spent: 5h  (was: 4h 50m)

> Expose kv and window parameters for on_timer
> 
>
> Key: BEAM-7141
> URL: https://issues.apache.org/jira/browse/BEAM-7141
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> We would like to have access to key and window inside the timer callback. 
> Without, it is also difficult to debug. We run into this while working on 
> BEAM-7112



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark AVRO Serialisation issue with unbounded Source

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=253835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253835
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 04/Jun/19 16:20
Start Date: 04/Jun/19 16:20
Worklog Time Spent: 10m 
  Work Description: zouabimourad commented on pull request #8757: 
[BEAM-7427] Fix JmsCheckpointMark Avro Encoding
URL: https://github.com/apache/beam/pull/8757
 
 
   `JmsCheckpointMark` can't be serialized with AvroCoder beause of the inner 
class `State`
   Avro schema generation of `JmsCheckpointMark` fails
   the externalisation of the inner class fixes the issue.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/bea

[jira] [Resolved] (BEAM-6695) Latest transform for Python SDK

2019-06-04 Thread Tanay Tummalapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanay Tummalapalli resolved BEAM-6695.
--
Resolution: Fixed

PR merged.

> Latest transform for Python SDK
> ---
>
> Key: BEAM-6695
> URL: https://issues.apache.org/jira/browse/BEAM-6695
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Tanay Tummalapalli
>Priority: Minor
> Fix For: 2.14.0
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Add a PTransform} and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6695) Latest transform for Python SDK

2019-06-04 Thread Tanay Tummalapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanay Tummalapalli updated BEAM-6695:
-
Fix Version/s: 2.14.0

> Latest transform for Python SDK
> ---
>
> Key: BEAM-6695
> URL: https://issues.apache.org/jira/browse/BEAM-6695
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Tanay Tummalapalli
>Priority: Minor
> Fix For: 2.14.0
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Add a PTransform} and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=253757&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253757
 ]

ASF GitHub Bot logged work on BEAM-7437:


Author: ASF GitHub Bot
Created on: 04/Jun/19 14:09
Start Date: 04/Jun/19 14:09
Worklog Time Spent: 10m 
  Work Description: ttanay commented on issue #8748: [BEAM-7437] BQ 
integration test for streaming inserts in streaming
URL: https://github.com/apache/beam/pull/8748#issuecomment-498688272
 
 
   R: @pabloem 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253757)
Time Spent: 50m  (was: 40m)

> Integration Test for BQ streaming inserts for streaming pipelines
> -
>
> Key: BEAM-7437
> URL: https://issues.apache.org/jira/browse/BEAM-7437
> Project: Beam
>  Issue Type: Test
>  Components: io-python-gcp
>Affects Versions: 2.12.0
>Reporter: Tanay Tummalapalli
>Assignee: Tanay Tummalapalli
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Integration Test for BigQuery Sink using Streaming Inserts for streaming 
> pipelines.
> Integration tests currently exist for batch pipelines, it can also be added 
> for streaming pipelines using TestStream. This will be a precursor to the 
> failing integration test to be added for [BEAM-6611| 
> https://issues.apache.org/jira/browse/BEAM-6611].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines

2019-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=253755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253755
 ]

ASF GitHub Bot logged work on BEAM-7437:


Author: ASF GitHub Bot
Created on: 04/Jun/19 14:09
Start Date: 04/Jun/19 14:09
Worklog Time Spent: 10m 
  Work Description: ttanay commented on issue #8748: [BEAM-7437] BQ 
integration test for streaming inserts in streaming
URL: https://github.com/apache/beam/pull/8748#issuecomment-498688165
 
 
   The test - `test_multiple_destinations_transform_streaming` in 
`BigQueryStreamingInsertTransformIntegrationTests` passes on the 
TestDirectRunner and skips on TestDataflowRunner as expected.
   The failure is not because of the test added in this PR.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 253755)
Time Spent: 40m  (was: 0.5h)

> Integration Test for BQ streaming inserts for streaming pipelines
> -
>
> Key: BEAM-7437
> URL: https://issues.apache.org/jira/browse/BEAM-7437
> Project: Beam
>  Issue Type: Test
>  Components: io-python-gcp
>Affects Versions: 2.12.0
>Reporter: Tanay Tummalapalli
>Assignee: Tanay Tummalapalli
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Integration Test for BigQuery Sink using Streaming Inserts for streaming 
> pipelines.
> Integration tests currently exist for batch pipelines, it can also be added 
> for streaming pipelines using TestStream. This will be a precursor to the 
> failing integration test to be added for [BEAM-6611| 
> https://issues.apache.org/jira/browse/BEAM-6611].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   >