[jira] [Work logged] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?focusedWorklogId=198506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198506
 ]

ASF GitHub Bot logged work on BEAM-6583:


Author: ASF GitHub Bot
Created on: 14/Feb/19 05:57
Start Date: 14/Feb/19 05:57
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7839: [BEAM-6583] 
Update minimal Python 3 requirements.
URL: https://github.com/apache/beam/pull/7839
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198506)
Time Spent: 1h  (was: 50m)

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6602) Support schemas in BigQueryIO.Write

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6602?focusedWorklogId=198466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198466
 ]

ASF GitHub Bot logged work on BEAM-6602:


Author: ASF GitHub Bot
Created on: 14/Feb/19 04:43
Start Date: 14/Feb/19 04:43
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #7840:  [BEAM-6602] 
BigQueryIO.write natively understands Beam schemas
URL: https://github.com/apache/beam/pull/7840
 
 
   If the input PCollection has a schema, BigQueryIO can automatically infer a 
BigQuery table schema and convert the input type into a TableRow. A new option, 
useBeamSchema, is introduced to enable this behavior.
   
   This PR still needs more unit tests and an integration test; however, I am 
sending it for review now to get feedback on the BigQueryIO changes.
 



Issue Time Tracking
---

Worklog Id: (was: 198466)
Time Spent: 1h 50m  (was: 1h 40m)

> Support schemas in BigQueryIO.Write
> ---
>
> Key: BEAM-6602
> URL: https://issues.apache.org/jira/browse/BEAM-6602
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Labels: triaged
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?focusedWorklogId=198457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198457
 ]

ASF GitHub Bot logged work on BEAM-6583:


Author: ASF GitHub Bot
Created on: 14/Feb/19 04:18
Start Date: 14/Feb/19 04:18
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #7839: [BEAM-6583] Update 
minimal Python 3 requirements.
URL: https://github.com/apache/beam/pull/7839#issuecomment-463483283
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 198457)
Time Spent: 50m  (was: 40m)

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.





[jira] [Work logged] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?focusedWorklogId=198440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198440
 ]

ASF GitHub Bot logged work on BEAM-6583:


Author: ASF GitHub Bot
Created on: 14/Feb/19 03:19
Start Date: 14/Feb/19 03:19
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #7839: [BEAM-6583] Update 
minimal Python 3 requirements.
URL: https://github.com/apache/beam/pull/7839#issuecomment-463471804
 
 
   Run Dataflow Python ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 198440)
Time Spent: 0.5h  (was: 20m)

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.





[jira] [Work logged] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?focusedWorklogId=198441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198441
 ]

ASF GitHub Bot logged work on BEAM-6583:


Author: ASF GitHub Bot
Created on: 14/Feb/19 03:20
Start Date: 14/Feb/19 03:20
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #7839: [BEAM-6583] Update 
minimal Python 3 requirements.
URL: https://github.com/apache/beam/pull/7839#issuecomment-463472007
 
 
   Run Python Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 198441)
Time Spent: 40m  (was: 0.5h)

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.





[jira] [Work logged] (BEAM-6392) Add support for new BigQuery streaming read API to BigQueryIO

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6392?focusedWorklogId=198439&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198439
 ]

ASF GitHub Bot logged work on BEAM-6392:


Author: ASF GitHub Bot
Created on: 14/Feb/19 03:04
Start Date: 14/Feb/19 03:04
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #7441: 
[BEAM-6392] Add support for the BigQuery read API to BigQueryIO.
URL: https://github.com/apache/beam/pull/7441
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 198439)
Time Spent: 4h 10m  (was: 4h)

> Add support for new BigQuery streaming read API to BigQueryIO
> -
>
> Key: BEAM-6392
> URL: https://issues.apache.org/jira/browse/BEAM-6392
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Kenneth Jung
>Assignee: Kenneth Jung
>Priority: Major
>  Labels: triaged
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> BigQuery has developed a new streaming egress API which will soon reach 
> public availability. Add support for the new API in BigQueryIO.





[jira] [Work logged] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?focusedWorklogId=198428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198428
 ]

ASF GitHub Bot logged work on BEAM-6583:


Author: ASF GitHub Bot
Created on: 14/Feb/19 02:29
Start Date: 14/Feb/19 02:29
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #7839: [BEAM-6583] Update 
minimal Python 3 requirements.
URL: https://github.com/apache/beam/pull/7839#issuecomment-463461306
 
 
   R: @aaltay @charlesccychen 
 



Issue Time Tracking
---

Worklog Id: (was: 198428)
Time Spent: 20m  (was: 10m)

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.





[jira] [Commented] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767819#comment-16767819
 ] 

Valentyn Tymofieiev commented on BEAM-6583:
---

I propose:
- to make 2.11 installable on Python 3.5 or higher;
- to add the 3.5 version classifier as 'supported' in setup.py (which will be 
reflected on PyPI). 3.5 is the only Python 3 version that has been reasonably 
tested on Jenkins;
- to let runners impose additional restrictions based on the Python 3 versions 
they support.

This is reflected in https://github.com/apache/beam/pull/7839. 
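The version policy proposed above can be checked mechanically with the `packaging` library. The specifier string below is an illustrative assumption for a ">= 2.7, Python 3 floor at 3.5" policy, not necessarily the exact string merged in PR #7839:

```python
# Illustrative sketch of a "Python 2.7, or Python 3.5+" install policy,
# expressed as a python_requires-style specifier. The exact string is an
# assumption for illustration only.
from packaging.specifiers import SpecifierSet

python_requires = SpecifierSet('>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*')

assert '2.7.15' in python_requires     # Python 2.7 stays installable
assert '3.5.6' in python_requires      # 3.5 becomes the Python 3 floor
assert '3.4.9' not in python_requires  # older Python 3 releases are rejected
```

The same string would go into setup.py's `python_requires` argument, alongside a `'Programming Language :: Python :: 3.5'` classifier, so that pip refuses installation on unsupported interpreters.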

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.





[jira] [Work logged] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?focusedWorklogId=198424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198424
 ]

ASF GitHub Bot logged work on BEAM-6583:


Author: ASF GitHub Bot
Created on: 14/Feb/19 01:56
Start Date: 14/Feb/19 01:56
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7839: [BEAM-6583] 
Update minimal Python 3 requirements.
URL: https://github.com/apache/beam/pull/7839
 
 
   Post-Commit Tests Status (on master branch): table of build-status badges 
   for the Go, Java, and Python SDKs across the Apex, Dataflow, Flink, 
   Gearpump, Samza, and Spark runners (each badge links to the corresponding 
   builds.apache.org job).
 



Issue Time Tracking
---

Worklog Id: (was: 198424)
Time Spent: 10m
Remaining Estimate: 0h

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.

[jira] [Commented] (BEAM-5846) inconsistent result from pylint

2019-02-13 Thread Heejong Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767787#comment-16767787
 ] 

Heejong Lee commented on BEAM-5846:
---

The workaround is to always use `./gradlew lint` instead of `run_pylint.sh`. 
It seems there is nothing to do on Beam's side. Closing.

> inconsistent result from pylint
> ---
>
> Key: BEAM-5846
> URL: https://issues.apache.org/jira/browse/BEAM-5846
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Labels: triaged
> Attachments: lint_100.txt, lint_j1_100.txt
>
>
> scripts/run_pylint.sh returns inconsistent results. 
> {code:java}
> beam/sdks/python$ for i in `seq 100`; do scripts/run_pylint.sh; done > 
> lint_100.txt
> beam/sdks/python$ grep slots-on-old-class lint_100.txt | wc -l
> 100
> beam/sdks/python$ grep no-self-argument lint_100.txt | wc -l
> 42
> {code}
>  
> Tested @ 0c8ccae9aa608f4d64b22c08d57b9aaa8724bfee
> {code:java}
> beam/sdks/python$ pip freeze
> apache-beam==2.9.0.dev0
> avro==1.8.2
> cachetools==2.1.0
> certifi==2018.8.24
> chardet==3.0.4
> crcmod==1.7
> dill==0.2.8.2
> docopt==0.6.2
> enum34==1.1.6
> fastavro==0.21.9
> fasteners==0.14.1
> funcsigs==1.0.2
> future==0.16.0
> futures==3.2.0
> gapic-google-cloud-pubsub-v1==0.15.4
> google-apitools==0.5.20
> google-auth==1.5.1
> google-auth-httplib2==0.0.3
> google-cloud-bigquery==0.25.0
> google-cloud-core==0.25.0
> google-cloud-pubsub==0.26.0
> google-gax==0.15.16
> googleapis-common-protos==1.5.3
> googledatastore==7.0.1
> grpc-google-iam-v1==0.11.4
> grpcio==1.15.0
> hdfs==2.1.0
> httplib2==0.11.3
> idna==2.7
> mock==2.0.0
> monotonic==1.5
> nose==1.3.7
> numpy==1.15.2
> oauth2client==4.1.3
> pbr==4.3.0
> ply==3.8
> proto-google-cloud-datastore-v1==0.90.4
> proto-google-cloud-pubsub-v1==0.15.4
> protobuf==3.6.1
> pyarrow==0.11.0
> pyasn1==0.4.4
> pyasn1-modules==0.2.2
> pydot==1.2.4
> PyHamcrest==1.9.0
> pyparsing==2.2.2
> pytz==2018.4
> PyVCF==0.6.8
> PyYAML==3.13
> requests==2.19.1
> rsa==4.0
> six==1.11.0
> typing==3.6.6
> urllib3==1.23
> beam/sdks/python$ python --version
> Python 2.7.13
> {code}





[jira] [Updated] (BEAM-6658) Add kms_key to BigQuery transforms, pass to Dataflow

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-6658:

Fix Version/s: (was: Not applicable)
   2.11.0

> Add kms_key to BigQuery transforms, pass to Dataflow
> 
>
> Key: BEAM-6658
> URL: https://issues.apache.org/jira/browse/BEAM-6658
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Blocker
> Fix For: 2.11.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6488) Portable Flink runner support for running cross-language transforms

2019-02-13 Thread Heejong Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee resolved BEAM-6488.
---
   Resolution: Fixed
Fix Version/s: 2.11.0

Merged.

> Portable Flink runner support for running cross-language transforms
> ---
>
> Key: BEAM-6488
> URL: https://issues.apache.org/jira/browse/BEAM-6488
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, runner-core, runner-flink, sdk-java-core, 
> sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> To support running cross-language transforms, the portable Flink runner needs 
> to support executing pipelines whose steps are defined to execute in different 
> environments.
> I believe this support is already there. If that is the case we should 
> validate that and add any missing tests.
> If there are missing pieces, we should figure out details and create more 
> JIRAs as needed. 
> CC: [~angoenka]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6636) [beam_PostCommit_Java] testGcsWriteWithKmsKey test fails

2019-02-13 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767784#comment-16767784
 ] 

Udi Meiri commented on BEAM-6636:
-

I believe this is fixed. Java post-commits are currently green.

> [beam_PostCommit_Java] testGcsWriteWithKmsKey test fails
> 
>
> Key: BEAM-6636
> URL: https://issues.apache.org/jira/browse/BEAM-6636
> Project: Beam
>  Issue Type: New Feature
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Udi Meiri
>Priority: Major
>  Labels: currently-failing
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PostCommit_Java/2563/]
> Test failed, no detailed logs available.
> Relevant logs:
> |org.apache.beam.sdk.io.gcp.storage.GcsKmsKeyIT > testGcsWriteWithKmsKey 
> FAILED|
> | java.lang.IllegalArgumentException at GcsKmsKeyIT.java:73|





[jira] [Closed] (BEAM-5846) inconsistent result from pylint

2019-02-13 Thread Heejong Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee closed BEAM-5846.
-
   Resolution: Won't Fix
Fix Version/s: Not applicable

> inconsistent result from pylint
> ---
>
> Key: BEAM-5846
> URL: https://issues.apache.org/jira/browse/BEAM-5846
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Labels: triaged
> Fix For: Not applicable
>
> Attachments: lint_100.txt, lint_j1_100.txt
>
>
> scripts/run_pylint.sh returns inconsistent results. 
> {code:java}
> beam/sdks/python$ for i in `seq 100`; do scripts/run_pylint.sh; done > 
> lint_100.txt
> beam/sdks/python$ grep slots-on-old-class lint_100.txt | wc -l
> 100
> beam/sdks/python$ grep no-self-argument lint_100.txt | wc -l
> 42
> {code}
>  
> Tested @ 0c8ccae9aa608f4d64b22c08d57b9aaa8724bfee
> {code:java}
> beam/sdks/python$ pip freeze
> apache-beam==2.9.0.dev0
> avro==1.8.2
> cachetools==2.1.0
> certifi==2018.8.24
> chardet==3.0.4
> crcmod==1.7
> dill==0.2.8.2
> docopt==0.6.2
> enum34==1.1.6
> fastavro==0.21.9
> fasteners==0.14.1
> funcsigs==1.0.2
> future==0.16.0
> futures==3.2.0
> gapic-google-cloud-pubsub-v1==0.15.4
> google-apitools==0.5.20
> google-auth==1.5.1
> google-auth-httplib2==0.0.3
> google-cloud-bigquery==0.25.0
> google-cloud-core==0.25.0
> google-cloud-pubsub==0.26.0
> google-gax==0.15.16
> googleapis-common-protos==1.5.3
> googledatastore==7.0.1
> grpc-google-iam-v1==0.11.4
> grpcio==1.15.0
> hdfs==2.1.0
> httplib2==0.11.3
> idna==2.7
> mock==2.0.0
> monotonic==1.5
> nose==1.3.7
> numpy==1.15.2
> oauth2client==4.1.3
> pbr==4.3.0
> ply==3.8
> proto-google-cloud-datastore-v1==0.90.4
> proto-google-cloud-pubsub-v1==0.15.4
> protobuf==3.6.1
> pyarrow==0.11.0
> pyasn1==0.4.4
> pyasn1-modules==0.2.2
> pydot==1.2.4
> PyHamcrest==1.9.0
> pyparsing==2.2.2
> pytz==2018.4
> PyVCF==0.6.8
> PyYAML==3.13
> requests==2.19.1
> rsa==4.0
> six==1.11.0
> typing==3.6.6
> urllib3==1.23
> beam/sdks/python$ python --version
> Python 2.7.13
> {code}





[jira] [Assigned] (BEAM-5173) org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages is flaky

2019-02-13 Thread Brian Hulette (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette reassigned BEAM-5173:
---

Assignee: Brian Hulette

> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages
>  is flaky
> 
>
> Key: BEAM-5173
> URL: https://issues.apache.org/jira/browse/BEAM-5173
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Valentyn Tymofieiev
>Assignee: Brian Hulette
>Priority: Major
>
> Hi [~lcwik], this test failed in a [recent postcommit 
> build|https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1285/testReport/junit/org.apache.beam.runners.fnexecution.control/RemoteExecutionTest/testExecutionWithMultipleStages].
>  Could you please take a look or help triage to the right owner? Thank you.
> Stack trace: 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: 
> org.apache.beam.vendor.grpc.v1.io.grpc.StatusRuntimeException: CANCELLED: 
> Runner closed connection
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.tearDown(RemoteExecutionTest.java:198)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.re

[jira] [Commented] (BEAM-6636) [beam_PostCommit_Java] testGcsWriteWithKmsKey test fails

2019-02-13 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767785#comment-16767785
 ] 

Udi Meiri commented on BEAM-6636:
-

I believe this is fixed. Java post-commits are currently green.

> [beam_PostCommit_Java] testGcsWriteWithKmsKey test fails
> 
>
> Key: BEAM-6636
> URL: https://issues.apache.org/jira/browse/BEAM-6636
> Project: Beam
>  Issue Type: New Feature
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Udi Meiri
>Priority: Major
>  Labels: currently-failing
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PostCommit_Java/2563/]
> Test failed, no detailed logs available.
> Relevant logs:
> |org.apache.beam.sdk.io.gcp.storage.GcsKmsKeyIT > testGcsWriteWithKmsKey 
> FAILED|
> | java.lang.IllegalArgumentException at GcsKmsKeyIT.java:73|





[jira] [Resolved] (BEAM-6636) [beam_PostCommit_Java] testGcsWriteWithKmsKey test fails

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-6636.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> [beam_PostCommit_Java] testGcsWriteWithKmsKey test fails
> 
>
> Key: BEAM-6636
> URL: https://issues.apache.org/jira/browse/BEAM-6636
> Project: Beam
>  Issue Type: New Feature
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Udi Meiri
>Priority: Major
>  Labels: currently-failing
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PostCommit_Java/2563/]
> Test failed, no detailed logs available.
> Relevant logs:
> |org.apache.beam.sdk.io.gcp.storage.GcsKmsKeyIT > testGcsWriteWithKmsKey 
> FAILED|
> | java.lang.IllegalArgumentException at GcsKmsKeyIT.java:73|





[jira] [Assigned] (BEAM-2801) Implement a BigQuery custom sink

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-2801:
---

Assignee: Pablo Estrada  (was: Udi Meiri)

> Implement a BigQuery custom sink
> 
>
> Key: BEAM-2801
> URL: https://issues.apache.org/jira/browse/BEAM-2801
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>
> Currently the Python SDK has a native (Dataflow) BigQuery sink. We need to 
> implement a custom BigQuery sink to support the following:
> * overcome BigQuery per-load-job quotas by executing multiple load jobs.
> * support SDK-level features such as data-dependent writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5959) Add Cloud KMS support to GCS copies

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-5959:

Summary: Add Cloud KMS support to GCS copies  (was: Add Cloud KMS support 
to GCS creates and copies)

> Add Cloud KMS support to GCS copies
> ---
>
> Key: BEAM-5959
> URL: https://issues.apache.org/jira/browse/BEAM-5959
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Labels: triaged
>  Time Spent: 30h 40m
>  Remaining Estimate: 0h
>
> Beam SDK currently uses the CopyTo GCS API call, which doesn't support 
> copying objects that use Customer-Managed Encryption Keys (CMEK).
> CMEKs are managed in Cloud KMS.
> Items (for Java and Python SDKs):
> - Update clients to versions that support KMS keys.
> - Change copyTo API calls to use rewriteTo (Python - directly, Java - 
> possibly convert copyTo API call to use client library)
> - Add unit tests.
> - Add basic tests (DirectRunner and GCS buckets with CMEK).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5959) Add Cloud KMS support to GCS copies

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-5959.
-
   Resolution: Fixed
Fix Version/s: 2.11.0

> Add Cloud KMS support to GCS copies
> ---
>
> Key: BEAM-5959
> URL: https://issues.apache.org/jira/browse/BEAM-5959
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 30h 40m
>  Remaining Estimate: 0h
>
> Beam SDK currently uses the CopyTo GCS API call, which doesn't support 
> copying objects that use Customer-Managed Encryption Keys (CMEK).
> CMEKs are managed in Cloud KMS.
> Items (for Java and Python SDKs):
> - Update clients to versions that support KMS keys.
> - Change copyTo API calls to use rewriteTo (Python - directly, Java - 
> possibly convert copyTo API call to use client library)
> - Add unit tests.
> - Add basic tests (DirectRunner and GCS buckets with CMEK).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5959) Add Cloud KMS support to GCS creates and copies

2019-02-13 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1676#comment-1676
 ] 

Udi Meiri commented on BEAM-5959:
-

Copy support is done (implemented as a rewrite call).
Create support has been delayed. Might be added later as a --gcsKmsKey flag or 
similar.
KMS can still be used on GCS using bucket default keys. Objects created in a 
bucket will inherit the bucket default KMS key.
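The rewrite-based copy described above is resumable: each rewrite call copies a chunk and returns a token until the copy completes. A minimal sketch of that pagination loop follows; `FakeBlob` is a hypothetical stand-in for the real google-cloud-storage client, which performs the chunked copy server-side.

```python
class FakeBlob:
    """Hypothetical stand-in for a GCS blob: simulates the chunked,
    token-based semantics of the rewriteTo API."""

    def __init__(self, total_bytes, chunk=256):
        self.total = total_bytes
        self.chunk = chunk

    def rewrite(self, token=None):
        # Each call copies one chunk; a non-None token means "resume here".
        done = 0 if token is None else token
        done = min(done + self.chunk, self.total)
        next_token = None if done == self.total else done
        return next_token, done, self.total


def rewrite_until_done(blob):
    """Loop until the service reports completion (token is None)."""
    token = None
    while True:
        token, rewritten, total = blob.rewrite(token)
        if token is None:
            return rewritten
```

Unlike the single-shot copyTo call, this loop survives objects larger than one server-side copy quantum, which is why rewriteTo is required for CMEK-protected objects.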


> Add Cloud KMS support to GCS creates and copies
> ---
>
> Key: BEAM-5959
> URL: https://issues.apache.org/jira/browse/BEAM-5959
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Labels: triaged
>  Time Spent: 30h 40m
>  Remaining Estimate: 0h
>
> Beam SDK currently uses the CopyTo GCS API call, which doesn't support 
> copying objects that use Customer-Managed Encryption Keys (CMEK).
> CMEKs are managed in Cloud KMS.
> Items (for Java and Python SDKs):
> - Update clients to versions that support KMS keys.
> - Change copyTo API calls to use rewriteTo (Python - directly, Java - 
> possibly convert copyTo API call to use client library)
> - Add unit tests.
> - Add basic tests (DirectRunner and GCS buckets with CMEK).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6380) apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed

2019-02-13 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767773#comment-16767773
 ] 

Udi Meiri commented on BEAM-6380:
-

Reference doc: 
https://googleapis.github.io/google-resumable-media-python/latest/google.resumable_media.requests.html#resumable-uploads

> apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed
> ---
>
> Key: BEAM-6380
> URL: https://issues.apache.org/jira/browse/BEAM-6380
> Project: Beam
>  Issue Type: Test
>  Components: test-failures
>Reporter: Boyuan Zhang
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> wordcount test in :pythonPostCommit failed owing to RuntimeError: 
> NotImplementedError [while running 'write/Write/WriteImpl/WriteBundles']
>  
>  https://builds.apache.org/job/beam_PostCommit_Python_Verify/7001/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6668) use add experiment methods (Java and Python)

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-6668:

Labels: beginner easyfix newbie  (was: )

> use add experiment methods (Java and Python)
> 
>
> Key: BEAM-6668
> URL: https://issues.apache.org/jira/browse/BEAM-6668
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Priority: Minor
>  Labels: beginner, easyfix, newbie
>
> Python:
> Convert instances of experiments.append(...)
> to debug_options.add_experiment(...)
> Java:
> Use ExperimentalOptions.addExperiment(...)
> instead of the getExperiments(), modify, setExperiments() pattern.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6669) revert service_default_cmek_config experiment flag

2019-02-13 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-6669:
---

 Summary: revert service_default_cmek_config experiment flag 
 Key: BEAM-6669
 URL: https://issues.apache.org/jira/browse/BEAM-6669
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp, sdk-py-core
Reporter: Udi Meiri
Assignee: Udi Meiri


Do this when --dataflowKmsKey is supported on Dataflow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6668) use add experiment methods (Java and Python)

2019-02-13 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-6668:
---

 Summary: use add experiment methods (Java and Python)
 Key: BEAM-6668
 URL: https://issues.apache.org/jira/browse/BEAM-6668
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp, sdk-py-core
Reporter: Udi Meiri


Python:
Convert instances of experiments.append(...)
to debug_options.add_experiment(...)

Java:
Use ExperimentalOptions.addExperiment(...)
instead of the getExperiments(), modify, setExperiments() pattern.
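The Python side of this conversion might look like the following sketch; `DebugOptions` here is a minimal stand-in rather than Beam's real class, with the method name taken from the issue text.

```python
class DebugOptions:
    """Minimal stand-in for Beam's DebugOptions (sketch only)."""

    def __init__(self):
        self.experiments = []

    def add_experiment(self, experiment):
        # Centralized helper: one place to deduplicate and validate flags.
        if experiment not in self.experiments:
            self.experiments.append(experiment)


opts = DebugOptions()
# Old pattern the issue asks to replace: mutate the raw list directly.
if "beam_fn_api" not in opts.experiments:
    opts.experiments.append("beam_fn_api")
# Preferred pattern: duplicate calls are silently ignored.
opts.add_experiment("beam_fn_api")
```

Funneling all writes through one method is what makes the Java side's addExperiment(...) preferable to the get/modify/set pattern as well.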



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread Ahmet Altay (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-6583:
-

Assignee: Valentyn Tymofieiev  (was: Ahmet Altay)

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.
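The ">= 2.7 <= 3.7" spec can be expressed as a small predicate over the interpreter version; this is an illustrative sketch, not Beam's actual check.

```python
import sys


def version_supported(info=sys.version_info):
    """Does the interpreter satisfy the ">= 2.7, <= 3.7" development spec?"""
    major, minor = info[0], info[1]
    if major == 2:
        return minor >= 7
    if major == 3:
        return minor <= 7
    return False
```

Auditing the claim means running the test suites on each minor version in this range, since the predicate alone says nothing about actual runtime compatibility.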



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread Ahmet Altay (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6583:
--
Fix Version/s: 2.11.0

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Ahmet Altay
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6583) Audit Python 3 version support and refine compatibility spec.

2019-02-13 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767767#comment-16767767
 ] 

Ahmet Altay commented on BEAM-6583:
---

What needs to be done for 2.11?

> Audit Python 3 version support and refine compatibility spec.
> -
>
> Key: BEAM-6583
> URL: https://issues.apache.org/jira/browse/BEAM-6583
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.11.0
>Reporter: Charles Chen
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>  Labels: triaged
> Fix For: 2.11.0
>
>
> Currently, we have a compatibility spec of >= 2.7 <= 3.7 for development.  
> However, we have not validated this (especially Python 3.7) and need to do so 
> and refine this before release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread Ahmet Altay (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-6664.
---
   Resolution: Fixed
Fix Version/s: 2.11.0

> Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow
> -
>
> Key: BEAM-6664
> URL: https://issues.apache.org/jira/browse/BEAM-6664
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5173) org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages is flaky

2019-02-13 Thread Brian Hulette (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767764#comment-16767764
 ] 

Brian Hulette commented on BEAM-5173:
-

Ok I'll try that out locally

> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages
>  is flaky
> 
>
> Key: BEAM-5173
> URL: https://issues.apache.org/jira/browse/BEAM-5173
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Hi [~lcwik], this test failed in a [recent postcommit 
> build|https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1285/testReport/junit/org.apache.beam.runners.fnexecution.control/RemoteExecutionTest/testExecutionWithMultipleStages].
>  Could you please take a look or help triage to the right owner? Thank you.
> Stack trace: 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: 
> org.apache.beam.vendor.grpc.v1.io.grpc.StatusRuntimeException: CANCELLED: 
> Runner closed connection
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.tearDown(RemoteExecutionTest.java:198)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   

[jira] [Created] (BEAM-6667) portableWordCount test doesn't properly cleanup python docker instance

2019-02-13 Thread Heejong Lee (JIRA)
Heejong Lee created BEAM-6667:
-

 Summary: portableWordCount test doesn't properly cleanup python 
docker instance
 Key: BEAM-6667
 URL: https://issues.apache.org/jira/browse/BEAM-6667
 Project: Beam
  Issue Type: Bug
  Components: java-fn-execution, runner-flink
Reporter: Heejong Lee
Assignee: Heejong Lee


The portableWordCount test cleans up the Flink job server Docker instance but not 
the Python FnHarness Docker instance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6666) subprocess.Popen hangs after use of gRPC channel

2019-02-13 Thread Heejong Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767757#comment-16767757
 ] 

Heejong Lee commented on BEAM-:
---

Multiple related issues have been reported against gRPC since 2017, but the 
problem doesn't appear to be fundamentally fixed.
 * [https://github.com/grpc/grpc/issues/17986]
 * [https://github.com/grpc/grpc/issues/13873]
 * [https://github.com/grpc/grpc/issues/13998]
 * [https://github.com/grpc/grpc/issues/15334]
 * [https://github.com/grpc/grpc/issues/15557]

 

> subprocess.Popen hangs after use of gRPC channel
> 
>
> Key: BEAM-
> URL: https://issues.apache.org/jira/browse/BEAM-
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>
> subprocess.Popen randomly hangs after use of a gRPC channel. This makes the 
> cross-language wordcount test fail because the test uses Popen to launch the 
> Dockerized Flink job server in `PortableRunner.run_pipeline` after using the 
> gRPC channel for the expansion service in `ExternalTransform.expand`. A few 
> symptoms are listed below:
>  * Hanging at `docker_path = check_output(['which', 'docker']).strip()`
>  * Hanging at `self.docker_process = Popen(cmd)`
>  * Crashing with `assertion failed: pthread_mutex_lock(mu) == 0` message
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6666) subprocess.Popen hangs after use of gRPC channel

2019-02-13 Thread Heejong Lee (JIRA)
Heejong Lee created BEAM-:
-

 Summary: subprocess.Popen hangs after use of gRPC channel
 Key: BEAM-
 URL: https://issues.apache.org/jira/browse/BEAM-
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Heejong Lee
Assignee: Heejong Lee


subprocess.Popen randomly hangs after use of a gRPC channel. This makes the 
cross-language wordcount test fail because the test uses Popen to launch the 
Dockerized Flink job server in `PortableRunner.run_pipeline` after using the 
gRPC channel for the expansion service in `ExternalTransform.expand`. A few 
symptoms are listed below:
 * Hanging at `docker_path = check_output(['which', 'docker']).strip()`
 * Hanging at `self.docker_process = Popen(cmd)`
 * Crashing with `assertion failed: pthread_mutex_lock(mu) == 0` message
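A common mitigation for this class of hang is to ensure the gRPC channel is fully closed (releasing its background threads) before any fork/exec. The sketch below is hedged: `StubChannel` stands in for a real grpc.Channel, and `expand_then_launch` is a hypothetical helper, not Beam code.

```python
import subprocess
from contextlib import closing


class StubChannel:
    """Stands in for grpc.Channel; a real channel owns background threads
    that can interact badly with fork()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def expand_then_launch(make_channel, cmd):
    # Close the channel *before* launching the subprocess, rather than
    # keeping it open across the Popen/check_output call.
    with closing(make_channel()) as channel:
        pass  # ... call the expansion service over `channel` ...
    return subprocess.check_output(cmd, text=True)


output = expand_then_launch(StubChannel, ["echo", "ok"])
```

Whether closing the channel is sufficient depends on the gRPC version; the issues linked above suggest the fork-safety problem is not fully solved upstream.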

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?focusedWorklogId=198418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198418
 ]

ASF GitHub Bot logged work on BEAM-6664:


Author: ASF GitHub Bot
Created on: 14/Feb/19 00:55
Start Date: 14/Feb/19 00:55
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7835: [BEAM-6664] 
Temporarily convert dataflowKmsKey flag to experimental
URL: https://github.com/apache/beam/pull/7835
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198418)
Time Spent: 1h  (was: 50m)

> Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow
> -
>
> Key: BEAM-6664
> URL: https://issues.apache.org/jira/browse/BEAM-6664
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5173) org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages is flaky

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767752#comment-16767752
 ] 

Valentyn Tymofieiev commented on BEAM-5173:
---

I recommend running this test 1000 times to see if it is still failing.


> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages
>  is flaky
> 
>
> Key: BEAM-5173
> URL: https://issues.apache.org/jira/browse/BEAM-5173
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Hi [~lcwik], this test failed in a [recent postcommit 
> build|https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1285/testReport/junit/org.apache.beam.runners.fnexecution.control/RemoteExecutionTest/testExecutionWithMultipleStages].
>  Could you please take a look or help triage to the right owner? Thank you.
> Stack trace: 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: 
> org.apache.beam.vendor.grpc.v1.io.grpc.StatusRuntimeException: CANCELLED: 
> Runner closed connection
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.tearDown(RemoteExecutionTest.java:198)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:

[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=198417&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198417
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 14/Feb/19 00:51
Start Date: 14/Feb/19 00:51
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #7655: [BEAM-6553] A Python 
SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#issuecomment-463438577
 
 
   Run Python PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198417)
Time Spent: 9.5h  (was: 9h 20m)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6380) apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6380?focusedWorklogId=198414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198414
 ]

ASF GitHub Bot logged work on BEAM-6380:


Author: ASF GitHub Bot
Created on: 14/Feb/19 00:47
Start Date: 14/Feb/19 00:47
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7837: [BEAM-6380] Add 
debugging output to PipeStream
URL: https://github.com/apache/beam/pull/7837#issuecomment-463437472
 
 
   R: @charlesccychen 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198414)
Time Spent: 20m  (was: 10m)

> apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed
> ---
>
> Key: BEAM-6380
> URL: https://issues.apache.org/jira/browse/BEAM-6380
> Project: Beam
>  Issue Type: Test
>  Components: test-failures
>Reporter: Boyuan Zhang
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> wordcount test in :pythonPostCommit failed owing to RuntimeError: 
> NotImplementedError [while running 'write/Write/WriteImpl/WriteBundles']
>  
>  https://builds.apache.org/job/beam_PostCommit_Python_Verify/7001/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6380) apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6380?focusedWorklogId=198415&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198415
 ]

ASF GitHub Bot logged work on BEAM-6380:


Author: ASF GitHub Bot
Created on: 14/Feb/19 00:48
Start Date: 14/Feb/19 00:48
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #7837: [BEAM-6380] 
Add debugging output to PipeStream
URL: https://github.com/apache/beam/pull/7837#issuecomment-463437793
 
 
   Thanks, this LGTM.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198415)
Time Spent: 0.5h  (was: 20m)

> apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed
> ---
>
> Key: BEAM-6380
> URL: https://issues.apache.org/jira/browse/BEAM-6380
> Project: Beam
>  Issue Type: Test
>  Components: test-failures
>Reporter: Boyuan Zhang
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> wordcount test in :pythonPostCommit failed owing to RuntimeError: 
> NotImplementedError [while running 'write/Write/WriteImpl/WriteBundles']
>  
>  https://builds.apache.org/job/beam_PostCommit_Python_Verify/7001/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?focusedWorklogId=198411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198411
 ]

ASF GitHub Bot logged work on BEAM-6664:


Author: ASF GitHub Bot
Created on: 14/Feb/19 00:13
Start Date: 14/Feb/19 00:13
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7835: [BEAM-6664] 
Temporarily convert dataflowKmsKey flag to experimental
URL: https://github.com/apache/beam/pull/7835#discussion_r256644550
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -590,6 +590,12 @@ def _add_argparse_args(cls, parser):
  'enabled with this flag. Please sync with the owners of the runner '
  'before enabling any experiments.'))
 
+  def add_experiment(self, experiment):
 
 Review comment:
    This is great. Please file a JIRA to convert existing uses of 
`self.debug_options.experiments.append(...)` across the codebase to use this.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198411)
Time Spent: 50m  (was: 40m)

> Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow
> -
>
> Key: BEAM-6664
> URL: https://issues.apache.org/jira/browse/BEAM-6664
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5173) org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages is flaky

2019-02-13 Thread Brian Hulette (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767726#comment-16767726
 ] 

Brian Hulette commented on BEAM-5173:
-

I think this may already be resolved. The last time it failed in 
Java_PreCommit_Cron was #623 on Nov 23, 2018: 
https://builds.apache.org/job/beam_PreCommit_Java_Cron/623/

Cron Job History 
https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/testReport/junit/org.apache.beam.runners.fnexecution.control/RemoteExecutionTest/testExecutionWithMultipleStages/history/

[~tvalentyn] would you be ok closing, or do you think it may just be infrequent?

> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testExecutionWithMultipleStages
>  is flaky
> 
>
> Key: BEAM-5173
> URL: https://issues.apache.org/jira/browse/BEAM-5173
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Hi [~lcwik], this test failed in a [recent postcommit 
> build|https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1285/testReport/junit/org.apache.beam.runners.fnexecution.control/RemoteExecutionTest/testExecutionWithMultipleStages].
>  Could you please take a look or help triage to the right owner? Thank you.
> Stack trace: 
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.util.concurrent.ExecutionException: 
> org.apache.beam.vendor.grpc.v1.io.grpc.StatusRuntimeException: CANCELLED: 
> Runner closed connection
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.tearDown(RemoteExecutionTest.java:198)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   

[jira] [Work logged] (BEAM-6380) apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6380?focusedWorklogId=198408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198408
 ]

ASF GitHub Bot logged work on BEAM-6380:


Author: ASF GitHub Bot
Created on: 14/Feb/19 00:04
Start Date: 14/Feb/19 00:04
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #7837: [BEAM-6380] Add 
some debugging output.
URL: https://github.com/apache/beam/pull/7837
 
 
   IIUC, offset should be None (which would be a bug in apitools).
   Otherwise, I'd like to verify that `position - last_position <= chunkSize`
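
The invariant mentioned above could be checked with a small debugging helper along these lines. This is a hedged sketch only; `last_position`, `position`, and `chunk_size` are assumed names for illustration, not apitools' actual API.

```python
def check_chunk_progress(last_position, position, chunk_size):
  # A chunked download should advance by at most one chunk per request;
  # a larger jump would point at the offset bookkeeping bug discussed above.
  delta = position - last_position
  if delta > chunk_size:
    raise AssertionError(
        'advanced %d bytes, more than chunk size %d' % (delta, chunk_size))
  return delta
```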
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Wo

[jira] [Assigned] (BEAM-6380) apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed

2019-02-13 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-6380:
---

Assignee: Udi Meiri

> apache_beam.examples.wordcount_it_test.WordCountIT with DirectRunner failed
> ---
>
> Key: BEAM-6380
> URL: https://issues.apache.org/jira/browse/BEAM-6380
> Project: Beam
>  Issue Type: Test
>  Components: test-failures
>Reporter: Boyuan Zhang
>Assignee: Udi Meiri
>Priority: Major
>
> wordcount test in :pythonPostCommit failed owing to RuntimeError: 
> NotImplementedError [while running 'write/Write/WriteImpl/WriteBundles']
>  
>  https://builds.apache.org/job/beam_PostCommit_Python_Verify/7001/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=198396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198396
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:32
Start Date: 13/Feb/19 23:32
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7655: [BEAM-6553] A 
Python SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#discussion_r256635014
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
 ##
 @@ -0,0 +1,589 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Functionality to perform file loads into BigQuery for Batch and Streaming
+pipelines.
+
+This source is able to work around BigQuery load quotas and limitations. When
+destinations are dynamic, or when data for a single job is too large, the data
+will be split into multiple jobs.
+
+NOTHING IN THIS FILE HAS BACKWARDS COMPATIBILITY GUARANTEES.
+"""
+
+from __future__ import absolute_import
+
+import datetime
+import hashlib
+import logging
+import random
+import time
+import uuid
+
+from future.utils import iteritems
+
+import apache_beam as beam
+from apache_beam import pvalue
+from apache_beam.io import filesystems as fs
+from apache_beam.io.gcp import bigquery_tools
+from apache_beam.io.gcp.internal.clients import bigquery as bigquery_api
+from apache_beam.options import value_provider as vp
+from apache_beam.transforms.combiners import Count
+
+ONE_TERABYTE = (1 << 40)
+
+# The maximum file size for imports is 5TB. We keep our files under that.
+_DEFAULT_MAX_FILE_SIZE = 4 * ONE_TERABYTE
+
+_DEFAULT_MAX_WRITERS_PER_BUNDLE = 20
+
+# The maximum size for a single load job is 15 terabytes.
+_MAXIMUM_LOAD_SIZE = 15 * ONE_TERABYTE
+
+# BigQuery only supports up to 10 thousand URIs for a single load job.
+_MAXIMUM_SOURCE_URIS = 10*1000
+
+
+def _generate_load_job_name():
+  datetime_component = datetime.datetime.now().strftime("%Y_%m_%d_%H%M%S")
+  # TODO(pabloem): include job id / pipeline component?
+  return 'beam_load_%s_%s' % (datetime_component, random.randint(0, 100))
+
+
+def _generate_file_prefix(pipeline_gcs_location):
+  # If a gcs location is provided to the pipeline, then we shall use that.
+  # Otherwise, we shall use the temp_location from pipeline options.
+  gcs_base = str(pipeline_gcs_location or
+                 vp.RuntimeValueProvider.get_value('temp_location', str, ''))
+  prefix_uuid = _bq_uuid()
+  return fs.FileSystems.join(gcs_base, 'bq_load', prefix_uuid)
+
+
+def _make_new_file_writer(file_prefix, destination):
+  if isinstance(destination, bigquery_api.TableReference):
+    destination = '%s:%s.%s' % (
+        destination.projectId, destination.datasetId, destination.tableId)
+
+  directory = fs.FileSystems.join(file_prefix, destination)
+
+  if not fs.FileSystems.exists(directory):
+    fs.FileSystems.mkdirs(directory)
+
+  file_name = str(uuid.uuid4())
+  file_path = fs.FileSystems.join(file_prefix, destination, file_name)
+
+  return file_path, fs.FileSystems.create(file_path, 'application/text')
+
+
+def _bq_uuid(seed=None):
+  if not seed:
+    return str(uuid.uuid4()).replace("-", "")
+  else:
+    return str(hashlib.md5(seed).hexdigest())
+
+
+class _AppendDestinationsFn(beam.DoFn):
+  """Adds the destination to an element, making it a KV pair.
+
+  Outputs a PCollection of KV-pairs where the key is a TableReference for the
+  destination, and the value is the record itself.
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+  def __init__(self, destination):
+    if callable(destination):
+      self.destination = destination
+    else:
+      self.destination = lambda x: destination
+
+  def process(self, element):
+    yield (self.destination(element), element)
+
+
+class _ShardDestinations(beam.DoFn):
+  """Adds a shard number to the key of the KV element.
+
+  Experimental; no backwards compatibility guarantees."""
+  DEFAULT_SHARDING_FACTOR = 10
+
+  def __init__(self, sharding_factor=DEFAULT_SHARDING_FACTOR):
+    self.sharding_factor = sharding_factor
+
+  

[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=198400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198400
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:32
Start Date: 13/Feb/19 23:32
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7655: [BEAM-6553] A 
Python SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#discussion_r256635339
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
 ##
 @@ -0,0 +1,533 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Functionality to perform file loads into BigQuery for Batch and Streaming
+pipelines.
+
+NOTHING IN THIS FILE HAS BACKWARDS COMPATIBILITY GUARANTEES.
+"""
+
+from __future__ import absolute_import
+
+import datetime
+import logging
+import random
+import time
+import uuid
+
+from future.utils import iteritems
+
+import apache_beam as beam
+from apache_beam import pvalue
+from apache_beam.io import filesystems as fs
+from apache_beam.io.gcp import bigquery_tools
+from apache_beam.io.gcp.internal.clients import bigquery as bigquery_api
+from apache_beam.options import value_provider as vp
+from apache_beam.transforms.combiners import Count
+
+ONE_TERABYTE = (1 << 40)
+
+# The maximum file size for imports is 5TB. We keep our files under that.
+_DEFAULT_MAX_FILE_SIZE = 4 * ONE_TERABYTE
+
+_DEFAULT_MAX_WRITERS_PER_BUNDLE = 20
+
+# The maximum size for a single load job is 15 terabytes.
+_MAXIMUM_LOAD_SIZE = 15 * ONE_TERABYTE
+
+
+def _generate_load_job_name():
+  datetime_component = datetime.datetime.now().strftime("%Y_%m_%d_%H%M%S")
+  # TODO(pabloem): include job id / pipeline component?
+  return 'beam_load_%s_%s' % (datetime_component, random.randint(0, 100))
+
+
+def _generate_file_prefix(pipeline_gcs_location):
+  # If a gcs location is provided to the pipeline, then we shall use that.
+  # Otherwise, we shall use the temp_location from pipeline options.
+  gcs_base = str(pipeline_gcs_location or
+                 vp.RuntimeValueProvider.get_value('temp_location', str, ''))
+  return fs.FileSystems.join(gcs_base, 'bq_load')
+
+
+def _make_new_file_writer(file_prefix, destination):
+  if isinstance(destination, bigquery_api.TableReference):
+    destination = '%s:%s.%s' % (
+        destination.projectId, destination.datasetId, destination.tableId)
+
+  directory = fs.FileSystems.join(file_prefix, destination)
+
+  if not fs.FileSystems.exists(directory):
+    fs.FileSystems.mkdirs(directory)
+
+  file_name = str(uuid.uuid4())
+  file_path = fs.FileSystems.join(file_prefix, destination, file_name)
+
+  return file_path, fs.FileSystems.create(file_path, 'application/text')
+
+
+def _bq_uuid():
+  return str(uuid.uuid4()).replace("-", "")
+
+
+class _AppendDestinationsFn(beam.DoFn):
+  """Adds the destination to an element, making it a KV pair.
+
+  Outputs a PCollection of KV-pairs where the key is a TableReference for the
+  destination, and the value is the record itself.
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+  def __init__(self, destination):
+    if callable(destination):
+      self.destination = destination
+    else:
+      self.destination = lambda x: destination
+
+  def process(self, element):
+    yield (self.destination(element), element)
+
+
+class _ShardDestinations(beam.DoFn):
+  """Adds a shard number to the key of the KV element.
+
+  Experimental; no backwards compatibility guarantees."""
+  DEFAULT_SHARDING_FACTOR = 10
+
+  def __init__(self, sharding_factor=DEFAULT_SHARDING_FACTOR):
+    self.sharding_factor = sharding_factor
+
+  def start_bundle(self):
+    self._shard_count = random.randrange(self.sharding_factor)
+
+  def process(self, element):
+    destination = element[0]
+    row = element[1]
+
+    sharded_destination = (destination,
+                           self._shard_count % self.sharding_factor)
+    self._shard_count += 1
+    yield (sharded_destination, row)
+
+
+class WriteRecordsToFile(beam.DoFn):
+  """Write input records to files before trigger

[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=198397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198397
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:32
Start Date: 13/Feb/19 23:32
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7655: [BEAM-6553] A 
Python SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#discussion_r256634728
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
 ##
 @@ -0,0 +1,589 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Functionality to perform file loads into BigQuery for Batch and Streaming
+pipelines.
+
+This source is able to work around BigQuery load quotas and limitations. When
+destinations are dynamic, or when data for a single job is too large, the data
+will be split into multiple jobs.
+
+NOTHING IN THIS FILE HAS BACKWARDS COMPATIBILITY GUARANTEES.
+"""
+
+from __future__ import absolute_import
+
+import datetime
+import hashlib
+import logging
+import random
+import time
+import uuid
+
+from future.utils import iteritems
+
+import apache_beam as beam
+from apache_beam import pvalue
+from apache_beam.io import filesystems as fs
+from apache_beam.io.gcp import bigquery_tools
+from apache_beam.io.gcp.internal.clients import bigquery as bigquery_api
+from apache_beam.options import value_provider as vp
+from apache_beam.transforms.combiners import Count
+
+ONE_TERABYTE = (1 << 40)
+
+# The maximum file size for imports is 5TB. We keep our files under that.
+_DEFAULT_MAX_FILE_SIZE = 4 * ONE_TERABYTE
+
+_DEFAULT_MAX_WRITERS_PER_BUNDLE = 20
+
+# The maximum size for a single load job is 15 terabytes.
+_MAXIMUM_LOAD_SIZE = 15 * ONE_TERABYTE
+
+# BigQuery only supports up to 10 thousand URIs for a single load job.
+_MAXIMUM_SOURCE_URIS = 10*1000
+
+
+def _generate_load_job_name():
+  datetime_component = datetime.datetime.now().strftime("%Y_%m_%d_%H%M%S")
+  # TODO(pabloem): include job id / pipeline component?
+  return 'beam_load_%s_%s' % (datetime_component, random.randint(0, 100))
+
+
+def _generate_file_prefix(pipeline_gcs_location):
+  # If a gcs location is provided to the pipeline, then we shall use that.
 
 Review comment:
   Hm? GCS is what it says : )
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198397)
Time Spent: 9h  (was: 8h 50m)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=198398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198398
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:32
Start Date: 13/Feb/19 23:32
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7655: [BEAM-6553] A 
Python SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#discussion_r256634913
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
 ##
 @@ -0,0 +1,589 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Functionality to perform file loads into BigQuery for Batch and Streaming
+pipelines.
+
+This source is able to work around BigQuery load quotas and limitations. When
+destinations are dynamic, or when data for a single job is too large, the data
+will be split into multiple jobs.
+
+NOTHING IN THIS FILE HAS BACKWARDS COMPATIBILITY GUARANTEES.
+"""
+
+from __future__ import absolute_import
+
+import datetime
+import hashlib
+import logging
+import random
+import time
+import uuid
+
+from future.utils import iteritems
+
+import apache_beam as beam
+from apache_beam import pvalue
+from apache_beam.io import filesystems as fs
+from apache_beam.io.gcp import bigquery_tools
+from apache_beam.io.gcp.internal.clients import bigquery as bigquery_api
+from apache_beam.options import value_provider as vp
+from apache_beam.transforms.combiners import Count
+
+ONE_TERABYTE = (1 << 40)
+
+# The maximum file size for imports is 5TB. We keep our files under that.
+_DEFAULT_MAX_FILE_SIZE = 4 * ONE_TERABYTE
+
+_DEFAULT_MAX_WRITERS_PER_BUNDLE = 20
+
+# The maximum size for a single load job is 15 terabytes.
+_MAXIMUM_LOAD_SIZE = 15 * ONE_TERABYTE
+
+# BigQuery only supports up to 10 thousand URIs for a single load job.
+_MAXIMUM_SOURCE_URIS = 10*1000
+
+
+def _generate_load_job_name():
+  datetime_component = datetime.datetime.now().strftime("%Y_%m_%d_%H%M%S")
+  # TODO(pabloem): include job id / pipeline component?
+  return 'beam_load_%s_%s' % (datetime_component, random.randint(0, 100))
+
+
+def _generate_file_prefix(pipeline_gcs_location):
+  # If a gcs location is provided to the pipeline, then we shall use that.
+  # Otherwise, we shall use the temp_location from pipeline options.
+  gcs_base = str(pipeline_gcs_location or
+                 vp.RuntimeValueProvider.get_value('temp_location', str, ''))
+  prefix_uuid = _bq_uuid()
+  return fs.FileSystems.join(gcs_base, 'bq_load', prefix_uuid)
+
+
+def _make_new_file_writer(file_prefix, destination):
+  if isinstance(destination, bigquery_api.TableReference):
+    destination = '%s:%s.%s' % (
+        destination.projectId, destination.datasetId, destination.tableId)
+
+  directory = fs.FileSystems.join(file_prefix, destination)
+
+  if not fs.FileSystems.exists(directory):
+    fs.FileSystems.mkdirs(directory)
+
+  file_name = str(uuid.uuid4())
+  file_path = fs.FileSystems.join(file_prefix, destination, file_name)
+
+  return file_path, fs.FileSystems.create(file_path, 'application/text')
+
+
+def _bq_uuid(seed=None):
+  if not seed:
+    return str(uuid.uuid4()).replace("-", "")
+  else:
+    return str(hashlib.md5(seed).hexdigest())
+
+
+class _AppendDestinationsFn(beam.DoFn):
+  """Adds the destination to an element, making it a KV pair.
+
+  Outputs a PCollection of KV-pairs where the key is a TableReference for the
+  destination, and the value is the record itself.
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+  def __init__(self, destination):
+    if callable(destination):
+      self.destination = destination
+    else:
+      self.destination = lambda x: destination
+
+  def process(self, element):
+    yield (self.destination(element), element)
+
+
+class _ShardDestinations(beam.DoFn):
+  """Adds a shard number to the key of the KV element.
+
+  Experimental; no backwards compatibility guarantees."""
+  DEFAULT_SHARDING_FACTOR = 10
+
+  def __init__(self, sharding_factor=DEFAULT_SHARDING_FACTOR):
+    self.sharding_factor = sharding_factor
+
+  

[jira] [Work logged] (BEAM-6553) A BigQuery sink that is SDK-implemented and supports file loads in Python

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6553?focusedWorklogId=198399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198399
 ]

ASF GitHub Bot logged work on BEAM-6553:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:32
Start Date: 13/Feb/19 23:32
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #7655: [BEAM-6553] A 
Python SDK sink that supports File Loads into BQ
URL: https://github.com/apache/beam/pull/7655#discussion_r256634572
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
 ##
 @@ -0,0 +1,533 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Functionality to perform file loads into BigQuery for Batch and Streaming
+pipelines.
+
+NOTHING IN THIS FILE HAS BACKWARDS COMPATIBILITY GUARANTEES.
+"""
+
+from __future__ import absolute_import
+
+import datetime
+import logging
+import random
+import time
+import uuid
+
+from future.utils import iteritems
+
+import apache_beam as beam
+from apache_beam import pvalue
+from apache_beam.io import filesystems as fs
+from apache_beam.io.gcp import bigquery_tools
+from apache_beam.io.gcp.internal.clients import bigquery as bigquery_api
+from apache_beam.options import value_provider as vp
+from apache_beam.transforms.combiners import Count
+
+ONE_TERABYTE = (1 << 40)
+
+# The maximum file size for imports is 5TB. We keep our files under that.
+_DEFAULT_MAX_FILE_SIZE = 4 * ONE_TERABYTE
 
 Review comment:
   I've handled this now.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198399)
Time Spent: 9h 20m  (was: 9h 10m)

> A BigQuery sink that is SDK-implemented and supports file loads in Python
> -
>
> Key: BEAM-6553
> URL: https://issues.apache.org/jira/browse/BEAM-6553
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Labels: triaged
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?focusedWorklogId=198390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198390
 ]

ASF GitHub Bot logged work on BEAM-6664:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:05
Start Date: 13/Feb/19 23:05
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7835: [BEAM-6664] Temporarily 
convert dataflowKmsKey flag to experimental
URL: https://github.com/apache/beam/pull/7835#issuecomment-463412346
 
 
   run python postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198390)
Time Spent: 20m  (was: 10m)

> Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow
> -
>
> Key: BEAM-6664
> URL: https://issues.apache.org/jira/browse/BEAM-6664
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?focusedWorklogId=198391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198391
 ]

ASF GitHub Bot logged work on BEAM-6664:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:05
Start Date: 13/Feb/19 23:05
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7835: [BEAM-6664] Temporarily 
convert dataflowKmsKey flag to experimental
URL: https://github.com/apache/beam/pull/7835#issuecomment-463412375
 
 
   run java postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198391)
Time Spent: 0.5h  (was: 20m)

> Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow
> -
>
> Key: BEAM-6664
> URL: https://issues.apache.org/jira/browse/BEAM-6664
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?focusedWorklogId=198392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198392
 ]

ASF GitHub Bot logged work on BEAM-6664:


Author: ASF GitHub Bot
Created on: 13/Feb/19 23:06
Start Date: 13/Feb/19 23:06
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7835: [BEAM-6664] Temporarily 
convert dataflowKmsKey flag to experimental
URL: https://github.com/apache/beam/pull/7835#issuecomment-463412595
 
 
   R: @aaltay 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198392)
Time Spent: 40m  (was: 0.5h)

> Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow
> -
>
> Key: BEAM-6664
> URL: https://issues.apache.org/jira/browse/BEAM-6664
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5816) Flink runner starts new bundles while disposing operator

2019-02-13 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved BEAM-5816.

   Resolution: Fixed
Fix Version/s: 2.11.0

> Flink runner starts new bundles while disposing operator 
> -
>
> Key: BEAM-5816
> URL: https://issues.apache.org/jira/browse/BEAM-5816
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink, triaged
> Fix For: 2.11.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> We sometimes see exceptions when shutting down portable flink pipelines 
> (either due to cancellation or failure):
> {code}
> 2018-10-19 15:54:52,905 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error during 
> disposal of stream operator.
> java.lang.RuntimeException: Failed to finish remote bundle
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:241)
>   at 
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.finishBundle(DoFnRunnerWithMetricsUpdate.java:87)
>   at 
> org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.finishBundle(SimplePushbackSideInputDoFnRunner.java:118)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.invokeFinishBundle(DoFnOperator.java:674)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.dispose(DoFnOperator.java:391)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.dispose(ExecutableStageDoFnOperator.java:166)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:473)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:374)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Already closed.
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:251)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:238)
>   ... 9 more
>   Suppressed: java.lang.IllegalStateException: Processing bundle failed, 
> TODO: [BEAM-3962] abort bundle.
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:266)
>   ... 10 more
> {code}
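The `Already closed` failure above is a double-finish hazard: `dispose()` invokes `finishBundle()` on a bundle that normal teardown already closed. The exactly-once guard such a fix calls for can be sketched as follows (illustrative Python with hypothetical names; the actual fix lives in the Java Flink runner):

```python
class Bundle:
    """Sketch of an exactly-once finish guard for a bundle-style resource."""

    def __init__(self):
        self.finished = False
        self.flush_count = 0   # stand-in for closing the remote bundle

    def finish(self):
        # dispose() and normal teardown may both call finish(); the guard
        # makes the second call a no-op instead of an IllegalStateException.
        if self.finished:
            return
        self.finished = True
        self.flush_count += 1
```

The same idempotence idea applies wherever two shutdown paths can race to close one underlying resource.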



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5816) Flink runner starts new bundles while disposing operator

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5816?focusedWorklogId=198381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198381
 ]

ASF GitHub Bot logged work on BEAM-5816:


Author: ASF GitHub Bot
Created on: 13/Feb/19 22:48
Start Date: 13/Feb/19 22:48
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #7719: [BEAM-5816] 
Finish Flink bundles exactly once
URL: https://github.com/apache/beam/pull/7719
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198381)
Time Spent: 2h 50m  (was: 2h 40m)

> Flink runner starts new bundles while disposing operator 
> -
>
> Key: BEAM-5816
> URL: https://issues.apache.org/jira/browse/BEAM-5816
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink, triaged
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> We sometimes see exceptions when shutting down portable flink pipelines 
> (either due to cancellation or failure):
> {code}
> 2018-10-19 15:54:52,905 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error during 
> disposal of stream operator.
> java.lang.RuntimeException: Failed to finish remote bundle
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:241)
>   at 
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.finishBundle(DoFnRunnerWithMetricsUpdate.java:87)
>   at 
> org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.finishBundle(SimplePushbackSideInputDoFnRunner.java:118)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.invokeFinishBundle(DoFnOperator.java:674)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.dispose(DoFnOperator.java:391)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.dispose(ExecutableStageDoFnOperator.java:166)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:473)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:374)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Already closed.
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:251)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:238)
>   ... 9 more
>   Suppressed: java.lang.IllegalStateException: Processing bundle failed, 
> TODO: [BEAM-3962] abort bundle.
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:266)
>   ... 10 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3306) Consider: Go coder registry

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3306?focusedWorklogId=198380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198380
 ]

ASF GitHub Bot logged work on BEAM-3306:


Author: ASF GitHub Bot
Created on: 13/Feb/19 22:46
Start Date: 13/Feb/19 22:46
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7834: [BEAM-3306] 
Encapsulate coder details.
URL: https://github.com/apache/beam/pull/7834
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198380)
Time Spent: 2h 50m  (was: 2h 40m)

> Consider: Go coder registry
> ---
>
> Key: BEAM-3306
> URL: https://issues.apache.org/jira/browse/BEAM-3306
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Robert Burke
>Priority: Minor
>  Labels: triaged
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Add coder registry to allow easier overwrite of default coders. We may also 
> allow otherwise un-encodable types, but that would require that function 
> analysis depends on it.
> If we're hardcoding support for proto/avro, then there may be little need for 
> such a feature. Conversely, this may be how we implement such support.
>  
> Proposal Doc: 
> [https://docs.google.com/document/d/1kQwx4Ah6PzG8z2ZMuNsNEXkGsLXm6gADOZaIO7reUOg/edit#|https://docs.google.com/document/d/1kQwx4Ah6PzG8z2ZMuNsNEXkGsLXm6gADOZaIO7reUOg/edit]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3306) Consider: Go coder registry

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3306?focusedWorklogId=198368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198368
 ]

ASF GitHub Bot logged work on BEAM-3306:


Author: ASF GitHub Bot
Created on: 13/Feb/19 22:26
Start Date: 13/Feb/19 22:26
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #7834: [BEAM-3306] 
Encapsulate coder details.
URL: https://github.com/apache/beam/pull/7834#issuecomment-463400846
 
 
   R: @aaltay Review Please!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198368)
Time Spent: 2h 40m  (was: 2.5h)

> Consider: Go coder registry
> ---
>
> Key: BEAM-3306
> URL: https://issues.apache.org/jira/browse/BEAM-3306
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Robert Burke
>Priority: Minor
>  Labels: triaged
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Add coder registry to allow easier overwrite of default coders. We may also 
> allow otherwise un-encodable types, but that would require that function 
> analysis depends on it.
> If we're hardcoding support for proto/avro, then there may be little need for 
> such a feature. Conversely, this may be how we implement such support.
>  
> Proposal Doc: 
> [https://docs.google.com/document/d/1kQwx4Ah6PzG8z2ZMuNsNEXkGsLXm6gADOZaIO7reUOg/edit#|https://docs.google.com/document/d/1kQwx4Ah6PzG8z2ZMuNsNEXkGsLXm6gADOZaIO7reUOg/edit]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6664?focusedWorklogId=198365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198365
 ]

ASF GitHub Bot logged work on BEAM-6664:


Author: ASF GitHub Bot
Created on: 13/Feb/19 22:19
Start Date: 13/Feb/19 22:19
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #7835: [BEAM-6664] 
Temporarily convert dataflowKmsKey flag to experimental
URL: https://github.com/apache/beam/pull/7835
 
 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198365)
Time Spent: 10m
Remaining Estimate: 0h

> Temporarily convert dataflowKm

[jira] [Work logged] (BEAM-3306) Consider: Go coder registry

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3306?focusedWorklogId=198350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198350
 ]

ASF GitHub Bot logged work on BEAM-3306:


Author: ASF GitHub Bot
Created on: 13/Feb/19 21:59
Start Date: 13/Feb/19 21:59
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #7834: [BEAM-3306] 
Encapsulate coder details.
URL: https://github.com/apache/beam/pull/7834
 
 
   This change encapsulates internal implementation details that most users 
should not need.
   
   In particular: This CL removes needing to wrap values in the exec.FullValue 
type before encoding them. It also hides the internal coder.Coder abstraction.
   
   Previously users needed to do the following to use a beam coder to encode an 
element of a given type:
   
var t reflect.Type // from somewhere
coder := beam.NewCoder(typex.New(t))
enc := exec.MakeElementEncoder(beam.UnwrapCoder(coder))
var buf bytes.Buffer
err := enc.Encode(exec.FullValue{Elm: value}, &buf)
...
   
   
   This required users to access beam implementation details directly: 
coder.Coder, the typex package, the exec package & the FullValue type.
   
   Following this PR:

var t reflect.Type // from somewhere
enc := beam.NewElementEncoder(t)
var buf bytes.Buffer
err := enc.Encode(value, &buf)
...

   which doesn't leak such details. Similarly for the Decoders.
   
   This PR makes the breaking change of removing the beam.UnwrapCoder method. 
There is an open question of whether users will ever need to encode KVs or 
similar appropriately for Beam themselves, rather than via the framework; if 
so, this can be revisited at that point.
   
   TODO in a later PR: allow users to register coders that implement 
beam.ElementEncoder and beam.ElementDecoder.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.a

[jira] [Work logged] (BEAM-6392) Add support for new BigQuery streaming read API to BigQueryIO

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6392?focusedWorklogId=198344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198344
 ]

ASF GitHub Bot logged work on BEAM-6392:


Author: ASF GitHub Bot
Created on: 13/Feb/19 21:55
Start Date: 13/Feb/19 21:55
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #7441: [BEAM-6392] Add 
support for the BigQuery read API to BigQueryIO.
URL: https://github.com/apache/beam/pull/7441#issuecomment-463390681
 
 
   Run Java PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198344)
Time Spent: 4h  (was: 3h 50m)

> Add support for new BigQuery streaming read API to BigQueryIO
> -
>
> Key: BEAM-6392
> URL: https://issues.apache.org/jira/browse/BEAM-6392
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Kenneth Jung
>Assignee: Kenneth Jung
>Priority: Major
>  Labels: triaged
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> BigQuery has developed a new streaming egress API which will soon reach 
> public availability. Add support for the new API in BigQueryIO.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread Mark Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu reassigned BEAM-6623:
--

Assignee: Mark Liu  (was: Robbe)

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Mark Liu
>Priority: Critical
>  Labels: triaged
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767535#comment-16767535
 ] 

Valentyn Tymofieiev commented on BEAM-6665:
---

Verified that the contents of a tarball generated on Py2 and Py3 are the same, 
save for a few insignificant differences in setup.cfg:


{noformat}
@@ -7,21 +7,19 @@
 branch = True
 source = apache_beam
 omit = 
-   # Omit auto-generated files by the protocol buffer compiler.
apache_beam/portability/api/*_pb2.py
apache_beam/portability/api/*_pb2_grpc.py
 
 [coverage:report]
 exclude_lines = 
-   # Have to re-enable the standard pragma
pragma: no cover
abc.abstractmethod
-   # Don't complain about missing debug-only code:
+   
def __repr__
if self\.debug
-   # Don't complain if tests don't hit defensive assertion code:
+   
raise NotImplementedError
-   # Don't complain if non-runnable code isn't run:
+   
if __name__ == .__main__.:
 

{noformat}


 

 

> SDK source tarball is different when created on Python 2 and Python 3
> -
>
> Key: BEAM-6665
> URL: https://issues.apache.org/jira/browse/BEAM-6665
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>
> When we create a source tarball via `python setup.py sdist`  on Python 2, the 
> generated protocol buffer code includes relative imports.
> Because of that, a tarball created on Python 2 interpreter cannot be used on 
> Python 3.
> AFAIK, we release only one source tarball to PyPi, so if possible we should 
> make source distribution of Beam compatible both with Python 2 and Python 3. 
> When we create a source tarball on Python 3, we call futurize on generated 
> _.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
> fix in this issue.
> [1] 
> [https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]
> cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev resolved BEAM-6665.
---
Resolution: Fixed
  Assignee: Valentyn Tymofieiev

> SDK source tarball is different when created on Python 2 and Python 3
> -
>
> Key: BEAM-6665
> URL: https://issues.apache.org/jira/browse/BEAM-6665
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>
> When we create a source tarball via `python setup.py sdist`  on Python 2, the 
> generated protocol buffer code includes relative imports.
> Because of that, a tarball created on Python 2 interpreter cannot be used on 
> Python 3.
> AFAIK, we release only one source tarball to PyPi, so if possible we should 
> make source distribution of Beam compatible both with Python 2 and Python 3. 
> When we create a source tarball on Python 3, we call futurize on generated 
> _.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
> fix in this issue.
> [1] 
> [https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]
> cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread Mark Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-6623.

   Resolution: Done
Fix Version/s: Not applicable

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Mark Liu
>Priority: Critical
>  Labels: triaged
> Fix For: Not applicable
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread Mark Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767524#comment-16767524
 ] 

Mark Liu commented on BEAM-6623:


#7825 is merged. ValidatesRunner suite is now running in 
beam_PostCommit_Python3_Verify. 

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Mark Liu
>Priority: Critical
>  Labels: triaged
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767516#comment-16767516
 ] 

Valentyn Tymofieiev commented on BEAM-6665:
---

Looking at the gen_protos code, futurization of the generated protobufs is 
supposed to happen on both Python 2 and Python 3, and in the successful 
scenario it does.

I encountered a corner case where protocol buffers were generated, but 
futurization failed due to a pip error. The next time I re-ran sdist, 
gen_protos did not re-generate the protocol buffers because they already 
existed, so the futurization codepath was not exercised: 
https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L76
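The corner case above amounts to a two-step build (generate, then futurize) guarded by a single "already generated" check. A toy model (hypothetical names, not the real gen_protos code) shows how a crash between the two steps leaves a Python-2-only module that later runs never repair:

```python
from pathlib import Path
import tempfile

def build_protos(out_dir: Path, futurize_fails: bool = False) -> str:
    """Toy model of the sdist flow: generate *_pb2.py, then futurize it.

    The existence check mirrors the early return in gen_protos: when
    generated files are already present, the whole function short-circuits,
    so the futurize step is also skipped -- even if a previous run crashed
    between generation and futurization.
    """
    pb2 = out_dir / "example_pb2.py"
    if pb2.exists():
        return "skipped"  # BUG: futurization is skipped as well
    # Step 1: "generate" a module with a Python-2-only implicit
    # relative import.
    pb2.write_text("import example_pb2_grpc\n")
    if futurize_fails:
        # Simulates the pip error: generated, but never futurized.
        return "generated-only"
    # Step 2: "futurize" rewrites it into an explicit relative import
    # that works on both Python 2 and Python 3.
    pb2.write_text("from . import example_pb2_grpc\n")
    return "generated+futurized"
```

Re-running after a failed futurize step returns "skipped" and leaves the Python-2-only import in place, which is the behavior described above.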

> SDK source tarball is different when created on Python 2 and Python 3
> -
>
> Key: BEAM-6665
> URL: https://issues.apache.org/jira/browse/BEAM-6665
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>
> When we create a source tarball via `python setup.py sdist`  on Python 2, the 
> generated protocol buffer code includes relative imports.
> Because of that, a tarball created on Python 2 interpreter cannot be used on 
> Python 3.
> AFAIK, we release only one source tarball to PyPI, so if possible we should 
> make the source distribution of Beam compatible with both Python 2 and Python 3. 
> When we create a source tarball on Python 3, we call futurize on generated 
> _.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
> fix in this issue.
> [1] 
> https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122
> cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]





[jira] [Commented] (BEAM-6649) beam11 worker fails all jobs

2019-02-13 Thread yifan zou (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767509#comment-16767509
 ] 

yifan zou commented on BEAM-6649:
-

The VM was reset, waiting for reconnection. 
https://issues.apache.org/jira/browse/INFRA-17837

> beam11 worker fails all jobs
> 
>
> Key: BEAM-6649
> URL: https://issues.apache.org/jira/browse/BEAM-6649
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: yifan zou
>Priority: Major
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/288/console]
> Initial investigation:
> GitHub plugin crashed while fetching data from git.
> According to [stackoverflow error 
> 128|https://stackoverflow.com/questions/16721629/jenkins-returned-status-code-128-with-github],
> this means a wrong SSH key.
> Relevant logs:
> *04:01:08* FATAL: Could not checkout 
> d1200202c8e98d39dc8422b1255954b31a4341cb*04:01:08* 
> hudson.plugins.git.GitException: Command "git checkout -f 
> d1200202c8e98d39dc8422b1255954b31a4341cb" returned status code 128:*04:01:08* 
> stdout: *04:01:08* stderr: fatal: unable to create threaded lstat
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._





[jira] [Updated] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-6665:
--
Description: 
When we create a source tarball via `python setup.py sdist`  on Python 2, the 
generated protocol buffer code includes relative imports.

Because of that, a tarball created on Python 2 interpreter cannot be used on 
Python 3.

AFAIK, we release only one source tarball to PyPi, so if possible we should 
make source distribution of Beam compatible both with Python 2 and Python 3. 

When we create a source tarball on Python 3, we call futurize on generated 
_.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
fix in this issue.

[1] 
[[https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]

cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]

  was:
When we create a source tarball via `python setup.py sdist`  on Python 2, the 
generated protocol buffer code includes relative imports.

Because of that, a tarball created on Python 2 interpreter cannot be used on 
Python 3.

AFAIK, we release only one source tarball to PyPi, so if possible we should 
make source distribution of Beam compatible both with Python 2 and Python 3. 

When we create a source tarball on Python 3, we call futurize on generated 
_.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
fix in this issue.

[1] 
[[https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]|https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]
 

cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]


> SDK source tarball is different when created on Python 2 and Python 3
> -
>
> Key: BEAM-6665
> URL: https://issues.apache.org/jira/browse/BEAM-6665
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>
> When we create a source tarball via `python setup.py sdist`  on Python 2, the 
> generated protocol buffer code includes relative imports.
> Because of that, a tarball created on Python 2 interpreter cannot be used on 
> Python 3.
> AFAIK, we release only one source tarball to PyPI, so if possible we should 
> make the source distribution of Beam compatible with both Python 2 and Python 3. 
> When we create a source tarball on Python 3, we call futurize on generated 
> _.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
> fix in this issue.
> [1] 
> https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122
> cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]





[jira] [Updated] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-6665:
--
Description: 
When we create a source tarball via `python setup.py sdist`  on Python 2, the 
generated protocol buffer code includes relative imports.

Because of that, a tarball created on Python 2 interpreter cannot be used on 
Python 3.

AFAIK, we release only one source tarball to PyPi, so if possible we should 
make source distribution of Beam compatible both with Python 2 and Python 3. 

When we create a source tarball on Python 3, we call futurize on generated 
_.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
fix in this issue.

[1] 
[[https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]|https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]
 

cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]

  was:
When we create a source tarball via `python setup.py sdist`  on Python 2, the 
generated protocol buffer code includes relative imports.

Because of that, a tarball created on Python 2 interpreter cannot be used on 
Python 3.

AFAIK, we release only one source tarball to PyPi, so if possible we should 
make source distribution of Beam compatible both with Python 2 and Python 3. 

When we create a source tarball on Python 3, we call futurize on generated 
_.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
fix in this issue.

[[1] 
https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122|https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]
 

cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]


> SDK source tarball is different when created on Python 2 and Python 3
> -
>
> Key: BEAM-6665
> URL: https://issues.apache.org/jira/browse/BEAM-6665
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>
> When we create a source tarball via `python setup.py sdist`  on Python 2, the 
> generated protocol buffer code includes relative imports.
> Because of that, a tarball created on Python 2 interpreter cannot be used on 
> Python 3.
> AFAIK, we release only one source tarball to PyPI, so if possible we should 
> make the source distribution of Beam compatible with both Python 2 and Python 3. 
> When we create a source tarball on Python 3, we call futurize on generated 
> _.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
> fix in this issue.
> [1] 
> https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122
> cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]





[jira] [Assigned] (BEAM-6572) Dataflow Python runner should use a Python-3 compatible container when starting a Python 3 pipeline.

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-6572:
-

Assignee: Valentyn Tymofieiev

> Dataflow Python runner should use a Python-3 compatible container when 
> starting a Python 3 pipeline.
> 
>
> Key: BEAM-6572
> URL: https://issues.apache.org/jira/browse/BEAM-6572
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Container choice will be affected by:
> - SDK version (dev/released)
> - Execution mode (FnApi/Legacy)
> - Python interpreter version.
> cc: [~ccy] [~markflyhigh] [~altay]
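As an illustration only, the three factors listed above could be combined into a container choice like this (the function, naming scheme, and registry below are invented; only the three decision inputs come from the issue):

```python
def choose_container(sdk_is_dev: bool, use_fnapi: bool, py_version: str) -> str:
    """Hypothetical mapping from the three decision inputs to an image name."""
    lang = "python3" if py_version.startswith("3") else "python"
    mode = "-fnapi" if use_fnapi else ""
    tag = "latest" if sdk_is_dev else "2.11.0"
    return "gcr.io/example/beam_{}{}:{}".format(lang, mode, tag)
```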





[jira] [Resolved] (BEAM-6572) Dataflow Python runner should use a Python-3 compatible container when starting a Python 3 pipeline.

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev resolved BEAM-6572.
---
   Resolution: Fixed
Fix Version/s: 2.11.0

> Dataflow Python runner should use a Python-3 compatible container when 
> starting a Python 3 pipeline.
> 
>
> Key: BEAM-6572
> URL: https://issues.apache.org/jira/browse/BEAM-6572
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Container choice will be affected by:
> - SDK version (dev/released)
> - Execution mode (FnApi/Legacy)
> - Python interpreter version.
> cc: [~ccy] [~markflyhigh] [~altay]





[jira] [Updated] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-6665:
--
Fix Version/s: 2.11.0

> SDK source tarball is different when created on Python 2 and Python 3
> -
>
> Key: BEAM-6665
> URL: https://issues.apache.org/jira/browse/BEAM-6665
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.11.0
>
>
> When we create a source tarball via `python setup.py sdist`  on Python 2, the 
> generated protocol buffer code includes relative imports.
> Because of that, a tarball created on Python 2 interpreter cannot be used on 
> Python 3.
> AFAIK, we release only one source tarball to PyPI, so if possible we should 
> make the source distribution of Beam compatible with both Python 2 and Python 3. 
> When we create a source tarball on Python 3, we call futurize on generated 
> _.pb2.py files[1]. Looks like this does not happen on Python 2, which we can 
> fix in this issue.
> [1] 
> https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122
> cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable Python connector

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=198268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198268
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:39
Start Date: 13/Feb/19 18:39
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #7737: [BEAM-3342] Create a 
Cloud Bigtable Python connector Read
URL: https://github.com/apache/beam/pull/7737#issuecomment-463315783
 
 
   @chamikaramj can you please take a look.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198268)
Time Spent: 15h 40m  (was: 15.5h)

> Create a Cloud Bigtable Python connector
> 
>
> Key: BEAM-3342
> URL: https://issues.apache.org/jira/browse/BEAM-3342
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Solomon Duskis
>Assignee: Solomon Duskis
>Priority: Major
>  Labels: triaged
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> I would like to create a Cloud Bigtable python connector.





[jira] [Work logged] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6623?focusedWorklogId=198272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198272
 ]

ASF GitHub Bot logged work on BEAM-6623:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:43
Start Date: 13/Feb/19 18:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7825: [BEAM-6623] 
Exercise ValidatesRunner batch tests in Python 3
URL: https://github.com/apache/beam/pull/7825
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 198272)
Time Spent: 1h 10m  (was: 1h)

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Robbe
>Priority: Critical
>  Labels: triaged
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6572) Dataflow Python runner should use a Python-3 compatible container when starting a Python 3 pipeline.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6572?focusedWorklogId=198270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198270
 ]

ASF GitHub Bot logged work on BEAM-6572:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:42
Start Date: 13/Feb/19 18:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7829: [BEAM-6572] 
Makes Dataflow runner use python3-fnapi container when applicable.
URL: https://github.com/apache/beam/pull/7829
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 198270)
Time Spent: 0.5h  (was: 20m)

> Dataflow Python runner should use a Python-3 compatible container when 
> starting a Python 3 pipeline.
> 
>
> Key: BEAM-6572
> URL: https://issues.apache.org/jira/browse/BEAM-6572
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Valentyn Tymofieiev
>Priority: Blocker
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Container choice will be affected by:
> - SDK version (dev/released)
> - Execution mode (FnApi/Legacy)
> - Python interpreter version.
> cc: [~ccy] [~markflyhigh] [~altay]





[jira] [Work logged] (BEAM-5315) Finish Python 3 porting for io module

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5315?focusedWorklogId=198269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198269
 ]

ASF GitHub Bot logged work on BEAM-5315:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:42
Start Date: 13/Feb/19 18:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #7804: [BEAM-5315] 
python 3 datastore fail gracefully
URL: https://github.com/apache/beam/pull/7804
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 198269)
Time Spent: 19.5h  (was: 19h 20m)

> Finish Python 3 porting for io module
> -
>
> Key: BEAM-5315
> URL: https://issues.apache.org/jira/browse/BEAM-5315
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Labels: triaged
>  Time Spent: 19.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6392) Add support for new BigQuery streaming read API to BigQueryIO

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6392?focusedWorklogId=198267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198267
 ]

ASF GitHub Bot logged work on BEAM-6392:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:38
Start Date: 13/Feb/19 18:38
Worklog Time Spent: 10m 
  Work Description: kmjung commented on issue #7441: [BEAM-6392] Add 
support for the BigQuery read API to BigQueryIO.
URL: https://github.com/apache/beam/pull/7441#issuecomment-463315433
 
 
   @chamikaramj we should be good to go here.
 



Issue Time Tracking
---

Worklog Id: (was: 198267)
Time Spent: 3h 50m  (was: 3h 40m)

> Add support for new BigQuery streaming read API to BigQueryIO
> -
>
> Key: BEAM-6392
> URL: https://issues.apache.org/jira/browse/BEAM-6392
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Kenneth Jung
>Assignee: Kenneth Jung
>Priority: Major
>  Labels: triaged
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> BigQuery has developed a new streaming egress API which will soon reach 
> public availability. Add support for the new API in BigQueryIO.





[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=198266&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198266
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:38
Start Date: 13/Feb/19 18:38
Worklog Time Spent: 10m 
  Work Description: kanterov commented on issue #7353: [BEAM-4461] Support 
inner and outer style joins in CoGroup.
URL: https://github.com/apache/beam/pull/7353#issuecomment-463315320
 
 
   @reuvenlax I'm going on vacation, but I will review during the week of 25th February
 



Issue Time Tracking
---

Worklog Id: (was: 198266)
Time Spent: 19.5h  (was: 19h 20m)

> Create a library of useful transforms that use schemas
> --
>
> Key: BEAM-4461
> URL: https://issues.apache.org/jira/browse/BEAM-4461
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Labels: triaged
>  Time Spent: 19.5h
>  Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.





[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=198263&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198263
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:33
Start Date: 13/Feb/19 18:33
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7810: [BEAM-6650] Add bundle 
test with checkpointing for keyed processing
URL: https://github.com/apache/beam/pull/7810#issuecomment-463313948
 
 
   @aljoscha I'm assuming insertion order at the state backend at the moment. 
Not sure if that assumption holds. At least for the tests it works.
 



Issue Time Tracking
---

Worklog Id: (was: 198263)
Time Spent: 1h 40m  (was: 1.5h)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Elements emitted during the finishBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.





[jira] [Work logged] (BEAM-5816) Flink runner starts new bundles while disposing operator

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5816?focusedWorklogId=198254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198254
 ]

ASF GitHub Bot logged work on BEAM-5816:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:16
Start Date: 13/Feb/19 18:16
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7719: [BEAM-5816] Finish Flink 
bundles exactly once
URL: https://github.com/apache/beam/pull/7719#issuecomment-463307702
 
 
   >1. After an exception or cancel, we don't want any more "normal" 
processing (looks like this is now addressed)
   
   Yes, this is addressed. Bundle will only be finalized in case of normal 
operation. Though my original intention here was to fix the race condition 
between the timer thread for finishing the bundle and a regular bundle finish
   
   >2. Best effort resource cleanup for user code (not sure about that one, 
would this happen after an exception)?
   
   Cleanup is performed via DoFn#teardown which will be called in any case in 
`dispose()`.
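
Schematically, the contract described in this thread looks like the following (Python for brevity; a hypothetical sketch, not the actual DoFnOperator code): finish the bundle only on normal operation, but always run teardown on dispose.

```python
class BundleOperator:
    """Schematic of the lifecycle contract: finalize the bundle only in
    the normal case; always run DoFn teardown when the operator is
    disposed, even after an exception or cancellation."""

    def __init__(self, dofn):
        self.dofn = dofn
        self.bundle_open = False
        self.failed = False

    def process(self, element):
        # Start a bundle lazily on the first element.
        if not self.bundle_open:
            self.dofn.start_bundle()
            self.bundle_open = True
        try:
            self.dofn.process(element)
        except Exception:
            self.failed = True
            raise

    def dispose(self):
        try:
            # Finalize only in the normal case, so disposal after a
            # failure never finishes (or starts) another bundle.
            if self.bundle_open and not self.failed:
                self.dofn.finish_bundle()
                self.bundle_open = False
        finally:
            # Best-effort cleanup runs unconditionally (DoFn#teardown).
            self.dofn.teardown()
```

After a failure, `dispose()` skips `finish_bundle` but still calls `teardown`, matching the two points discussed above.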
 



Issue Time Tracking
---

Worklog Id: (was: 198254)
Time Spent: 2h 40m  (was: 2.5h)

> Flink runner starts new bundles while disposing operator 
> -
>
> Key: BEAM-5816
> URL: https://issues.apache.org/jira/browse/BEAM-5816
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink, triaged
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We sometimes see exceptions when shutting down portable flink pipelines 
> (either due to cancellation or failure):
> {code}
> 2018-10-19 15:54:52,905 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error during 
> disposal of stream operator.
> java.lang.RuntimeException: Failed to finish remote bundle
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:241)
>   at 
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.finishBundle(DoFnRunnerWithMetricsUpdate.java:87)
>   at 
> org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.finishBundle(SimplePushbackSideInputDoFnRunner.java:118)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.invokeFinishBundle(DoFnOperator.java:674)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.dispose(DoFnOperator.java:391)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.dispose(ExecutableStageDoFnOperator.java:166)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:473)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:374)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Already closed.
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:251)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:238)
>   ... 9 more
>   Suppressed: java.lang.IllegalStateException: Processing bundle failed, 
> TODO: [BEAM-3962] abort bundle.
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:266)
>   ... 10 more
> {code}





[jira] [Work logged] (BEAM-5315) Finish Python 3 porting for io module

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5315?focusedWorklogId=198255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198255
 ]

ASF GitHub Bot logged work on BEAM-5315:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:20
Start Date: 13/Feb/19 18:20
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #7804: [BEAM-5315] 
python 3 datastore fail gracefully
URL: https://github.com/apache/beam/pull/7804#discussion_r256525248
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -93,7 +93,7 @@ setenv =
   RUN_SKIPPED_PY3_TESTS=0
 extras = test,gcp
 modules =
-  
apache_beam.typehints,apache_beam.coders,apache_beam.options,apache_beam.tools,apache_beam.utils,apache_beam.internal,apache_beam.metrics,apache_beam.portability,apache_beam.pipeline_test,apache_beam.pvalue_test,apache_beam.runners,apache_beam.io.hadoopfilesystem_test,apache_beam.io.gcp.tests.utils_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.internal,apache_beam.io.filesystem_test,apache_beam.io.filesystems_test,apache_beam.io.sources_test,apache_beam.transforms,apache_beam.testing,apache_beam.io.filesystemio_test,apache_beam.io.localfilesystem_test,apache_beam.io.range_trackers_test,apache_beam.io.restriction_trackers_test,apache_beam.io.source_test_utils_test,apache_beam.io.concat_source_test,apache_beam.io.filebasedsink_test,apache_beam.io.filebasedsource_test,apache_beam.io.textio_test,apache_beam.io.tfrecordio_test,apache_beam.examples.wordcount_debugging_test,apache_beam.examples.wordcount_minimal_test,apache_beam.examples.wordcount_test,apache_beam.io.parquetio_test,apache_beam.io.gcp.gcsfilesystem_test,apache_beam.io.gcp.gcsio_integration_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_tools_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.gcp.pubsub_test
+  
apache_beam.typehints,apache_beam.coders,apache_beam.options,apache_beam.tools,apache_beam.utils,apache_beam.internal,apache_beam.metrics,apache_beam.portability,apache_beam.pipeline_test,apache_beam.pvalue_test,apache_beam.runners,apache_beam.io.hadoopfilesystem_test,apache_beam.io.gcp.tests.utils_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.internal,apache_beam.io.filesystem_test,apache_beam.io.filesystems_test,apache_beam.io.sources_test,apache_beam.transforms,apache_beam.testing,apache_beam.io.filesystemio_test,apache_beam.io.localfilesystem_test,apache_beam.io.range_trackers_test,apache_beam.io.restriction_trackers_test,apache_beam.io.source_test_utils_test,apache_beam.io.concat_source_test,apache_beam.io.filebasedsink_test,apache_beam.io.filebasedsource_test,apache_beam.io.textio_test,apache_beam.io.tfrecordio_test,apache_beam.examples.wordcount_debugging_test,apache_beam.examples.wordcount_minimal_test,apache_beam.examples.wordcount_test,apache_beam.io.parquetio_test,apache_beam.io.gcp.gcsfilesystem_test,apache_beam.io.gcp.gcsio_integration_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_tools_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.gcp.pubsub_test,apache_beam.io.gcp.datastore,apache_beam.io.gcp.datastore_write_it_test
 
 Review comment:
   Adding Datastore tests doesn't change anything since we skip them anyway, 
but it's a harmless change, so we can merge it. Let's try to simplify this to  
run all tests in a future PR.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198255)
Time Spent: 19h 20m  (was: 19h 10m)

> Finish Python 3 porting for io module
> -
>
> Key: BEAM-5315
> URL: https://issues.apache.org/jira/browse/BEAM-5315
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Labels: triaged
>  Time Spent: 19h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6623?focusedWorklogId=198251&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198251
 ]

ASF GitHub Bot logged work on BEAM-6623:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:08
Start Date: 13/Feb/19 18:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #7825: [BEAM-6623] 
Exercise ValidatesRunner batch tests in Python 3
URL: https://github.com/apache/beam/pull/7825#issuecomment-463304891
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 198251)
Time Spent: 50m  (was: 40m)

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Robbe
>Priority: Critical
>  Labels: triaged
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-6665) SDK source tarball is different when created on Python 2 and Python 3

2019-02-13 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-6665:
-

 Summary: SDK source tarball is different when created on Python 2 
and Python 3
 Key: BEAM-6665
 URL: https://issues.apache.org/jira/browse/BEAM-6665
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


When we create a source tarball via `python setup.py sdist` on Python 2, the 
generated protocol buffer code includes relative imports.

Because of that, a tarball created on a Python 2 interpreter cannot be used on 
Python 3.

AFAIK, we release only one source tarball to PyPI, so if possible we should 
make the source distribution of Beam compatible with both Python 2 and Python 3. 

When we create a source tarball on Python 3, we call futurize on the generated 
*.pb2.py files[1]. It looks like this does not happen on Python 2, which we can 
fix in this issue.

[[1] 
https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122|https://github.com/apache/beam/blob/d6dc3602c808ce1961f46098deb898e5c7480ad4/sdks/python/gen_protos.py#L122]
 

cc [~altay], [~charlesche...@yahoo.com], [~RobbeSneyders]





[jira] [Work logged] (BEAM-5816) Flink runner starts new bundles while disposing operator

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5816?focusedWorklogId=198253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198253
 ]

ASF GitHub Bot logged work on BEAM-5816:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:12
Start Date: 13/Feb/19 18:12
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #7719: [BEAM-5816] Finish 
Flink bundles exactly once
URL: https://github.com/apache/beam/pull/7719#discussion_r256522068
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
 ##
 @@ -178,7 +179,7 @@
  private transient PushedBackElementsHandler<WindowedValue<InputT>> 
pushedBackElementsHandler;
 
   // bundle control
-  private transient boolean bundleStarted = false;
+  private transient AtomicBoolean bundleStarted;
 
 Review comment:
   Yes, there is a timer thread for the "finish bundle by time" feature, see 
`FlinkPipelineOptions#setMaxBundleTimeMills`.
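The answer above confirms the race the review question asks about: the max-bundle-time timer thread can contend with the main task thread when finishing a bundle. A stdlib-only sketch of the pattern — the `BundleGuard` class and its counter are illustrative, not Beam's actual `DoFnOperator` — shows how `AtomicBoolean.compareAndSet` makes the started-to-finished transition atomic, so whichever thread wins performs the finish work exactly once:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Illustrative sketch (not Beam's real DoFnOperator): an AtomicBoolean guards
 * bundle state so the timer thread and the main thread cannot both finish the
 * same bundle.
 */
public class BundleGuard {
    private final AtomicBoolean bundleStarted = new AtomicBoolean(false);
    private int finishedBundles = 0;

    /** Called on the element path; opens a bundle at most once. */
    public void ensureBundleStarted() {
        if (bundleStarted.compareAndSet(false, true)) {
            // open a new bundle (e.g. invoke startBundle on the DoFn)
        }
    }

    /** Called from the timer thread and from checkpoint/dispose paths. */
    public void invokeFinishBundle() {
        // compareAndSet makes the transition atomic: whichever thread flips
        // the flag first performs the finish work; the loser is a no-op.
        if (bundleStarted.compareAndSet(true, false)) {
            finishedBundles++;
        }
    }

    public int getFinishedBundles() {
        return finishedBundles;
    }

    public static void main(String[] args) throws InterruptedException {
        BundleGuard guard = new BundleGuard();
        guard.ensureBundleStarted();
        // Simulate the timer thread and the dispose path racing to finish.
        Thread timer = new Thread(guard::invokeFinishBundle);
        timer.start();
        guard.invokeFinishBundle();
        timer.join();
        System.out.println(guard.getFinishedBundles()); // prints 1: finished exactly once
    }
}
```

With a plain `boolean`, both paths could observe `bundleStarted == true` and run the finish logic twice; the CAS collapses check and update into one atomic step.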
 



Issue Time Tracking
---

Worklog Id: (was: 198253)
Time Spent: 2.5h  (was: 2h 20m)

> Flink runner starts new bundles while disposing operator 
> -
>
> Key: BEAM-5816
> URL: https://issues.apache.org/jira/browse/BEAM-5816
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink, triaged
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We sometimes see exceptions when shutting down portable flink pipelines 
> (either due to cancellation or failure):
> {code}
> 2018-10-19 15:54:52,905 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error during 
> disposal of stream operator.
> java.lang.RuntimeException: Failed to finish remote bundle
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:241)
>   at 
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.finishBundle(DoFnRunnerWithMetricsUpdate.java:87)
>   at 
> org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.finishBundle(SimplePushbackSideInputDoFnRunner.java:118)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.invokeFinishBundle(DoFnOperator.java:674)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.dispose(DoFnOperator.java:391)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.dispose(ExecutableStageDoFnOperator.java:166)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:473)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:374)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Already closed.
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:251)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:238)
>   ... 9 more
>   Suppressed: java.lang.IllegalStateException: Processing bundle failed, 
> TODO: [BEAM-3962] abort bundle.
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:266)
>   ... 10 more
> {code}





[jira] [Work logged] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6623?focusedWorklogId=198252&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198252
 ]

ASF GitHub Bot logged work on BEAM-6623:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:08
Start Date: 13/Feb/19 18:08
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #7825: [BEAM-6623] 
Exercise ValidatesRunner batch tests in Python 3
URL: https://github.com/apache/beam/pull/7825#issuecomment-462945640
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 198252)
Time Spent: 1h  (was: 50m)

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Robbe
>Priority: Critical
>  Labels: triaged
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6623) Dataflow ValidatesRunner test suite should also exercise ValidatesRunner tests under Python 3.

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6623?focusedWorklogId=198250&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198250
 ]

ASF GitHub Bot logged work on BEAM-6623:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:08
Start Date: 13/Feb/19 18:08
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #7825: [BEAM-6623] 
Exercise ValidatesRunner batch tests in Python 3
URL: https://github.com/apache/beam/pull/7825#issuecomment-463304817
 
 
   A Java unit test failed the Java PreCommit, but it is definitely unrelated to 
this PR since the changes are only in Gradle scripts and don't touch any Java code.
   
   Do I need to rebase?
 



Issue Time Tracking
---

Worklog Id: (was: 198250)
Time Spent: 40m  (was: 0.5h)

> Dataflow ValidatesRunner test suite should also exercise ValidatesRunner 
> tests under Python 3.
> --
>
> Key: BEAM-6623
> URL: https://issues.apache.org/jira/browse/BEAM-6623
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Robbe
>Priority: Critical
>  Labels: triaged
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-3310) Push metrics to a backend in an runner agnostic way

2019-02-13 Thread JIRA


[ 
https://issues.apache.org/jira/browse/BEAM-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767420#comment-16767420
 ] 

Ismaël Mejía commented on BEAM-3310:


This ticket was still open because not all of the sub-tasks have been solved. 
But you received a notification because I updated the core ticket to link to 
some duplicate Jira issues while doing triage.

However, we found an interesting regression related to this: since the move to 
Gradle, a metrics test of the Spark ValidatesRunner kind takes 10 minutes where 
it previously took seconds. Can you please take a look at BEAM-6633.

> Push metrics to a backend in an runner agnostic way
> ---
>
> Key: BEAM-3310
> URL: https://issues.apache.org/jira/browse/BEAM-3310
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-extensions-metrics, sdk-java-core
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 18h 50m
>  Remaining Estimate: 0h
>
> The idea is to avoid relying on the runners to provide access to the metrics 
> (either at the end of the pipeline or while it runs) because they don't have 
> all the same capabilities towards metrics (e.g. spark runner configures sinks 
>  like csv, graphite or in memory sinks using the spark engine conf). The 
> target is to push the metrics in the common runner code so that no matter the 
> chosen runner, a user can get his metrics out of beam.
> Here is the link to the discussion thread on the dev ML: 
> https://lists.apache.org/thread.html/01a80d62f2df6b84bfa41f05e15fda900178f882877c294fed8be91e@%3Cdev.beam.apache.org%3E
> And the design doc:
> https://s.apache.org/runner_independent_metrics_extraction





[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=198249&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198249
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:06
Start Date: 13/Feb/19 18:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7810: [BEAM-6650] Add bundle 
test with checkpointing for keyed processing
URL: https://github.com/apache/beam/pull/7810#issuecomment-463304417
 
 
   Run Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 198249)
Time Spent: 1.5h  (was: 1h 20m)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Elements emitted during the finishBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.





[jira] [Work logged] (BEAM-5816) Flink runner starts new bundles while disposing operator

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5816?focusedWorklogId=198244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198244
 ]

ASF GitHub Bot logged work on BEAM-5816:


Author: ASF GitHub Bot
Created on: 13/Feb/19 18:00
Start Date: 13/Feb/19 18:00
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #7719: [BEAM-5816] Finish 
Flink bundles exactly once
URL: https://github.com/apache/beam/pull/7719#issuecomment-463302036
 
 
   I think there are two things we want to accomplish:
   1) After an exception or cancel, we don't want any more "normal" processing 
(looks like this is now addressed).
   2) Best-effort resource cleanup for user code (not sure about that one - 
would this happen after an exception?)
 



Issue Time Tracking
---

Worklog Id: (was: 198244)
Time Spent: 2h 20m  (was: 2h 10m)

> Flink runner starts new bundles while disposing operator 
> -
>
> Key: BEAM-5816
> URL: https://issues.apache.org/jira/browse/BEAM-5816
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink, triaged
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> We sometimes see exceptions when shutting down portable flink pipelines 
> (either due to cancellation or failure):
> {code}
> 2018-10-19 15:54:52,905 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error during 
> disposal of stream operator.
> java.lang.RuntimeException: Failed to finish remote bundle
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:241)
>   at 
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.finishBundle(DoFnRunnerWithMetricsUpdate.java:87)
>   at 
> org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.finishBundle(SimplePushbackSideInputDoFnRunner.java:118)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.invokeFinishBundle(DoFnOperator.java:674)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.dispose(DoFnOperator.java:391)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.dispose(ExecutableStageDoFnOperator.java:166)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:473)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:374)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Already closed.
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:251)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:238)
>   ... 9 more
>   Suppressed: java.lang.IllegalStateException: Processing bundle failed, 
> TODO: [BEAM-3962] abort bundle.
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:266)
>   ... 10 more
> {code}





[jira] [Work logged] (BEAM-5816) Flink runner starts new bundles while disposing operator

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5816?focusedWorklogId=198241&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198241
 ]

ASF GitHub Bot logged work on BEAM-5816:


Author: ASF GitHub Bot
Created on: 13/Feb/19 17:57
Start Date: 13/Feb/19 17:57
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #7719: [BEAM-5816] 
Finish Flink bundles exactly once
URL: https://github.com/apache/beam/pull/7719#discussion_r256516130
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
 ##
 @@ -178,7 +179,7 @@
  private transient PushedBackElementsHandler<WindowedValue<InputT>> 
pushedBackElementsHandler;
 
   // bundle control
-  private transient boolean bundleStarted = false;
+  private transient AtomicBoolean bundleStarted;
 
 Review comment:
   Is there a scenario of multiple threads accessing this?
 



Issue Time Tracking
---

Worklog Id: (was: 198241)
Time Spent: 2h 10m  (was: 2h)

> Flink runner starts new bundles while disposing operator 
> -
>
> Key: BEAM-5816
> URL: https://issues.apache.org/jira/browse/BEAM-5816
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink, triaged
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We sometimes see exceptions when shutting down portable flink pipelines 
> (either due to cancellation or failure):
> {code}
> 2018-10-19 15:54:52,905 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Error during 
> disposal of stream operator.
> java.lang.RuntimeException: Failed to finish remote bundle
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:241)
>   at 
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.finishBundle(DoFnRunnerWithMetricsUpdate.java:87)
>   at 
> org.apache.beam.runners.core.SimplePushbackSideInputDoFnRunner.finishBundle(SimplePushbackSideInputDoFnRunner.java:118)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.invokeFinishBundle(DoFnOperator.java:674)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.dispose(DoFnOperator.java:391)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator.dispose(ExecutableStageDoFnOperator.java:166)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:473)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:374)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Already closed.
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95)
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:251)
>   at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$SdkHarnessDoFnRunner.finishBundle(ExecutableStageDoFnOperator.java:238)
>   ... 9 more
>   Suppressed: java.lang.IllegalStateException: Processing bundle failed, 
> TODO: [BEAM-3962] abort bundle.
>   at 
> org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:266)
>   ... 10 more
> {code}





[jira] [Commented] (BEAM-2185) KafkaIO bounded source

2019-02-13 Thread Raghu Angadi (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767411#comment-16767411
 ] 

Raghu Angadi commented on BEAM-2185:


Yes. The whole batch use case should document and/or log a big caveat listing 
all these concerns.
When a user sets `enable.auto.commit = true`, the user is essentially 
introducing parallel checkpoint-like functionality outside of Apache Beam's 
control. I think, as with 'commitOffsetsInFinalize()', it can help with resuming 
from a reasonable point on restart, but it does not guarantee exactly-once (in 
fact only 'update' guarantees exactly-once in Beam; no restart of a pipeline 
does).

> KafkaIO bounded source
> --
>
> Key: BEAM-2185
> URL: https://issues.apache.org/jira/browse/BEAM-2185
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Raghu Angadi
>Priority: Major
>
> KafkaIO could be a useful source for batch applications as well. It could 
> implement a bounded source. The primary question is how the bounds are 
> specified.
> One option: the source specifies a time period (say 9am-10am), and KafkaIO 
> fetches the appropriate start and end offsets based on the time index in Kafka. 
> This would suit many batch applications that are launched on a schedule.
> Another option is to always read till the end and commit the offsets to 
> Kafka. Handling failures and multiple runs of a task might be complicated.
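The first option above — bounding the read by a time window — can be sketched without Kafka at all. In the real API a consumer would resolve timestamps to offsets via Kafka's time index (`KafkaConsumer#offsetsForTimes`); the in-memory "log" and lookup below are stand-ins, not Kafka calls:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of a time-bounded read: resolve a [start, end) time
 * window to an offset range, then read exactly that range. The in-memory log
 * stands in for a Kafka topic partition.
 */
public class TimeBoundedRead {
    record Record(long offset, Instant timestamp, String value) {}

    /** Smallest offset whose timestamp is >= ts (what a time-index lookup returns). */
    static long offsetForTime(List<Record> log, Instant ts) {
        for (Record r : log) {
            if (!r.timestamp().isBefore(ts)) return r.offset();
        }
        return log.size(); // nothing at or after ts: range ends at the log's end
    }

    /** Read records with start <= timestamp < end, i.e. offsets [startOffset, endOffset). */
    static List<String> readWindow(List<Record> log, Instant start, Instant end) {
        long startOffset = offsetForTime(log, start);
        long endOffset = offsetForTime(log, end);
        List<String> out = new ArrayList<>();
        for (long o = startOffset; o < endOffset; o++) {
            out.add(log.get((int) o).value());
        }
        return out;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2019-02-13T09:00:00Z");
        List<Record> log = new ArrayList<>();
        for (int i = 0; i < 120; i++) { // one record per minute, 08:00 through 09:59
            log.add(new Record(i, t0.plus(Duration.ofMinutes(i - 60)), "r" + i));
        }
        // The 9am-10am window from the description above:
        List<String> window = readWindow(log, t0, t0.plus(Duration.ofHours(1)));
        System.out.println(window.size()); // prints 60: the records from 09:00 to 09:59
    }
}
```

Capturing both offsets up front is what makes the source bounded: records appended after the end offset is resolved are simply outside the range, so repeated runs over the same window read the same data.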





[jira] [Closed] (BEAM-5375) KafkaIO reader should handle runtime exceptions kafka client

2019-02-13 Thread Raghu Angadi (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi closed BEAM-5375.
--
Resolution: Fixed

> KafkaIO reader should handle runtime exceptions kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Labels: triaged
> Fix For: 2.7.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E]).
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; failures do happen at runtime. 
> That should result in an 'IOException' from start()/advance(). The runners 
> will then properly report the failure and close the readers. 
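The fix the issue describes — catch runtime exceptions in the consumer poll thread and surface them as an `IOException` from `start()`/`advance()` — can be sketched with the standard library alone. `PollingReader` below is a hypothetical stand-in, not KafkaIO's real reader; the point is the pattern: the background thread records the `RuntimeException` instead of dying silently, and the next `advance()` rethrows it so the runner can report and close the reader.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Illustrative sketch (hypothetical class, not KafkaIO's reader): a background
 * poll thread feeds a queue; any RuntimeException it hits is stashed and
 * rethrown as an IOException from advance() instead of being lost.
 */
public class PollingReader implements AutoCloseable {
    private final BlockingQueue<String> records = new ArrayBlockingQueue<>(16);
    private final AtomicReference<RuntimeException> pollError = new AtomicReference<>();
    private final Thread pollThread;
    private volatile boolean closed = false;
    private String current;

    public PollingReader(Iterable<String> source) {
        pollThread = new Thread(() -> {
            try {
                for (String r : source) {   // stand-in for consumer.poll(...)
                    if (closed) return;
                    records.put(r);
                }
            } catch (RuntimeException e) {
                pollError.set(e);           // remember the error instead of dying silently
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pollThread.start();
    }

    /** Returns true if a record is available; throws if the poll thread failed. */
    public boolean advance() throws IOException {
        RuntimeException e = pollError.get();
        if (e != null) {
            throw new IOException("poll thread failed", e); // surfaced to the runner
        }
        current = records.poll();
        return current != null;
    }

    public String getCurrent() {
        return current;
    }

    @Override
    public void close() {
        closed = true;
        pollThread.interrupt();
    }
}
```

Usage mirrors the issue's contract: the runner loops on `advance()`, and a poll-thread failure shows up as an `IOException` at the next call rather than as a reader that silently stops producing records.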





[jira] [Updated] (BEAM-5375) KafkaIO reader should handle runtime exceptions kafka client

2019-02-13 Thread Raghu Angadi (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-5375:
---
Fix Version/s: 2.7.0

> KafkaIO reader should handle runtime exceptions kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Labels: triaged
> Fix For: 2.7.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E]).
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; failures do happen at runtime. 
> That should result in an 'IOException' from start()/advance(). The runners 
> will then properly report the failure and close the readers. 





[jira] [Comment Edited] (BEAM-2466) Add Kafka Streams runner

2019-02-13 Thread Alexey Romanenko (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766129#comment-16766129
 ] 

Alexey Romanenko edited comment on BEAM-2466 at 2/13/19 5:31 PM:
-

[~teabot]
1) I'm not very familiar with KStreams API. Does it require to have Kafka topic 
as input in any case? Is KStreams source heavily linked with Kafka topics?
2) "Only streaming support" - well, this is kind of against initial Beam idea 
of easily switching between batch and streaming pipelines (since it won't be 
possible to use KafkaRunner for other pipelines, written in traditional Beam 
way). So, it's quite serious limitation, imho, but, perhaps, there could be 
some workarounds for that.


was (Author: aromanenko):
[~teabot]
1) I'm not very familiar with KStreams API. Does it require to have Kafka topic 
as input in any case? Does KStreams source heavily linked with Kafka topics?
2) "Only streaming support" - well, this is kind of against initial Beam idea 
of easily switching between batch and streaming pipelines (since it won't be 
possible to use KafkaRunner for other pipelines, written in traditional Beam 
way). So, it's quite serious limitation, imho, but, perhaps, there could be 
some workarounds for that.

> Add Kafka Streams runner
> 
>
> Key: BEAM-2466
> URL: https://issues.apache.org/jira/browse/BEAM-2466
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Lorand Peter Kasler
>Assignee: Kai Jiang
>Priority: Minor
>  Labels: triaged
>
> Kafka Streams (https://kafka.apache.org/documentation/streams)  has more and 
> more features that could make it a viable candidate for a streaming runner. 
> It uses DataFlow-like model





[jira] [Created] (BEAM-6664) Temporarily convert dataflowKmsKey flag to experimental flag for Dataflow

2019-02-13 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-6664:
---

 Summary: Temporarily convert dataflowKmsKey flag to experimental 
flag for Dataflow
 Key: BEAM-6664
 URL: https://issues.apache.org/jira/browse/BEAM-6664
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp, sdk-py-core
Reporter: Udi Meiri
Assignee: Udi Meiri








[jira] [Commented] (BEAM-2466) Add Kafka Streams runner

2019-02-13 Thread Kai Jiang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767394#comment-16767394
 ] 

Kai Jiang commented on BEAM-2466:
-

[~teabot] IMHO, there should not be a 'batch processing' concept for 
KStreams. I think we need to limit Beam to streaming mode only for the 
KafkaStreamsRunner.
[~aromanenko] I think it requires a Kafka topic as streaming input. Internally, 
the KStreams source uses Kafka topics as input.

PoC branch: https://github.com/vectorijk/beam/tree/kafka-stream
Any ideas are welcome!



> Add Kafka Streams runner
> 
>
> Key: BEAM-2466
> URL: https://issues.apache.org/jira/browse/BEAM-2466
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Lorand Peter Kasler
>Assignee: Kai Jiang
>Priority: Minor
>  Labels: triaged
>
> Kafka Streams (https://kafka.apache.org/documentation/streams)  has more and 
> more features that could make it a viable candidate for a streaming runner. 
> It uses DataFlow-like model





[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=198205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198205
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 13/Feb/19 16:50
Start Date: 13/Feb/19 16:50
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7810: [BEAM-6650] Add bundle 
test with checkpointing for keyed processing
URL: https://github.com/apache/beam/pull/7810#issuecomment-463275855
 
 
   Run Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 198205)
Time Spent: 1h 10m  (was: 1h)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Elements emitted during the finishBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.





[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=198206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198206
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 13/Feb/19 16:50
Start Date: 13/Feb/19 16:50
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7810: [BEAM-6650] Add bundle 
test with checkpointing for keyed processing
URL: https://github.com/apache/beam/pull/7810#issuecomment-463275894
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198206)
Time Spent: 1h 20m  (was: 1h 10m)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Elements emitted during the finishBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=198204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198204
 ]

ASF GitHub Bot logged work on BEAM-6650:


Author: ASF GitHub Bot
Created on: 13/Feb/19 16:50
Start Date: 13/Feb/19 16:50
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #7810: [BEAM-6650] Add bundle 
test with checkpointing for keyed processing
URL: https://github.com/apache/beam/pull/7810#issuecomment-463275837
 
 
   Batch execution fails for ValidatesRunner due to memory issues on Jenkins.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198204)
Time Spent: 1h  (was: 50m)

> FlinkRunner fails to checkpoint elements emitted during finishBundle
> 
>
> Key: BEAM-6650
> URL: https://issues.apache.org/jira/browse/BEAM-6650
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Elements emitted during the finishBundle call in snapshotState are lost 
> after the pipeline is restored. This only happens when the operator is keyed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4461) Create a library of useful transforms that use schemas

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=198196&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198196
 ]

ASF GitHub Bot logged work on BEAM-4461:


Author: ASF GitHub Bot
Created on: 13/Feb/19 16:26
Start Date: 13/Feb/19 16:26
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7353: [BEAM-4461] Support 
inner and outer style joins in CoGroup.
URL: https://github.com/apache/beam/pull/7353#issuecomment-463265969
 
 
   @kanterov do you have any time to help review this PR?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198196)
Time Spent: 19h 20m  (was: 19h 10m)

> Create a library of useful transforms that use schemas
> --
>
> Key: BEAM-4461
> URL: https://issues.apache.org/jira/browse/BEAM-4461
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Labels: triaged
>  Time Spent: 19h 20m
>  Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4076) Schema followups

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=198194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198194
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Feb/19 16:25
Start Date: 13/Feb/19 16:25
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7635: [BEAM-4076] 
Generalize schema inputs to ParDo
URL: https://github.com/apache/beam/pull/7635#issuecomment-463265598
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198194)
Time Spent: 18h 40m  (was: 18.5h)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 18h 40m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4076) Schema followups

2019-02-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=198195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198195
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 13/Feb/19 16:25
Start Date: 13/Feb/19 16:25
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #7635: [BEAM-4076] 
Generalize schema inputs to ParDo
URL: https://github.com/apache/beam/pull/7635#issuecomment-463265698
 
 
   @kennknowles any comments on this PR?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 198195)
Time Spent: 18h 50m  (was: 18h 40m)

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 18h 50m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-6663) SerializablePipelineOptions should override toString()

2019-02-13 Thread Etienne Chauchot (JIRA)
Etienne Chauchot created BEAM-6663:
--

 Summary: SerializablePipelineOptions should override toString()
 Key: BEAM-6663
 URL: https://issues.apache.org/jira/browse/BEAM-6663
 Project: Beam
  Issue Type: Improvement
  Components: runner-core
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


Many runners store both the Options and the SerializableOptions in their 
context, even though SerializableOptions already holds both the Options and 
their JSON-serialized form. The Options can be obtained via 
SerializableOptions.get(), but the JSON cannot be obtained. I propose using 
only SerializableOptions inside the runners (as they all have serialization 
issues) and simply overriding toString() to return the JSON version.
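
The proposal amounts to a small change in the holder class: keep the JSON string alongside the deserialized options and let toString() expose it. A minimal, Beam-free sketch (the class and field names here are hypothetical, not the actual SerializablePipelineOptions implementation):

```java
import java.io.Serializable;

public class SerializableOptionsSketch {

    // Hypothetical holder mirroring the proposal: it stores both the
    // deserialized options and their JSON form, and overrides toString()
    // so logging/debugging a runner context prints the JSON directly.
    static final class SerializedOptions implements Serializable {
        private final String json;        // JSON-serialized form, kept for re-serialization
        private transient Object options; // deserialized options, rebuilt on demand

        SerializedOptions(Object options, String json) {
            this.options = options;
            this.json = json;
        }

        Object get() {
            return options;               // what runners already call today
        }

        @Override
        public String toString() {
            return json;                  // the proposed addition
        }
    }

    public static void main(String[] args) {
        SerializedOptions opts =
            new SerializedOptions(new Object(), "{\"runner\":\"FlinkRunner\"}");
        System.out.println(opts);         // prints the JSON form via toString()
    }
}
```

With this override, runners can hold a single SerializedOptions field and still get a readable JSON dump in logs for free, instead of tracking two parallel fields.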



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

