[jira] [Commented] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773652#comment-16773652 ] Ryan Williams commented on BEAM-4775: - cc [~robertwb] [~ajam...@google.com] [~Ardagan] I've moved the state of this work here from [#7823|https://github.com/apache/beam/pull/7823], and will keep it up to date. > JobService should support returning metrics > --- > > Key: BEAM-4775 > URL: https://issues.apache.org/jira/browse/BEAM-4775 > Project: Beam > Issue Type: Bug > Components: beam-model >Reporter: Eugene Kirpichov >Assignee: Ryan Williams >Priority: Major > Labels: triaged > Time Spent: 18h 50m > Remaining Estimate: 0h > > Design doc: https://s.apache.org/get-metrics-api. > h1. Relevant PRs in flight: > h2. Approved / Ready to merge: > * [#7890|https://github.com/apache/beam/pull/7890]: consolidate MetricResult > implementations > * [#7883|https://github.com/apache/beam/pull/7883]: Add > MetricQueryResults.allMetrics() helper > h2. Ready for Review: > * #[7915|https://github.com/apache/beam/pull/7915]: use MonitoringInfo data > model in Java SDK metrics > ** Depends on [#7867|https://github.com/apache/beam/pull/7867] > ** Both of these require adding a {{sdks/java/core}} dependency on the > {{model/fn-execution}} protos module. > *** I want to discuss whether that's ok. > *** It may not be totally necessary; see discussion on > #[7915|https://github.com/apache/beam/pull/7915]. > h2. Iterating / Discussing: > * [#7868|https://github.com/apache/beam/pull/7868]: MonitoringInfo URN tweaks > h2. Merged > * #7866: move function helpers from fn-harness to sdks/java/core > h2. Closed > * [#7876|https://github.com/apache/beam/pull/7876]: Clean up metric protos; > support integer distributions, gauges > h1. Likely pieces still to come: > I have these implemented in a branch, but need to pull them out into > manageable PRs: > * adding the job-API metrics RPC > * python support > h1. 
Previous Description: > [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] > currently doesn't appear to have a way for JobService to return metrics to a > user, even though > [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] > includes support for reporting SDK metrics to the runner harness. > Metrics are apparently necessary to run any ValidatesRunner tests because > PAssert needs to validate that the assertions succeeded. However, this > statement should be double-checked: perhaps it's possible to somehow work > with PAssert without metrics support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201756&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201756 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 21/Feb/19 04:57 Start Date: 21/Feb/19 04:57 Worklog Time Spent: 10m Work Description: ryan-williams commented on issue #7823: [DO NOT MERGE] [BEAM-4775] Second take on portable metrics over the job-server API URL: https://github.com/apache/beam/pull/7823#issuecomment-465862017 I've moved all the info in this PR's description to [BEAM-4775](https://issues.apache.org/jira/browse/BEAM-4775), which is probably a better place to track the state of this work! Closing this and adding a note to this effect to the title message. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201756) Time Spent: 18h 40m (was: 18.5h)
[jira] [Updated] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Williams updated BEAM-4775: Description: Design doc: [https://s.apache.org/get-metrics-api]. h1. Relevant PRs in flight: h2. Approved / Ready to merge: * [#7890|https://github.com/apache/beam/pull/7890]: consolidate MetricResult implementations * [#7883|https://github.com/apache/beam/pull/7883]: Add MetricQueryResults.allMetrics() helper h2. Ready for Review: * #[7915|https://github.com/apache/beam/pull/7915]: use MonitoringInfo data model in Java SDK metrics ** Depends on [#7867|https://github.com/apache/beam/pull/7867] ** Both of these require adding a {{sdks/java/core}} dependency on the {{model/fn-execution}} protos module. *** I want to discuss whether that's ok. *** It may not be totally necessary; see discussion on #[7915|https://github.com/apache/beam/pull/7915]. h2. Iterating / Discussing: * [#7868|https://github.com/apache/beam/pull/7868]: MonitoringInfo URN tweaks h2. Merged * [#7866|https://github.com/apache/beam/pull/7866]: move function helpers from fn-harness to sdks/java/core h2. Closed * [#7876|https://github.com/apache/beam/pull/7876]: Clean up metric protos; support integer distributions, gauges h1. Likely pieces still to come: I have these implemented in a branch, but need to pull them out into manageable PRs: * adding the job-API metrics RPC * python support h1. Previous Description: [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] currently doesn't appear to have a way for JobService to return metrics to a user, even though [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] includes support for reporting SDK metrics to the runner harness. Metrics are apparently necessary to run any ValidatesRunner tests because PAssert needs to validate that the assertions succeeded. 
However, this statement should be double-checked: perhaps it's possible to somehow work with PAssert without metrics support. was: Design doc: https://s.apache.org/get-metrics-api. h1. Relevant PRs in flight: h2. Approved / Ready to merge: * [#7890|https://github.com/apache/beam/pull/7890]: consolidate MetricResult implementations * [#7883|https://github.com/apache/beam/pull/7883]: Add MetricQueryResults.allMetrics() helper h2. Ready for Review: * #[7915|https://github.com/apache/beam/pull/7915]: use MonitoringInfo data model in Java SDK metrics ** Depends on [#7867|https://github.com/apache/beam/pull/7867] ** Both of these require adding a {{sdks/java/core}} dependency on the {{model/fn-execution}} protos module. *** I want to discuss whether that's ok. *** It may not be totally necessary; see discussion on #[7915|https://github.com/apache/beam/pull/7915]. h2. Iterating / Discussing: * [#7868|https://github.com/apache/beam/pull/7868]: MonitoringInfo URN tweaks h2. Merged * #7866: move function helpers from fn-harness to sdks/java/core h2. Closed * [#7876|https://github.com/apache/beam/pull/7876]: Clean up metric protos; support integer distributions, gauges h1. Likely pieces still to come: I have these implemented in a branch, but need to pull them out into manageable PRs: * adding the job-API metrics RPC * python support h1. Previous Description: [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] currently doesn't appear to have a way for JobService to return metrics to a user, even though [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] includes support for reporting SDK metrics to the runner harness. Metrics are apparently necessary to run any ValidatesRunner tests because PAssert needs to validate that the assertions succeeded. However, this statement should be double-checked: perhaps it's possible to somehow work with PAssert without metrics support. 
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201757&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201757 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 21/Feb/19 04:57 Start Date: 21/Feb/19 04:57 Worklog Time Spent: 10m Work Description: ryan-williams commented on pull request #7823: [DO NOT MERGE] [BEAM-4775] Second take on portable metrics over the job-server API URL: https://github.com/apache/beam/pull/7823 Issue Time Tracking --- Worklog Id: (was: 201757) Time Spent: 18h 50m (was: 18h 40m)
[jira] [Updated] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Williams updated BEAM-4775: Description: Design doc: https://s.apache.org/get-metrics-api. h1. Relevant PRs in flight: h2. Approved / Ready to merge: * [#7890|https://github.com/apache/beam/pull/7890]: consolidate MetricResult implementations * [#7883|https://github.com/apache/beam/pull/7883]: Add MetricQueryResults.allMetrics() helper h2. Ready for Review: * #[7915|https://github.com/apache/beam/pull/7915]: use MonitoringInfo data model in Java SDK metrics ** Depends on [#7867|https://github.com/apache/beam/pull/7867] ** Both of these require adding a {{sdks/java/core}} dependency on the {{model/fn-execution}} protos module. *** I want to discuss whether that's ok. *** It may not be totally necessary; see discussion on #[7915|https://github.com/apache/beam/pull/7915]. h2. Iterating / Discussing: * [#7868|https://github.com/apache/beam/pull/7868]: MonitoringInfo URN tweaks h2. Merged * #7866: move function helpers from fn-harness to sdks/java/core h2. Closed * [#7876|https://github.com/apache/beam/pull/7876]: Clean up metric protos; support integer distributions, gauges h1. Likely pieces still to come: I have these implemented in a branch, but need to pull them out into manageable PRs: * adding the job-API metrics RPC * python support h1. Previous Description: [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] currently doesn't appear to have a way for JobService to return metrics to a user, even though [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] includes support for reporting SDK metrics to the runner harness. Metrics are apparently necessary to run any ValidatesRunner tests because PAssert needs to validate that the assertions succeeded. 
However, this statement should be double-checked: perhaps it's possible to somehow work with PAssert without metrics support. was: [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] currently doesn't appear to have a way for JobService to return metrics to a user, even though [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] includes support for reporting SDK metrics to the runner harness. Metrics are apparently necessary to run any ValidatesRunner tests because PAssert needs to validate that the assertions succeeded. However, this statement should be double-checked: perhaps it's possible to somehow work with PAssert without metrics support.
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201750&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201750 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 21/Feb/19 04:42 Start Date: 21/Feb/19 04:42 Worklog Time Spent: 10m Work Description: ryan-williams commented on pull request #7876: [BEAM-4775] Clean up metric protos; support integer distributions, gauges URL: https://github.com/apache/beam/pull/7876 Issue Time Tracking --- Worklog Id: (was: 201750) Time Spent: 18h 20m (was: 18h 10m)
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201751 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 21/Feb/19 04:42 Start Date: 21/Feb/19 04:42 Worklog Time Spent: 10m Work Description: ryan-williams commented on issue #7876: [BEAM-4775] Clean up metric protos; support integer distributions, gauges URL: https://github.com/apache/beam/pull/7876#issuecomment-465859852 Thanks for the feedback, these discussions were very useful for me! As I mentioned [here](https://github.com/apache/beam/pull/7876#discussion_r258585934), these proto changes are basically backwards from the direction they are supposed to be evolving in. I'm going to close this out; I think I see how to move the other #7823-associated PRs forward orthogonally, and probably more simply for not needing to deal with new proto structures. Issue Time Tracking --- Worklog Id: (was: 201751) Time Spent: 18.5h (was: 18h 20m)
[jira] [Commented] (BEAM-6706) User reports trouble downloading 2.10.0 Dataflow worker image
[ https://issues.apache.org/jira/browse/BEAM-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773628#comment-16773628 ] Kenneth Knowles commented on BEAM-6706: --- I expect these are two different issues. > User reports trouble downloading 2.10.0 Dataflow worker image > - > > Key: BEAM-6706 > URL: https://issues.apache.org/jira/browse/BEAM-6706 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > > DataFlow however is throwing all sorts of errors. For example: > * Handler for GET > /v1.27/images/gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0/json > returned error: No such image: > gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0" > * while reading 'google-dockercfg' metadata: http status code: 404 while > fetching url > http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg"; > * Error syncing pod..." > The job gets stuck after starting a worker and after an hour or so it gives > up with a failure. 2.9.0 runs fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6706) User reports trouble downloading 2.10.0 Dataflow worker image
[ https://issues.apache.org/jira/browse/BEAM-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773630#comment-16773630 ] Kenneth Knowles commented on BEAM-6706: --- Based on the discussion here, the likelihood that there's a critical issue in the Beam release (versus Dataflow) seems lessened, so I'm lowering priority away from "Blocker".
[jira] [Updated] (BEAM-6706) User reports trouble downloading 2.10.0 Dataflow worker image
[ https://issues.apache.org/jira/browse/BEAM-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-6706: -- Priority: Major (was: Blocker)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201728 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 03:04 Start Date: 21/Feb/19 03:04 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465843623 > I should remove the deprecation, because overloading will prefer the more specific class. OK. I think we can still update the euphoria code (not in this PR). Let me know if I can help here. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201728) Time Spent: 2h (was: 1h 50m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 2h > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201727 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 03:04 Start Date: 21/Feb/19 03:04 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465843623 > I should remove the deprecation, because overloading will prefer the more specific class. OK. I think we can still update the euphoria code (not in this PR). Let me know if I can help here. Issue Time Tracking --- Worklog Id: (was: 201727) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201716&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201716 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 01:35 Start Date: 21/Feb/19 01:35 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465825766 I should remove the deprecation, because overloading will prefer the more specific class. Issue Time Tracking --- Worklog Id: (was: 201716) Time Spent: 1h 40m (was: 1.5h)
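For readers following BEAM-6720, the pass-through idea and the overload-resolution remark can be sketched in isolation. The classes below are hypothetical stand-ins written for this illustration; they are not Beam's real MapElements, SimpleFunction, or InferableFunction, and the real adapters in #7911 may look different. The sketch assumes only that the specific type extends the general one. A caller compiled against via(SimpleFunction) records that exact method descriptor in its bytecode, so removing the overload breaks linking even though a recompile would succeed; re-adding a delegating overload restores the descriptor.

```java
public class BinaryCompatSketch {
    // Hypothetical stand-in for the general function type (not the real Beam class).
    static class InferableFunction {}

    // Hypothetical stand-in for the specific subtype that older jars were compiled against.
    static class SimpleFunction extends InferableFunction {}

    // Hypothetical stand-in for the transform factory; returns a label so the
    // chosen overload is visible.
    static class MapElements {
        // The overload that remains after the refactoring.
        static String via(InferableFunction fn) {
            return "via(InferableFunction)";
        }

        // Pass-through adapter: restores the old method descriptor so that
        // already-compiled callers still link, then delegates. The cast is
        // required; without it this method would call itself.
        static String via(SimpleFunction fn) {
            return "via(SimpleFunction) -> " + via((InferableFunction) fn);
        }
    }

    public static void main(String[] args) {
        // The compiler picks the most specific applicable overload.
        System.out.println(MapElements.via(new SimpleFunction()));
        System.out.println(MapElements.via(new InferableFunction()));
    }
}
```

Because the compiler selects the most specific applicable overload, newly compiled code that passes the specific type also binds to the adapter, which is presumably why deprecating it would produce warnings for ordinary callers.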
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201713 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 21/Feb/19 01:22 Start Date: 21/Feb/19 01:22 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#issuecomment-465823091 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201713) Time Spent: 21h 10m (was: 21h) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 21h 10m > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201711&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201711 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 21/Feb/19 01:22 Start Date: 21/Feb/19 01:22 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#issuecomment-465823016 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201711) Time Spent: 20h 50m (was: 20h 40m) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 20h 50m > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201712&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201712 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 21/Feb/19 01:22 Start Date: 21/Feb/19 01:22 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#issuecomment-465823046 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201712) Time Spent: 21h (was: 20h 50m) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 21h > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201709&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201709 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 01:14 Start Date: 21/Feb/19 01:14 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465821377 This use (https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/translate/ReduceByKeyTranslator.java#L143) of SimpleFunction needs to be changed to use ProcessFunction. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201709) Time Spent: 1.5h (was: 1h 20m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201707 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 01:07 Start Date: 21/Feb/19 01:07 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465815162 There is a compiler warning failing the build (on Java PreCommit):
15:06:33 > Task :beam-sdks-java-extensions-euphoria:compileJava FAILED
15:06:33 /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit/src/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/translate/ReduceByKeyTranslator.java:69: warning: [deprecation] via(SimpleFunction) in MapElements has been deprecated
15:06:33 MapElements.via(new KeyValueExtractor<>(keyExtractor, valueExtractor));
15:06:33^
15:06:33 where InputT,OutputT are type-variables:
15:06:33 InputT extends Object declared in method via(SimpleFunction)
15:06:33 OutputT extends Object declared in method via(SimpleFunction)
15:06:33 error: warnings found and -Werror specified
15:06:33 1 error
15:06:33 1 warning
Is this because of the Deprecated annotation? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201707) Time Spent: 1h 20m (was: 1h 10m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201703 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 00:44 Start Date: 21/Feb/19 00:44 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465815162 There is a compiler warning failing the build:
15:06:33 > Task :beam-sdks-java-extensions-euphoria:compileJava FAILED
15:06:33 /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit/src/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/translate/ReduceByKeyTranslator.java:69: warning: [deprecation] via(SimpleFunction) in MapElements has been deprecated
15:06:33 MapElements.via(new KeyValueExtractor<>(keyExtractor, valueExtractor));
15:06:33^
15:06:33 where InputT,OutputT are type-variables:
15:06:33 InputT extends Object declared in method via(SimpleFunction)
15:06:33 OutputT extends Object declared in method via(SimpleFunction)
15:06:33 error: warnings found and -Werror specified
15:06:33 1 error
15:06:33 1 warning
Is this because of the Deprecated annotation? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201703) Time Spent: 1h (was: 50m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 1h > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201704 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 00:44 Start Date: 21/Feb/19 00:44 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465815195 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201704) Time Spent: 1h 10m (was: 1h) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201702&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201702 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 21/Feb/19 00:41 Start Date: 21/Feb/19 00:41 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465814593 Run Java_Examples_Dataflow PreCommit" This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201702) Time Spent: 50m (was: 40m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 50m > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6698) Portable Validates Runner Tests on Flink flaky after update to gradle5
[ https://issues.apache.org/jira/browse/BEAM-6698?focusedWorklogId=201699&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201699 ] ASF GitHub Bot logged work on BEAM-6698: Author: ASF GitHub Bot Created on: 21/Feb/19 00:30 Start Date: 21/Feb/19 00:30 Worklog Time Spent: 10m Work Description: adude3141 commented on issue #7877: [BEAM-6698] increase maxHeapSize to prevent OutOfMemoryError (direct … URL: https://github.com/apache/beam/pull/7877#issuecomment-465812378 @mxm Could you help with merging? I forgot to change the title when asking for review. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201699) Time Spent: 3h 40m (was: 3.5h) > Portable Validates Runner Tests on Flink flaky after update to gradle5 > --- > > Key: BEAM-6698 > URL: https://issues.apache.org/jira/browse/BEAM-6698 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Michael Luckey >Assignee: Michael Luckey >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > After the upgrade to gradle 5 [1], the two portable runner test projects on > jenkins > - beam_PostCommit_Java_PVR_Flink_Streaming [2] > - beam_PostCommit_Java_PVR_Flink_Batch [3] > became flaky. > A first investigation suggests the tests are failing on direct buffer > memory (e.g. [4]) while staging files, > although I am unsure whether this is really the root cause or something > that shows up after some other failure. 
> {noformat} > INFO: Transport failed > org.apache.beam.vendor.grpc.v1p13p1.io.netty.util.internal.OutOfDirectMemoryError: > failed to allocate 16777216 byte(s) of direct memory (used: 1895825695, max: > 1908932608) > {noformat} > As far as I know, we do not set `-XX:MaxDirectMemorySize` anywhere in our > setup, and neither does gradle itself. At least on my machine both gradle 4 and > gradle 5 stick to the same jvm default > {noformat} > ### > sun.misc.VM.maxDirectMemory(): 1908932608 Bytes > sun.misc.VM.maxDirectMemory(): 1820 MB > ### > {noformat} > > Unfortunately this does not reproduce on my local machine. We might try to > work around this by setting `-XX:MaxDirectMemorySize=3G`, but this would > probably only hide the problem. It might still be helpful to increase it > temporarily on a branch, just to be sure that this is indeed the root cause. > [1] https://issues.apache.org/jira/browse/BEAM-6630 > [2] [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/] > [3] [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/] > [4] > [https://scans.gradle.com/s/tpo3yffjznfxa/tests/yobvrae4rwsg4-go44ti5iq45vq] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
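The numbers in the OutOfDirectMemoryError quoted in BEAM-6698 can be checked by hand. The sketch below only redoes that arithmetic with the three values from the log; the helper names `fits` and `to_mib` are made up for illustration, and nothing here inspects a real JVM.

```python
# Values copied from the OutOfDirectMemoryError message in BEAM-6698.
REQUESTED_BYTES = 16777216
USED_BYTES = 1895825695
MAX_BYTES = 1908932608

MIB = 1024 * 1024


def fits(requested, used, limit):
    """Return True if allocating `requested` more bytes stays within `limit`."""
    return used + requested <= limit


def to_mib(n_bytes):
    """Convert bytes to whole MiB, truncating like integer division."""
    return n_bytes // MIB


if __name__ == "__main__":
    print("limit:", to_mib(MAX_BYTES), "MiB")
    print("request:", to_mib(REQUESTED_BYTES), "MiB")
    print("headroom:", MAX_BYTES - USED_BYTES, "bytes")
    print("fits under current limit:", fits(REQUESTED_BYTES, USED_BYTES, MAX_BYTES))
    print("fits under 3G limit:", fits(REQUESTED_BYTES, USED_BYTES, 3 * 1024 ** 3))
```

The 16 MiB request exceeds the remaining headroom, which matches the reported failure. A 3G limit would let this particular allocation through, but that says nothing about whether usage would keep growing afterwards.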
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201691&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201691 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:48 Start Date: 20/Feb/19 23:48 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465802015 Run Python PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201691) Time Spent: 5.5h (was: 5h 20m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 5.5h > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
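The `TypeError: unhashable type` in the traceback above is raised by the dict membership test. One common way to hit it on Python 3 is a class that defines `__eq__` without `__hash__`. Whether that is what happens with the real `TableReference` is an assumption; `TableRef`, `as_key` and `lookup_fails` below are hypothetical stand-ins, not Beam code.

```python
class TableRef(object):
    """Hypothetical stand-in for a message class such as TableReference."""

    def __init__(self, project, dataset, table):
        self.project = project
        self.dataset = dataset
        self.table = table

    # On Python 3, defining __eq__ without __hash__ sets __hash__ to None,
    # so instances can no longer be used as dict keys.
    def __eq__(self, other):
        return isinstance(other, TableRef) and self.as_key() == other.as_key()

    def as_key(self):
        """Return a hashable string form that can serve as a dict key."""
        return "%s:%s.%s" % (self.project, self.dataset, self.table)


def lookup_fails(writers, destination):
    """Return True if `destination in writers` raises TypeError."""
    try:
        destination in writers
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    dest = TableRef("my-project", "my_dataset", "my_table")
    print(lookup_fails({}, dest))           # object used directly as key
    print(lookup_fails({}, dest.as_key()))  # string form used as key
```

Keying the dict on a string form of the destination is one possible way around the error. It is shown here as an illustration only, not as the fix that was applied in the PR.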
[jira] [Work logged] (BEAM-6627) Use Metrics API in IO performance tests
[ https://issues.apache.org/jira/browse/BEAM-6627?focusedWorklogId=201689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201689 ] ASF GitHub Bot logged work on BEAM-6627: Author: ASF GitHub Bot Created on: 20/Feb/19 23:40 Start Date: 20/Feb/19 23:40 Worklog Time Spent: 10m Work Description: udim commented on pull request #7772: [BEAM-6627] Added Metrics API processing time reporting to TextIOIT URL: https://github.com/apache/beam/pull/7772#discussion_r258724722
## File path: sdks/java/io/file-based-io-tests/src/test/java/org/apache/beam/sdk/io/text/TextIOIT.java ##
@@ -127,28 +140,49 @@ public void writeThenReadAll() {
     PipelineResult result = pipeline.run();
     result.waitUntilFinish();
-    publishGcsResults(result);
+    gatherAndPublishMetrics(result);
   }

-  private void publishGcsResults(PipelineResult result) {
+  private void gatherAndPublishMetrics(PipelineResult result) {
+    String uuid = UUID.randomUUID().toString();
+    Timestamp timestamp = Timestamp.now();
+    List<NamedTestResult> namedTestResults = readMetrics(result, uuid, timestamp);
+    publishToBigQuery(namedTestResults, bigQueryDataset, bigQueryTable);
+    ConsoleResultPublisher.publish(namedTestResults, uuid, timestamp.toString());
+  }
+
+  private List<NamedTestResult> readMetrics(
+      PipelineResult result, String uuid, Timestamp timestamp) {
+    List<NamedTestResult> results = new ArrayList<>();
+
+    MetricsReader reader = new MetricsReader(result, FILEIOIT_NAMESPACE);
+    long writeStartTime = reader.getStartTimeMetric("startTime");
+    long writeEndTime = reader.getEndTimeMetric("middleTime");
+    long readStartTime = reader.getStartTimeMetric("middleTime");
+    long readEndTime = reader.getEndTimeMetric("endTime");
+    double writeTime = (writeEndTime - writeStartTime) / 1000.0;
+    double readTime = (readEndTime - readStartTime) / 1000.0;
+    double copiesPerSec = calculateGcsMetric(result);
+
+    if (copiesPerSec > 0) {
+      results.add(
+          NamedTestResult.create(uuid, timestamp.toString(), "copies_per_sec", copiesPerSec));
+    }
+
+    results.add(NamedTestResult.create(uuid, timestamp.toString(), "read_time", readTime));
+    results.add(NamedTestResult.create(uuid, timestamp.toString(), "write_time", writeTime));
+
+    return results;
+  }
+
+  private double calculateGcsMetric(PipelineResult result) {
Review comment: I guess the way it was working before was by checking `if (numCopies < 0 || copyTimeMsec < 0)`, which is false if `options.getGcsPerformanceMetrics()` is false. So I think we can merge --reportGcsPerformanceMetrics and --gcsPerformanceMetrics into the latter. No need to add a separate flag, and only report copiesPerSec if --gcsPerformanceMetrics is set.
This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201689) Time Spent: 4h (was: 3h 50m) > Use Metrics API in IO performance tests > --- > > Key: BEAM-6627 > URL: https://issues.apache.org/jira/browse/BEAM-6627 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Michal Walenia >Assignee: Michal Walenia >Priority: Minor > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
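The review comment on #7772 proposes reporting `copies_per_sec` only when a single GCS metrics flag is set. The sketch below illustrates that idea together with the millisecond-to-second conversion from the diff. It is written in Python for brevity rather than the Java of the PR, and `elapsed_seconds`, `build_results` and `gcs_metrics_enabled` are hypothetical names, not part of TextIOIT.

```python
def elapsed_seconds(start_ms, end_ms):
    """Convert a pair of millisecond timestamps into elapsed seconds."""
    return (end_ms - start_ms) / 1000.0


def build_results(start_ms, middle_ms, end_ms,
                  copies_per_sec=None, gcs_metrics_enabled=False):
    """Build (name, value) pairs; the write phase ends where the read phase starts."""
    results = [
        ("write_time", elapsed_seconds(start_ms, middle_ms)),
        ("read_time", elapsed_seconds(middle_ms, end_ms)),
    ]
    # Assumption: one flag decides whether the GCS copy rate is reported at all.
    if gcs_metrics_enabled and copies_per_sec is not None:
        results.append(("copies_per_sec", copies_per_sec))
    return results


if __name__ == "__main__":
    print(build_results(1000, 3500, 4000))
    print(build_results(1000, 3500, 4000, copies_per_sec=4.0,
                        gcs_metrics_enabled=True))
```

Gating on the flag instead of on the sign of the computed value keeps "metric not requested" distinct from "metric requested but zero".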
[jira] [Work logged] (BEAM-6596) Beam Python SDK release qualification should verify supported Python 3 versions.
[ https://issues.apache.org/jira/browse/BEAM-6596?focusedWorklogId=201687&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201687 ] ASF GitHub Bot logged work on BEAM-6596: Author: ASF GitHub Bot Created on: 20/Feb/19 23:35 Start Date: 20/Feb/19 23:35 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #7914: [BEAM-6596] Use multiple interpreter versions for SDK release validation. URL: https://github.com/apache/beam/pull/7914#issuecomment-465799080 I recommend ignoring whitespace changes when reviewing this: https://github.com/apache/beam/pull/7914/files?utf8=%E2%9C%93&diff=split&w=1 Also, I am still testing this. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201687) Time Spent: 0.5h (was: 20m) > Beam Python SDK release qualification should verify supported Python 3 > versions. > > > Key: BEAM-6596 > URL: https://issues.apache.org/jira/browse/BEAM-6596 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.11.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > We likely won't get to this for 2.11, but I can't assign a 2.12 as a fix > version, setting 2.11 for now, and we can move this to 2.12 once this tag is > available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201688&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201688 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:35 Start Date: 20/Feb/19 23:35 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465799137 I don't know why postcommits on py3 are failing. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201688) Time Spent: 5h 20m (was: 5h 10m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 5h 20m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (BEAM-6616) Stager should stage Python 3 wheels for Beam SDK once they are released.
[ https://issues.apache.org/jira/browse/BEAM-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay closed BEAM-6616. - Resolution: Fixed > Stager should stage Python 3 wheels for Beam SDK once they are released. > > > Key: BEAM-6616 > URL: https://issues.apache.org/jira/browse/BEAM-6616 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Fix For: 2.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We currently assume py27/cp27mu wheel file configuration for SDK wheel files > in > [https://github.com/apache/beam/blob/3a182d64c86ad038692800f5c343659ab0b935b0/sdks/python/apache_beam/runners/portability/stager.py#L491]. > > To accommodate Python 3 versions of whl files, we need to adjust this logic. > cc: [~altay] [~ccy] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
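BEAM-6616 notes that the stager assumes a py27/cp27mu wheel configuration. A minimal sketch of deriving the tags from an interpreter version instead is shown below. It is not the logic in `stager.py`; `wheel_tags`, `wheel_suffix` and the `manylinux1_x86_64` platform default are assumptions made for illustration, covering CPython builds only.

```python
def wheel_tags(major, minor):
    """Return (python_tag, abi_tag) for a CPython wheel of the given version."""
    python_tag = "cp%d%d" % (major, minor)
    if (major, minor) == (2, 7):
        # Wide-unicode CPython 2.7 build, the configuration named in the issue.
        abi_tag = python_tag + "mu"
    elif major == 3 and minor < 8:
        # CPython 3 releases before 3.8 carry the "m" ABI flag.
        abi_tag = python_tag + "m"
    else:
        abi_tag = python_tag
    return python_tag, abi_tag


def wheel_suffix(major, minor, platform="manylinux1_x86_64"):
    """Return the tag portion of a wheel file name for this interpreter."""
    return "%s-%s-%s.whl" % (wheel_tags(major, minor) + (platform,))


if __name__ == "__main__":
    import sys
    print(wheel_suffix(2, 7))
    print(wheel_suffix(3, 5))
    print(wheel_suffix(*sys.version_info[:2]))
```

Passing `sys.version_info[:2]` selects the tags for whichever interpreter runs the staging code, so the same helper would cover both Python 2 and Python 3 submissions.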
[jira] [Work logged] (BEAM-6596) Beam Python SDK release qualification should verify supported Python 3 versions.
[ https://issues.apache.org/jira/browse/BEAM-6596?focusedWorklogId=201686&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201686 ] ASF GitHub Bot logged work on BEAM-6596: Author: ASF GitHub Bot Created on: 20/Feb/19 23:27 Start Date: 20/Feb/19 23:27 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7914: [BEAM-6596] Use multiple interpreter versions for SDK release validation. URL: https://github.com/apache/beam/pull/7914#issuecomment-465797365 R: @markflyhigh This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201686) Time Spent: 20m (was: 10m) > Beam Python SDK release qualification should verify supported Python 3 > versions. > > > Key: BEAM-6596 > URL: https://issues.apache.org/jira/browse/BEAM-6596 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We likely won't get to this for 2.11, but I can't assign a 2.12 as a fix > version, setting 2.11 for now, and we can move this to 2.12 once this tag is > available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6596) Beam Python SDK release qualification should verify supported Python 3 versions.
[ https://issues.apache.org/jira/browse/BEAM-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773506#comment-16773506 ] Valentyn Tymofieiev commented on BEAM-6596: --- [https://github.com/apache/beam/pull/7914] adds this. I am still figuring out how to test this PR. > Beam Python SDK release qualification should verify supported Python 3 > versions. > > > Key: BEAM-6596 > URL: https://issues.apache.org/jira/browse/BEAM-6596 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > We likely won't get to this for 2.11, but I can't assign a 2.12 as a fix > version, setting 2.11 for now, and we can move this to 2.12 once this tag is > available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6596) Beam Python SDK release qualification should verify supported Python 3 versions.
[ https://issues.apache.org/jira/browse/BEAM-6596?focusedWorklogId=201682&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201682 ] ASF GitHub Bot logged work on BEAM-6596: Author: ASF GitHub Bot Created on: 20/Feb/19 23:25 Start Date: 20/Feb/19 23:25 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7914: [BEAM-6596] Use multiple interpreter versions for SDK release validation. URL: https://github.com/apache/beam/pull/7914 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201682) Time Spent: 10m Remaining Estimate: 0h > Beam Python SDK release qualification should verify supported Python 3 > versions. > > > Key: BEAM-6596 > URL: https://issues.apache.org/jira/browse/BEAM-6596 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > W
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201684&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201684 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:25 Start Date: 20/Feb/19 23:25 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465796755 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201684) Time Spent: 5h (was: 4h 50m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
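A note on the TypeError in the traceback above: in Python 3, a class that defines __eq__ without also defining __hash__ gets __hash__ set to None, so its instances cannot be used as dict keys. Whether that is the exact cause for TableReference is an assumption here; the sketch below only reproduces the same kind of failure with a hypothetical stand-in class (TableRef) and shows one possible workaround, keying the dict by a string. The helper name table_key and the key format are illustrative and do not come from the PR.

```python
class TableRef:
    """Hypothetical stand-in for a message class such as TableReference."""

    def __init__(self, project, dataset, table):
        self.projectId = project
        self.datasetId = dataset
        self.tableId = table

    # Defining __eq__ without __hash__ makes Python 3 set __hash__ to None,
    # so instances cannot be dict keys or set members.
    def __eq__(self, other):
        return (self.projectId, self.datasetId, self.tableId) == (
            other.projectId, other.datasetId, other.tableId)


def table_key(ref):
    """Builds a hashable string key; the format is an illustrative choice."""
    return '%s:%s.%s' % (ref.projectId, ref.datasetId, ref.tableId)


ref = TableRef('my-project', 'my_dataset', 'my_table')
writers = {}

try:
    ref in writers  # same kind of membership test as in the traceback
    raised = False
except TypeError:
    raised = True

# Workaround sketch: key the dict by a derived string instead of the object.
writers[table_key(ref)] = 'file-writer-placeholder'
found = table_key(ref) in writers
```

Under Python 2 the same class would still be hashable by identity, which may be why a problem of this kind would only surface in the Python 3 suite.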
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201683 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:25 Start Date: 20/Feb/19 23:25 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465796755 Run Python PreCommit Issue Time Tracking --- Worklog Id: (was: 201683) Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201685&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201685 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:25 Start Date: 20/Feb/19 23:25 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465796793 Run Python PostCommit Issue Time Tracking --- Worklog Id: (was: 201685) Time Spent: 5h 10m (was: 5h)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201681 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:20 Start Date: 20/Feb/19 23:20 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465795531 Run Python PostCommit Issue Time Tracking --- Worklog Id: (was: 201681) Time Spent: 4h 40m (was: 4.5h)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201680&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201680 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 23:20 Start Date: 20/Feb/19 23:20 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465795493 Run Portable_Python PreCommit Issue Time Tracking --- Worklog Id: (was: 201680) Time Spent: 4.5h (was: 4h 20m)
[jira] [Commented] (BEAM-6616) Stager should stage Python 3 wheels for Beam SDK once they are released.
[ https://issues.apache.org/jira/browse/BEAM-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773501#comment-16773501 ] Ahmet Altay commented on BEAM-6616: --- Closing this: https://github.com/apache/beam/pull/7862 is merged to the 2.11 branch. > Stager should stage Python 3 wheels for Beam SDK once they are released. > > > Key: BEAM-6616 > URL: https://issues.apache.org/jira/browse/BEAM-6616 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Fix For: 2.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > We currently assume py27/cp27mu wheel file configuration for SDK wheel files > in > [https://github.com/apache/beam/blob/3a182d64c86ad038692800f5c343659ab0b935b0/sdks/python/apache_beam/runners/portability/stager.py#L491]. > > To accommodate Python 3 versions of whl files, we need to adjust this logic. > cc: [~altay] [~ccy] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
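To illustrate what adjusting the py27/cp27mu assumption could look like, here is a minimal sketch that derives the wheel tags from the interpreter version instead of hard-coding them. This is not the logic in stager.py or in the merged PR; the function names, the default platform tag, and the ABI suffix rules (wide-unicode "mu" for CPython 2.7 on Linux, "m" for CPython 3.x before 3.8, no suffix from 3.8 on) are assumptions made for the example.

```python
def cpython_wheel_tags(major, minor):
    """Returns (python_tag, abi_tag) for a typical Linux CPython build.

    Assumed suffix rules: 'mu' for 2.x, 'm' for 3.x before 3.8, none after.
    """
    python_tag = 'cp%d%d' % (major, minor)
    if major == 2:
        abi_tag = python_tag + 'mu'
    elif (major, minor) < (3, 8):
        abi_tag = python_tag + 'm'
    else:
        abi_tag = python_tag
    return python_tag, abi_tag


def sdk_wheel_name(version, major, minor, platform='manylinux1_x86_64'):
    """Builds a wheel file name of the form name-version-python-abi-platform."""
    python_tag, abi_tag = cpython_wheel_tags(major, minor)
    return 'apache_beam-%s-%s-%s-%s.whl' % (
        version, python_tag, abi_tag, platform)


py2_name = sdk_wheel_name('2.11.0', 2, 7)
py3_name = sdk_wheel_name('2.11.0', 3, 5)
```

A caller would typically pass sys.version_info[:2] of the interpreter that submits the pipeline, so that the staged wheel matches the worker's Python version.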
[jira] [Work logged] (BEAM-6714) Move runner-agnostic code out of FlinkJobServerDriver
[ https://issues.apache.org/jira/browse/BEAM-6714?focusedWorklogId=201677&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201677 ] ASF GitHub Bot logged work on BEAM-6714: Author: ASF GitHub Bot Created on: 20/Feb/19 23:12 Start Date: 20/Feb/19 23:12 Worklog Time Spent: 10m Work Description: ibzib commented on issue #7907: [BEAM-6714] Move runner-agnostic code out of FlinkJobServerDriver URL: https://github.com/apache/beam/pull/7907#issuecomment-465793493 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201677) Time Spent: 20m (was: 10m) > Move runner-agnostic code out of FlinkJobServerDriver > - > > Key: BEAM-6714 > URL: https://issues.apache.org/jira/browse/BEAM-6714 > Project: Beam > Issue Type: Task > Components: runner-flink, runner-samza, runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > [FlinkJobServerDriver|https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java] > contains quite a bit of code that is not actually specific to the Flink > runner. This runner-agnostic code should be shared so that other runners (ie > Spark) developing portability can leverage it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6714) Move runner-agnostic code out of FlinkJobServerDriver
[ https://issues.apache.org/jira/browse/BEAM-6714?focusedWorklogId=201678&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201678 ] ASF GitHub Bot logged work on BEAM-6714: Author: ASF GitHub Bot Created on: 20/Feb/19 23:17 Start Date: 20/Feb/19 23:17 Worklog Time Spent: 10m Work Description: ibzib commented on issue #7907: [BEAM-6714] Move runner-agnostic code out of FlinkJobServerDriver URL: https://github.com/apache/beam/pull/7907#issuecomment-465793493 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201678) Time Spent: 0.5h (was: 20m) > Move runner-agnostic code out of FlinkJobServerDriver > - > > Key: BEAM-6714 > URL: https://issues.apache.org/jira/browse/BEAM-6714 > Project: Beam > Issue Type: Task > Components: runner-flink, runner-samza, runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > [FlinkJobServerDriver|https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java] > contains quite a bit of code that is not actually specific to the Flink > runner. This runner-agnostic code should be shared so that other runners (ie > Spark) developing portability can leverage it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201675 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 20/Feb/19 23:04 Start Date: 20/Feb/19 23:04 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465791397 OK, added the next couple adapters and eliminated use of rawtypes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201675) Time Spent: 40m (was: 0.5h) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 40m > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-6720 started by Kenneth Knowles. - > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 40m > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK
[ https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=201670&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201670 ] ASF GitHub Bot logged work on BEAM-5638: Author: ASF GitHub Bot Created on: 20/Feb/19 22:56 Start Date: 20/Feb/19 22:56 Worklog Time Spent: 10m Work Description: jklukas commented on issue #7736: [BEAM-5638] Exception handling for Java MapElements and FlatMapElements URL: https://github.com/apache/beam/pull/7736#issuecomment-465789311 This is now consistently using "failure" terminology in general and "exception" in cases that specifically refer to exceptions. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201670) Time Spent: 8h 50m (was: 8h 40m) Remaining Estimate: 159h 10m (was: 159h 20m) > Add exception handling to single message transforms in Java SDK > --- > > Key: BEAM-5638 > URL: https://issues.apache.org/jira/browse/BEAM-5638 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Jeff Klukas >Assignee: Jeff Klukas >Priority: Minor > Labels: triaged > Original Estimate: 168h > Time Spent: 8h 50m > Remaining Estimate: 159h 10m > > Add methods to MapElements, FlatMapElements, and Filter that allow users to > specify expected exceptions and tuple tags to associate with the > collections of the successfully and unsuccessfully processed elements. > See discussion on dev list: > https://lists.apache.org/thread.html/936ed2a5f2c01be066fd903abf70130625e0b8cf4028c11b89b8b23f@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
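The idea in the issue above, routing elements that raise an expected exception to a separate failure collection, can be sketched in plain Python. This is only an illustration of the pattern and not the Java API proposed in the PR; map_with_failures and its parameters are hypothetical names, and plain lists stand in for the tagged output collections.

```python
def map_with_failures(fn, elements, expected_exceptions):
    """Applies fn to each element and separates successes from failures.

    Only the listed exception types are caught; anything else propagates,
    so unexpected errors still fail loudly.
    """
    successes, failures = [], []
    for element in elements:
        try:
            successes.append(fn(element))
        except expected_exceptions as exc:
            # Keep the original input next to the exception for later triage.
            failures.append((element, exc))
    return successes, failures


successes, failures = map_with_failures(int, ['1', 'x', '3'], (ValueError,))
```

Keeping the failing input alongside the exception is a design choice of this sketch; it lets a downstream step log or reprocess the bad elements.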
[jira] [Work logged] (BEAM-6443) decrease the number of threads for BigQuery streaming insertAll
[ https://issues.apache.org/jira/browse/BEAM-6443?focusedWorklogId=201667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201667 ] ASF GitHub Bot logged work on BEAM-6443: Author: ASF GitHub Bot Created on: 20/Feb/19 22:45 Start Date: 20/Feb/19 22:45 Worklog Time Spent: 10m Work Description: ihji commented on issue #7547: [BEAM-6443] decrease the number of thread for BigQuery streaming inse… URL: https://github.com/apache/beam/pull/7547#issuecomment-465786164 @reuvenlax Friendly ping for additional feedback. I hope we can close this PR soon. It has been opened for a month now :sob: This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201667) Time Spent: 3.5h (was: 3h 20m) > decrease the number of threads for BigQuery streaming insertAll > --- > > Key: BEAM-6443 > URL: https://issues.apache.org/jira/browse/BEAM-6443 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Labels: triaged > Time Spent: 3.5h > Remaining Estimate: 0h > > When inserting (a large number of) very small elements into BigQuery via > streaming insertAll, BigQueryIO causes lots of quota exceeded errors. This > implies that 1) BigQueryIO puts unnecessary overhead on the BigQuery API layer > by sending requests too fast 2) the log file becomes very big because of repeated > same error messages. Currently we use 50 shards for writing data into > BigQuery and in each bundle 20-30 futures are executed simultaneously with > unlimited thread pool. It would be worth investigating whether just a single > thread pool is sufficient for running concurrent insertAll. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
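The effect of replacing an unlimited thread pool with a bounded one can be shown with a small Python sketch. It is not the BigQueryIO code (which is Java); fake_insert_all is a placeholder that sleeps instead of calling any API, and MAX_WORKERS is an illustrative cap, not a value taken from the issue or the PR.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4  # illustrative cap, not a value from the issue

_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_insert_all(batch):
    """Placeholder for a streaming insert request; records concurrency."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # stands in for the network round trip
    with _lock:
        _active -= 1
    return len(batch)


# 30 small batches, similar in spirit to the 20-30 futures per bundle.
batches = [[i] * 3 for i in range(30)]
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    inserted = sum(pool.map(fake_insert_all, batches))
```

With the cap in place, no more than MAX_WORKERS requests are in flight at once, which limits the request rate seen by the backend at the cost of some added latency per bundle.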
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201668 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 20/Feb/19 22:50 Start Date: 20/Feb/19 22:50 Worklog Time Spent: 10m Work Description: jklukas commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465787552 [#7160 also changed the signatures of TypeDescriptors.inputOf](https://github.com/apache/beam/pull/7160/files#diff-bb65d6d39ff34da1fa01173bf54ed1f5R404) and .outputOf. Would they also need adapters? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201668) Time Spent: 0.5h (was: 20m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201666 ] ASF GitHub Bot logged work on BEAM-6720: Author: ASF GitHub Bot Created on: 20/Feb/19 22:44 Start Date: 20/Feb/19 22:44 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads URL: https://github.com/apache/beam/pull/7911#issuecomment-465785652 R: @jklukas @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201666) Time Spent: 20m (was: 10m) > Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 > - > > Key: BEAM-6720 > URL: https://issues.apache.org/jira/browse/BEAM-6720 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.10.0 >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > Fix For: 2.11.0 > > Time Spent: 20m > Remaining Estimate: 0h > > In https://github.com/apache/beam/pull/7160 > {{MapElements.via(SimpleFunction)}} was removed and replaced with > {{MapElements.via(InferableFunction)}}. > This is compatible with a recompile, but loses binary compatibility because > the needed method signature is missing. I believe a pass-through method with > the needed signature can be added to restore binary compatibility. > CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201665 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 22:43 Start Date: 20/Feb/19 22:43 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258698399 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -831,7 +839,7 @@ def expand(self, pcoll): self.table_reference.projectId = pcoll.pipeline.options.view_as( GoogleCloudOptions).project -if standard_options.streaming: +if standard_options.streaming or self.method == 'STREAMING_INSERTS': Review comment: Should we print a warning or fail if a user is explicitly setting FILE_LOADS in a Streaming pipeline? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201665) Time Spent: 4h 20m (was: 4h 10m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 727, in process
>     return self.do_fn_invoker.invoke_process(windowed_value)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 556, in invoke_process
>     windowed_value, additional_args, additional_kwargs, output_processor)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 622, in _invoke_per_window
>     self.process_method(*args_for_process, **kwargs_for_process))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 823, in process_outputs
>     for result in results:
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", line 191, in process
>     if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
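The review question in this message (whether to warn or fail when a user explicitly sets FILE_LOADS in a streaming pipeline) can be sketched as a small standalone check. This is only an illustration, not code from PR #7892: the function name `resolve_write_method`, the behaviour assumed for DEFAULT, and the choice to raise `ValueError` are all assumptions. Only the method names STREAMING_INSERTS, FILE_LOADS and DEFAULT come from the thread.

```python
# Hypothetical stand-in for the method selection discussed in the review;
# this is not the actual WriteToBigQuery implementation.
STREAMING_INSERTS = 'STREAMING_INSERTS'
FILE_LOADS = 'FILE_LOADS'
DEFAULT = 'DEFAULT'


def resolve_write_method(method, is_streaming):
  """Returns the write method to use, failing on an unsupported combination."""
  if method not in (STREAMING_INSERTS, FILE_LOADS, DEFAULT):
    raise ValueError('Unknown write method: %r' % method)
  if method == DEFAULT:
    # Assumption: DEFAULT picks streaming inserts for streaming pipelines
    # and file loads otherwise.
    return STREAMING_INSERTS if is_streaming else FILE_LOADS
  if method == FILE_LOADS and is_streaming:
    # Failing loudly is one of the two options raised in the review;
    # logging a warning and falling back would be the other.
    raise ValueError(
        'FILE_LOADS was requested explicitly in a streaming pipeline.')
  return method
```

Raising keeps the misconfiguration visible at pipeline construction time; a warning would be the gentler alternative if existing pipelines are expected to rely on the combination.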
[jira] [Work logged] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
[ https://issues.apache.org/jira/browse/BEAM-6720?focusedWorklogId=201663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201663 ]

ASF GitHub Bot logged work on BEAM-6720:

Author: ASF GitHub Bot
Created on: 20/Feb/19 22:42
Start Date: 20/Feb/19 22:42
Worklog Time Spent: 10m
Work Description: kennknowles commented on pull request #7911: [BEAM-6720] Add binary compatibility adapters for ProcessFunction/InferableFunction overloads
URL: https://github.com/apache/beam/pull/7911

Because method overloads are resolved statically, 2.10.0 is compile compatible but not binary compatible with 2.9.0. This introduces adapters where needed to restore binary compatibility.

Testing will be by hand and one-off, as I'm not sure what infrastructure would be needed to test these automatically.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201662 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 22:42 Start Date: 20/Feb/19 22:42 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258698399 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -831,7 +839,7 @@ def expand(self, pcoll): self.table_reference.projectId = pcoll.pipeline.options.view_as( GoogleCloudOptions).project -if standard_options.streaming: +if standard_options.streaming or self.method == 'STREAMING_INSERTS': Review comment: Should we print a warning or fail if a user is explicitly setting FILE_LOADS in a Batch pipeline? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201662) Time Spent: 4h (was: 3h 50m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 727, in process
>     return self.do_fn_invoker.invoke_process(windowed_value)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 556, in invoke_process
>     windowed_value, additional_args, additional_kwargs, output_processor)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 622, in _invoke_per_window
>     self.process_method(*args_for_process, **kwargs_for_process))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 823, in process_outputs
>     for result in results:
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", line 191, in process
>     if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201660 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 22:42 Start Date: 20/Feb/19 22:42 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258695330 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py ## @@ -50,9 +50,10 @@ def run_bq_pipeline(argv=None): help='Output BQ table to write results to.') parser.add_argument('--kms_key', default=None, help='Use this Cloud KMS key with BigQuery.') - parser.add_argument('--gs_location', + parser.add_argument('--gcs_location', default=None, - help='GCS bucket location to use to store files.') + help=('GCS bucket location to use to store files for ' Review comment: nit: We can s/GCS bucket location/GCS bucket/ in the help string to avoid confusion with GCS's concept of bucket location: https://cloud.google.com/storage/docs/locations. Perhaps `bq_temp_location` would be a better name? Also, there are other places in this pr where we still use 'gs_location' - just making sure that's intentional. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201660) Time Spent: 3h 40m (was: 3.5h) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> --
>
> Key: BEAM-6711
> URL: https://issues.apache.org/jira/browse/BEAM-6711
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Pablo Estrada
> Priority: Major
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> First failure was observed in https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 was merged.
> [~pabloem], could you please take a look? I suggest we do a rollback + rollforward with a fix.
> {noformat}
> root: ERROR: Exception at bundle
> ,
> due to an exception.
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 727, in process
>     return self.do_fn_invoker.invoke_process(windowed_value)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 556, in invoke_process
>     windowed_value, additional_args, additional_kwargs, output_processor)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 622, in _invoke_per_window
>     self.process_method(*args_for_process, **kwargs_for_process))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 823, in process_outputs
>     for result in results:
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", line 191, in process
>     if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201661&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201661 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 22:42 Start Date: 20/Feb/19 22:42 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258697638 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -727,6 +729,11 @@ def __init__(self, loads into BigQuery. By default, this will use the pipeline's temp_location, but for pipelines whose temp_location is not appropriate for BQ File Loads, users should pass a specific one. + method: The method to use to write to BigQuery. It may be +STREAMING_INSERTS, FILE_LOADS, or DEFAULT. An intoduction on loading Review comment: nit: introduction This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201661) Time Spent: 3h 50m (was: 3h 40m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. 
> {noformat}
> root: ERROR: Exception at bundle
> ,
> due to an exception.
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 727, in process
>     return self.do_fn_invoker.invoke_process(windowed_value)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 556, in invoke_process
>     windowed_value, additional_args, additional_kwargs, output_processor)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 622, in _invoke_per_window
>     self.process_method(*args_for_process, **kwargs_for_process))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 823, in process_outputs
>     for result in results:
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", line 191, in process
>     if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201664 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 22:42 Start Date: 20/Feb/19 22:42 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258698399 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -831,7 +839,7 @@ def expand(self, pcoll): self.table_reference.projectId = pcoll.pipeline.options.view_as( GoogleCloudOptions).project -if standard_options.streaming: +if standard_options.streaming or self.method == 'STREAMING_INSERTS': Review comment: Should we print a warning or fail if a user is explicitly setting FILE_LOADS in a Batch pipeline? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201664) Time Spent: 4h 10m (was: 4h) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 727, in process
>     return self.do_fn_invoker.invoke_process(windowed_value)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 556, in invoke_process
>     windowed_value, additional_args, additional_kwargs, output_processor)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 622, in _invoke_per_window
>     self.process_method(*args_for_process, **kwargs_for_process))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 823, in process_outputs
>     for result in results:
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", line 191, in process
>     if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6638) Python ExternalTransform output mismatched
[ https://issues.apache.org/jira/browse/BEAM-6638?focusedWorklogId=201659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201659 ] ASF GitHub Bot logged work on BEAM-6638: Author: ASF GitHub Bot Created on: 20/Feb/19 22:35 Start Date: 20/Feb/19 22:35 Worklog Time Spent: 10m Work Description: ihji commented on issue #7792: [BEAM-6638] Python ExternalTransform output mismatched URL: https://github.com/apache/beam/pull/7792#issuecomment-465783103 @robertwb waiting for additional review. Also https://github.com/apache/beam/pull/7845 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201659) Time Spent: 2h 10m (was: 2h) > Python ExternalTransform output mismatched > -- > > Key: BEAM-6638 > URL: https://issues.apache.org/jira/browse/BEAM-6638 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Java PipelineValidator prints out the error that an expanded external > transform points to unknown output PCollection. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201658 ]

ASF GitHub Bot logged work on BEAM-4775:

Author: ASF GitHub Bot
Created on: 20/Feb/19 22:32
Start Date: 20/Feb/19 22:32
Worklog Time Spent: 10m
Work Description: ajamato commented on pull request #7868: [BEAM-4775] MonitoringInfo URN tweaks
URL: https://github.com/apache/beam/pull/7868#discussion_r258706115

## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ##
@@ -339,14 +339,16 @@ message MonitoringInfoSpecs {
   enum Enum {
     // TODO(ajamato): Add the PTRANSFORM name as a required label after
     // upgrading the python SDK.
-    USER_COUNTER = 0 [(monitoring_info_spec) = {
+    USER_METRIC = 0 [(monitoring_info_spec) = {

Review comment:
@Ardagan and I have discussed that it's not really the best idea to make the URN a prefix for the user metric. This has led to writing a lot of code that parses the URN to extract those fields, and because the user metric is this odd exception, all of that code has to special-case it.

It would be better to package the namespace and name as labels on the MonitoringInfo. Then we could use the same URN everywhere: no parsing, no special casing.

Given this, there are two choices:

(1) Use a different URN for user gauge, counter and distribution:
  - "beam:metric:user:counter:v1"
  - "beam:metric:user:gauge:v1"
  - "beam:metric:user:distribution:v1"
Then change the MonitoringInfoSpec:
  required_labels: [ "PTRANSFORM", "NAMESPACE", "NAME" ],
and each will have a separate type_urn in its spec.
  type_urn:

(2) Use a single URN "beam:metric:user:v1" for user gauge, counter and distribution. Then change the MonitoringInfoSpec as well:
  required_labels: [ "PTRANSFORM", "NAMESPACE", "NAME" ],
and we will NOT enforce the type_urn in the spec.

I prefer #1, as this will better describe each one, and if we add new user metric styles we can add a spec for them.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 201658)
Time Spent: 18h 10m (was: 18h)

> JobService should support returning metrics
> ---
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
> Issue Type: Bug
> Components: beam-model
> Reporter: Eugene Kirpichov
> Assignee: Ryan Williams
> Priority: Major
> Labels: triaged
> Time Spent: 18h 10m
> Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] currently doesn't appear to have a way for JobService to return metrics to a user, even though [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] includes support for reporting SDK metrics to the runner harness.
>
> Metrics are apparently necessary to run any ValidatesRunner tests because PAssert needs to validate that the assertions succeeded. However, this statement should be double-checked: perhaps it's possible to somehow work with PAssert without metrics support.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
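The difference between encoding a user metric's namespace and name in the URN and carrying them as labels can be sketched as follows. This is an illustration only: a MonitoringInfo is modelled as a plain dict, the prefix layout `beam:metric:user:<namespace>:<name>` is an assumed example of the prefix style being criticised, and the helper names are hypothetical. The URN "beam:metric:user:counter:v1" and the label names are taken from option (1) in the comment.

```python
# Assumed prefix layout, for illustration only.
USER_METRIC_PREFIX = 'beam:metric:user:'
# URN proposed in option (1) of the review comment.
USER_COUNTER_URN = 'beam:metric:user:counter:v1'


def name_from_prefixed_urn(urn):
  """Prefix style: namespace and name must be parsed back out of the URN."""
  if not urn.startswith(USER_METRIC_PREFIX):
    raise ValueError('Not a user metric URN: %s' % urn)
  namespace, _, name = urn[len(USER_METRIC_PREFIX):].partition(':')
  return namespace, name


def name_from_labels(monitoring_info):
  """Label style: the URN is fixed and the fields are read directly."""
  labels = monitoring_info['labels']
  return labels['NAMESPACE'], labels['NAME']


prefixed = {'urn': 'beam:metric:user:my.namespace:my_counter', 'labels': {}}
labelled = {
    'urn': USER_COUNTER_URN,
    'labels': {
        'PTRANSFORM': 'MyStep',
        'NAMESPACE': 'my.namespace',
        'NAME': 'my_counter',
    },
}

parsed = name_from_prefixed_urn(prefixed['urn'])
direct = name_from_labels(labelled)
```

With labels, every consumer can match on one fixed URN and never needs string parsing, which is the simplification the comment argues for.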
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201654 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 22:28 Start Date: 20/Feb/19 22:28 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465781105 Run Python PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201654) Time Spent: 3.5h (was: 3h 20m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 727, in process
>     return self.do_fn_invoker.invoke_process(windowed_value)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 556, in invoke_process
>     windowed_value, additional_args, additional_kwargs, output_processor)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 622, in _invoke_per_window
>     self.process_method(*args_for_process, **kwargs_for_process))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", line 823, in process_outputs
>     for result in results:
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", line 191, in process
>     if destination in self._destination_to_file_writer:
> TypeError: unhashable type: 'TableReference'
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
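For readers unfamiliar with this error: in Python 3, a class that defines `__eq__` without also defining `__hash__` has its `__hash__` set to `None`, so its instances cannot be used as dict keys or in `in` checks against a dict. Python 2 kept the inherited hash, which is one plausible reason the failure shows up only in the Python 3 suite. The sketch below reproduces the error with a stand-in class. `FakeTableReference` and `table_key` are hypothetical names, and whether the real `TableReference` fails for exactly this reason is an assumption, not something stated in the thread.

```python
class FakeTableReference(object):
  """Hypothetical stand-in: defines __eq__ but not __hash__."""

  def __init__(self, project_id, dataset_id, table_id):
    self.projectId = project_id
    self.datasetId = dataset_id
    self.tableId = table_id

  def __eq__(self, other):
    return (isinstance(other, FakeTableReference) and
            (self.projectId, self.datasetId, self.tableId) ==
            (other.projectId, other.datasetId, other.tableId))


def table_key(ref):
  """Builds a hashable string key (one possible workaround)."""
  return '%s:%s.%s' % (ref.projectId, ref.datasetId, ref.tableId)


destination = FakeTableReference('my-project', 'my_dataset', 'my_table')
writers = {}

try:
  destination in writers  # Same shape as the failing check in the traceback.
  unhashable = False
except TypeError:
  unhashable = True  # Python 3 raises here because the object has no hash.

# Keying the dict by a string avoids hashing the object itself.
writers[table_key(destination)] = 'file-writer-placeholder'
found = table_key(destination) in writers
```

Defining `__hash__` consistently with `__eq__` on the class would be the other common fix, when the class is under the project's control.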
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201653&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201653 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 20/Feb/19 22:23 Start Date: 20/Feb/19 22:23 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #7876: [BEAM-4775] Clean up metric protos; support integer distributions, gauges URL: https://github.com/apache/beam/pull/7876#discussion_r258702806 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -501,48 +501,9 @@ message MonitoringInfoTypeUrns { message Metric { // (Required) The data for this metric. oneof data { -CounterData counter_data = 1; -DistributionData distribution_data = 2; -ExtremaData extrema_data = 3; - } -} - -// Data associated with a Counter or Gauge metric. -// This is designed to be compatible with metric collection -// systems such as DropWizard. -message CounterData { - oneof value { -int64 int64_value = 1; -double double_value = 2; -string string_value = 3; - } -} - -// Extrema messages are used for calculating -// Top-N/Bottom-N metrics. -message ExtremaData { - oneof extrema { -IntExtremaData int_extrema_data = 1; -DoubleExtremaData double_extrema_data = 2; - } -} - -message IntExtremaData { - repeated int64 int_values = 1; -} - -message DoubleExtremaData { - repeated double double_values = 2; -} - -// Data associated with a distribution metric. -// This is based off of the current DistributionData metric. -// This is not a stackdriver or dropwizard compatible -// style of distribution metric. 
-message DistributionData { - oneof distribution { -IntDistributionData int_distribution_data = 1; -DoubleDistributionData double_distribution_data = 2; +int64 counter = 1; +IntDistributionData distribution = 2; +IntGaugeData gauge = 3; Review comment: One thing worth pointing out here: we never introduced a separate gauge field before because a gauge only holds an int64, the same as a counter, so we didn't include both as separate fields in the oneof. (Note that you don't need the timestamp inside IntGaugeData, as it's already in the MonitoringInfo proto.) You would just specify how the value is aggregated using the MonitoringInfo type field. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201653) Time Spent: 18h (was: 17h 50m) > JobService should support returning metrics > --- > > Key: BEAM-4775 > URL: https://issues.apache.org/jira/browse/BEAM-4775 > Project: Beam > Issue Type: Bug > Components: beam-model >Reporter: Eugene Kirpichov >Assignee: Ryan Williams >Priority: Major > Labels: triaged > Time Spent: 18h > Remaining Estimate: 0h > > [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] > currently doesn't appear to have a way for JobService to return metrics to a > user, even though > [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] > includes support for reporting SDK metrics to the runner harness. > > Metrics are apparently necessary to run any ValidatesRunner tests because > PAssert needs to validate that the assertions succeeded. However, this > statement should be double-checked: perhaps it's possible to somehow work > with PAssert without metrics support. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
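The review comment above argues that a counter and a gauge can share one int64 payload, with the MonitoringInfo type field selecting how values are aggregated. A minimal Python sketch of that idea follows; the `Info` tuple and both type strings are hypothetical stand-ins chosen for illustration, not the actual proto or its URNs.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a MonitoringInfo: an int64 payload,
# a timestamp carried on the envelope, and a type string selecting aggregation.
Info = namedtuple('Info', ['type', 'timestamp', 'value'])

SUM_TYPE = 'example:sum_int64'        # illustrative name, counter-style
LATEST_TYPE = 'example:latest_int64'  # illustrative name, gauge-style


def combine(a, b):
    """Merge two reports of the same metric according to their shared type."""
    if a.type != b.type:
        raise ValueError('cannot combine different types')
    if a.type == SUM_TYPE:
        # Counter semantics: values add up.
        return Info(a.type, max(a.timestamp, b.timestamp), a.value + b.value)
    if a.type == LATEST_TYPE:
        # Gauge semantics: the most recent report wins.
        return a if a.timestamp >= b.timestamp else b
    raise ValueError('unsupported type: %s' % a.type)


counter = combine(Info(SUM_TYPE, 10, 3), Info(SUM_TYPE, 20, 4))
gauge = combine(Info(LATEST_TYPE, 10, 3), Info(LATEST_TYPE, 20, 4))
```

The payload shape is identical in both calls; only the type string changes the result, which is the reason given above for not adding a separate gauge field to the oneof.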
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201652&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201652 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 20/Feb/19 22:20 Start Date: 20/Feb/19 22:20 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #7876: [BEAM-4775] Clean up metric protos; support integer distributions, gauges URL: https://github.com/apache/beam/pull/7876#discussion_r258702084 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -501,48 +501,9 @@ message MonitoringInfoTypeUrns { message Metric { // (Required) The data for this metric. oneof data { -CounterData counter_data = 1; -DistributionData distribution_data = 2; -ExtremaData extrema_data = 3; - } -} - -// Data associated with a Counter or Gauge metric. -// This is designed to be compatible with metric collection -// systems such as DropWizard. -message CounterData { - oneof value { -int64 int64_value = 1; -double double_value = 2; -string string_value = 3; - } -} - -// Extrema messages are used for calculating -// Top-N/Bottom-N metrics. -message ExtremaData { - oneof extrema { -IntExtremaData int_extrema_data = 1; -DoubleExtremaData double_extrema_data = 2; - } -} - -message IntExtremaData { - repeated int64 int_values = 1; -} - -message DoubleExtremaData { - repeated double double_values = 2; -} - -// Data associated with a distribution metric. -// This is based off of the current DistributionData metric. -// This is not a stackdriver or dropwizard compatible -// style of distribution metric. -message DistributionData { - oneof distribution { -IntDistributionData int_distribution_data = 1; -DoubleDistributionData double_distribution_data = 2; +int64 counter = 1; Review comment: @robertwb I'd like to push back a bit about a significant change, where we use opaque bytes payloads. 
I am okay with removing the layers in this proto and using the design Ryan has proposed here, but not with an extensive rewrite at this stage of the game, which would slow down progress significantly. We already have protos for the MonitoringInfoTable. That is the extensible format, where you can store basically anything, and the producer and consumer of the proto can be the only ones who need to understand it (it could be passed through the RunnerHarness): https://github.com/apache/beam/blob/197a06852d27c5baea6a4c65894a8841672af7b6/model/fn-execution/src/main/proto/beam_fn_api.proto#L583 But we chose to have separate protos for the common, well-known metric formats (metrics being a well-defined concept shared by many systems, i.e. a timeseries of data): counter, gauge, distribution, etc. Additionally, if you wish to modify these formats significantly at a future stage, you can always:
- Add to these protos without deleting
- Change the URN version number of a metric
- Upgrade all SDKs to package the metric with a format for both the previous and the new version number of the URN
- Upgrade all Runners that wish to support it
Upgrading the URN version is the safe way to do this without introducing a breaking change, as consumers of MonitoringInfos can freely choose which URNs+versions they support. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201652) Time Spent: 17h 50m (was: 17h 40m) > JobService should support returning metrics > --- > > Key: BEAM-4775 > URL: https://issues.apache.org/jira/browse/BEAM-4775 > Project: Beam > Issue Type: Bug > Components: beam-model >Reporter: Eugene Kirpichov >Assignee: Ryan Williams >Priority: Major > Labels: triaged > Time Spent: 17h 50m > Remaining Estimate: 0h > > [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] > currently doesn't appear to have a way for JobService to return metrics to a > user, even though > [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] > includes support for reporting SDK metrics to the runner harness. > > Metrics are apparently necessary to run any ValidatesRunner tests because > PAssert needs to validate that the assertions succeeded. However, this > statement should be double-checked: perhaps it's possible to somehow work >
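The versioning argument in the comment above (bump the URN version, have SDKs report both versions, let each consumer pick what it understands) can be sketched roughly as follows. All URNs, payload layouts, and function names here are hypothetical and exist only to illustrate the idea; they are not taken from the Beam protos.

```python
# Hypothetical URNs and payload layouts, for illustration only.
def decode_v1(payload):
    return {'count': payload['count']}


def decode_v2(payload):
    return {'count': payload['count'], 'sum': payload['sum']}


SUPPORTED = {
    'example:metric:element_count:v1': decode_v1,
    'example:metric:element_count:v2': decode_v2,
}


def consume(infos, supported):
    """Decode entries with a known URN; skip those the consumer doesn't know."""
    decoded, skipped = [], []
    for urn, payload in infos:
        decoder = supported.get(urn)
        if decoder is None:
            skipped.append(urn)
        else:
            decoded.append((urn, decoder(payload)))
    return decoded, skipped


# An SDK that packages the metric under both the previous and the new version.
reported = [
    ('example:metric:element_count:v1', {'count': 5}),
    ('example:metric:element_count:v2', {'count': 5, 'sum': 40}),
]

# A runner that has not been upgraded only registers the v1 decoder.
old_runner = {k: v for k, v in SUPPORTED.items() if k.endswith(':v1')}
old_decoded, old_skipped = consume(reported, old_runner)
new_decoded, new_skipped = consume(reported, SUPPORTED)
```

In this sketch the older consumer keeps working because it ignores the URN it does not recognize, which is the non-breaking property the comment relies on.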
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201637 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 21:55 Start Date: 20/Feb/19 21:55 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258684590 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -449,6 +450,10 @@ def __init__(self, table, dataset=None, project=None, schema=None, match the expected format. """ +import warnings Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201637) Time Spent: 2h 50m (was: 2h 40m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
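For readers unfamiliar with this class of failure: in Python 3, a class that defines `__eq__` without also defining `__hash__` becomes unhashable, so its instances cannot be used as dict keys or in membership tests against a dict. The sketch below reproduces the same error shape with a hypothetical `TableRef` class and shows one possible workaround (keying on a derived string). It assumes this is the mechanism behind the traceback above; the class, the helper `hashable_key`, and the key format are illustrative and are not the Beam code or the fix that was merged.

```python
class TableRef:
    """Hypothetical stand-in for a message class such as TableReference."""

    def __init__(self, project, dataset, table):
        self.project = project
        self.dataset = dataset
        self.table = table

    # Defining __eq__ without __hash__ makes instances unhashable in Python 3.
    def __eq__(self, other):
        return (isinstance(other, TableRef)
                and (self.project, self.dataset, self.table)
                == (other.project, other.dataset, other.table))


def hashable_key(ref):
    """Hypothetical helper: derive a string key usable in dicts and sets."""
    return '%s:%s.%s' % (ref.project, ref.dataset, ref.table)


ref = TableRef('my-project', 'my_dataset', 'my_table')
writers = {}

try:
    ref in writers  # same shape as the failing membership test
    failed = False
    message = ''
except TypeError as exc:
    failed = True
    message = str(exc)

# Keying on a plain string avoids hashing the object itself.
writers[hashable_key(ref)] = 'file-writer'
found = hashable_key(ref) in writers
```

Under Python 2 the same class would still inherit the default identity-based hash, which may explain why the failure only surfaced in the Python 3 suite.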
[jira] [Work logged] (BEAM-6512) [beam_PreCommit_Java_Cron] [GrpcDataServiceTest] Flake, Multiplexer hanging up
[ https://issues.apache.org/jira/browse/BEAM-6512?focusedWorklogId=201641&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201641 ] ASF GitHub Bot logged work on BEAM-6512: Author: ASF GitHub Bot Created on: 20/Feb/19 21:56 Start Date: 20/Feb/19 21:56 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #7794: [BEAM-6512] Run GrpcDataServiceTest clients on the main thread URL: https://github.com/apache/beam/pull/7794#issuecomment-465770710 Do you think this is ok @robertwb? FYI I also wrote up #7784 which just moves the countdown latch synchronization into the onNext function. I was wary of that approach though since it requires that exactly three elements are received across all the clients, and I thought receiving n != 3 elements could be a failure mode we care about. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201641) Time Spent: 2.5h (was: 2h 20m) > [beam_PreCommit_Java_Cron] [GrpcDataServiceTest] Flake, Multiplexer hanging up > -- > > Key: BEAM-6512 > URL: https://issues.apache.org/jira/browse/BEAM-6512 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Daniel Oliveira >Assignee: Brian Hulette >Priority: Minor > Labels: currently-failing, flake, triaged > Time Spent: 2.5h > Remaining Estimate: 0h > > _Use this form to file an issue for test failure:_ > * [Jenkins Job|https://builds.apache.org/job/beam_PreCommit_Java_Cron/869/] > * [Gradle Build Scan|https://scans.gradle.com/s/wodzocvegyy5a] > * [Test source > code|https://github.com/apache/beam/blob/f560edc5a4e38cb13d41718540271ae79d7d00ee/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/data/GrpcDataServiceTest.java#L105] > Initial investigation: > I see this message: > 
{noformat} > org.apache.beam.runners.fnexecution.data.GrpcDataServiceTest > > testMessageReceivedBySingleClientWhenThereAreMultipleClients FAILED > java.lang.AssertionError at GrpcDataServiceTest.java:105 > {noformat} > And this message: > {noformat} > [grpc-default-executor-2] WARN > org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer - Hanged up for unknown > endpoint. > Jan 25, 2019 6:16:02 PM > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor run > SEVERE: Exception while executing runnable > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed@41415b8e > java.lang.RuntimeException: > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusRuntimeException: > CANCELLED: Multiplexer hanging up > at > org.apache.beam.sdk.fn.test.TestStreams.lambda$throwingErrorHandler$0(TestStreams.java:95) > at > org.apache.beam.sdk.fn.test.TestStreams$ForwardingCallStreamObserver.onError(TestStreams.java:144) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:420) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) > at > 
org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:403) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459) > at > org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ClientCallImpl.access$300(ClientCallImp
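The concern raised in the comment above, that counting down a latch inside `onNext` ties the test to exactly three elements, can be illustrated with a small Python sketch. The `CountDownLatch` class below is a hypothetical minimal stand-in for Java's `java.util.concurrent.CountDownLatch`, and `run` is an illustrative driver, not the test under discussion.

```python
import threading


class CountDownLatch:
    """Minimal latch; a hypothetical stand-in for the Java class."""

    def __init__(self, count):
        self._count = count
        self._cond = threading.Condition()

    def count_down(self):
        with self._cond:
            if self._count > 0:
                self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait(self, timeout):
        # Returns True if the count reached zero, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout)


def run(num_elements):
    """Deliver num_elements to an observer whose on_next counts down a latch of 3."""
    latch = CountDownLatch(3)
    received = []

    def on_next(element):
        received.append(element)
        latch.count_down()

    for i in range(num_elements):
        on_next(i)
    return latch.wait(timeout=0.1), len(received)


exact = run(3)
extra = run(4)
short = run(2)
```

With four elements the latch still releases, so an extra element would go unnoticed unless the test also asserts on the number received; with two elements the wait simply times out.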
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201640&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201640 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 21:55 Start Date: 20/Feb/19 21:55 Worklog Time Spent: 10m Work Description: pabloem commented on issue #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#issuecomment-465770652 Run Python PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201640) Time Spent: 3h 20m (was: 3h 10m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201638&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201638 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 21:55 Start Date: 20/Feb/19 21:55 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258684839 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -727,6 +733,8 @@ def __init__(self, loads into BigQuery. By default, this will use the pipeline's temp_location, but for pipelines whose temp_location is not appropriate for BQ File Loads, users should pass a specific one. + method: The method to use to write to BigQuery. It may be Review comment: Added. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201638) Time Spent: 3h (was: 2h 50m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201635&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201635 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 21:55 Start Date: 20/Feb/19 21:55 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258687177 ## File path: sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes.py ## @@ -73,8 +73,14 @@ def run(argv=None): help= ('Output BigQuery table for results specified as: PROJECT:DATASET.TABLE ' 'or DATASET.TABLE.')) + known_args, pipeline_args = parser.parse_known_args(argv) + method = ('DEFAULT' +if any('runner' in elm Review comment: Instead, the test will have a gs_location provided. This should be fixed now. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201635) Time Spent: 2h 40m (was: 2.5h) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201636&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201636 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 21:55 Start Date: 20/Feb/19 21:55 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258685469 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py ## @@ -50,6 +50,9 @@ def run_bq_pipeline(argv=None): help='Output BQ table to write results to.') parser.add_argument('--kms_key', default=None, help='Use this Cloud KMS key with BigQuery.') + parser.add_argument('--gs_location', + default=None, + help='GCS bucket location to use to store files.') Review comment: Improved documentation. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201636) Time Spent: 2h 50m (was: 2h 40m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
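The diff quoted in this message adds a `--gs_location` flag next to `--kms_key`. A minimal, self-contained sketch of how such a flag is typically consumed with `argparse.parse_known_args` follows; the flag names and help strings come from the diff, while the `argv` values (bucket path and runner name) are placeholders chosen for illustration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--kms_key', default=None,
                    help='Use this Cloud KMS key with BigQuery.')
parser.add_argument('--gs_location', default=None,
                    help='GCS bucket location to use to store files.')

# Illustrative argv; the bucket path and runner name are placeholders.
argv = ['--gs_location', 'gs://example-bucket/tmp', '--runner', 'DirectRunner']

# parse_known_args returns the parsed namespace plus the arguments it did
# not recognize, which can then be handed on as pipeline options.
known_args, pipeline_args = parser.parse_known_args(argv)
```

Because the flag defaults to `None`, a caller can fall back to another location when it is not supplied.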
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201639&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201639 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 21:55 Start Date: 20/Feb/19 21:55 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258685580 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py ## @@ -50,6 +50,9 @@ def run_bq_pipeline(argv=None): help='Output BQ table to write results to.') parser.add_argument('--kms_key', default=None, help='Use this Cloud KMS key with BigQuery.') + parser.add_argument('--gs_location', Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201639) Time Spent: 3h 10m (was: 3h) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6714) Move runner-agnostic code out of FlinkJobServerDriver
[ https://issues.apache.org/jira/browse/BEAM-6714?focusedWorklogId=201631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201631 ] ASF GitHub Bot logged work on BEAM-6714: Author: ASF GitHub Bot Created on: 20/Feb/19 21:49 Start Date: 20/Feb/19 21:49 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #7907: [BEAM-6714] Move runner-agnostic code out of FlinkJobServerDriver URL: https://github.com/apache/beam/pull/7907 I moved runner-agnostic code in `FlinkJobServerDriver` to the new abstract class `JobServerDriver`, which can be reused by other portable runners going forward. I did *not* make any changes to `SamzaJobServerDriver` yet, but it might be a good idea to adapt `SamzaJobServerDriver` to extend `JobServerDriver`. As far as I can tell, the former only includes a subset of the latter's features. R: @angoenka R: @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201631) Time Spent: 10m Remaining Estimate: 0h > Move runner-agnostic code out of FlinkJobServerDriver > -
[jira] [Commented] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle
[ https://issues.apache.org/jira/browse/BEAM-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773412#comment-16773412 ] Ahmet Altay commented on BEAM-6650: --- https://github.com/apache/beam/pull/7874 - cp'ed changes to the release branch, closing. > FlinkRunner fails to checkpoint elements emitted during finishBundle > > > Key: BEAM-6650 > URL: https://issues.apache.org/jira/browse/BEAM-6650 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.11.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > Elements emitted during the finalizeBundle call in snapshotState are lost > after the pipeline is restored. This only happens when the operator is keyed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (BEAM-6678) FlinkRunner does not checkpoint partition view of watermark holds
[ https://issues.apache.org/jira/browse/BEAM-6678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay closed BEAM-6678. - Resolution: Fixed > FlinkRunner does not checkpoint partition view of watermark holds > - > > Key: BEAM-6678 > URL: https://issues.apache.org/jira/browse/BEAM-6678 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.11.0 > > Time Spent: 2h > Remaining Estimate: 0h > > The FlinkRunner does not persist its view of the per-partition Watermark > holds. This can lead to elements being considered late after restoring from a > savepoint or resuming a failed pipeline. > Similar to the approach in BEAM-6650, we can recover the Watermarks by > iterating through the keys of the state backend during recovery. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle
[ https://issues.apache.org/jira/browse/BEAM-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay resolved BEAM-6650. --- Resolution: Fixed > FlinkRunner fails to checkpoint elements emitted during finishBundle > > > Key: BEAM-6650 > URL: https://issues.apache.org/jira/browse/BEAM-6650 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.11.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > Elements emitted during the finalizeBundle call in snapshotState are lost > after the pipeline is restored. This only happens when the operator is keyed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6678) FlinkRunner does not checkpoint partition view of watermark holds
[ https://issues.apache.org/jira/browse/BEAM-6678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773413#comment-16773413 ] Ahmet Altay commented on BEAM-6678: --- https://github.com/apache/beam/pull/7874 - cp'ed changes to the release branch, closing. > FlinkRunner does not checkpoint partition view of watermark holds > - > > Key: BEAM-6678 > URL: https://issues.apache.org/jira/browse/BEAM-6678 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.11.0 > > Time Spent: 2h > Remaining Estimate: 0h > > The FlinkRunner does not persist its view of the per-partition Watermark > holds. This can lead to elements being considered late after restoring from a > savepoint or resuming a failed pipeline. > Similar to the approach in BEAM-6650, we can recover the Watermarks by > iterating through the keys of the state backend during recovery. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle
[ https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=201628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201628 ] ASF GitHub Bot logged work on BEAM-6650: Author: ASF GitHub Bot Created on: 20/Feb/19 21:36 Start Date: 20/Feb/19 21:36 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #7874: [release-2.11.0] Backport for BEAM-6650 and BEAM-6678 URL: https://github.com/apache/beam/pull/7874 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201628) Time Spent: 6h 10m (was: 6h) > FlinkRunner fails to checkpoint elements emitted during finishBundle > > > Key: BEAM-6650 > URL: https://issues.apache.org/jira/browse/BEAM-6650 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.11.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > Elements emitted during the finalizeBundle call in snapshotState are lost > after the pipeline is restored. This only happens when the operator is keyed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-6720) Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0
Kenneth Knowles created BEAM-6720: - Summary: Binary incompatibility introduced to MapElements between 2.9.0 and 2.10.0 Key: BEAM-6720 URL: https://issues.apache.org/jira/browse/BEAM-6720 Project: Beam Issue Type: Bug Components: sdk-java-core Affects Versions: 2.10.0 Reporter: Kenneth Knowles Assignee: Kenneth Knowles Fix For: 2.11.0 In https://github.com/apache/beam/pull/7160 {{MapElements.via(SimpleFunction)}} was removed and replaced with {{MapElements.via(InferableFunction)}}. This is compatible with a recompile, but loses binary compatibility because the needed method signature is missing. I believe a pass-through method with the needed signature can be added to restore binary compatibility. CC [~jeff.klu...@gmail.com] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5638) Add exception handling to single message transforms in Java SDK
[ https://issues.apache.org/jira/browse/BEAM-5638?focusedWorklogId=201603&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201603 ] ASF GitHub Bot logged work on BEAM-5638: Author: ASF GitHub Bot Created on: 20/Feb/19 20:51 Start Date: 20/Feb/19 20:51 Worklog Time Spent: 10m Work Description: tims commented on issue #7736: [BEAM-5638] Exception handling for Java MapElements and FlatMapElements URL: https://github.com/apache/beam/pull/7736#issuecomment-465748762 > That would be quite nice, but it does add complexity and restrictions. In many cases, the type returned in each step of a series of transforms will be different, and I'd generally expect it will be necessary to define an exception handler per input type. I'm imagining that often users will want to preserve the input element along with info about the exception, so that they can reprocess the failed elements some time in the future. It's also common to just want all failures to be logged to the same place per job. And many times, in the steps I want to union the failures for, they all have a common element type anyway. You would still be able to stop the chaining at any point if you wanted to use different error types, by just getting the output() from Result. Extending the Result class with an apply method to chain ordinary transforms with exception handling could be nice future PR work; I'm happy to give it a try. Chaining non-failure-handling transforms with a common exception/failure handler would, I think, require MapElements and FlatMapElements to have a common interface or annotation, so that a Result.apply method could call `exceptionsVia`/etc on them. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201603) Time Spent: 8h 40m (was: 8.5h) Remaining Estimate: 159h 20m (was: 159.5h) > Add exception handling to single message transforms in Java SDK > --- > > Key: BEAM-5638 > URL: https://issues.apache.org/jira/browse/BEAM-5638 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Jeff Klukas >Assignee: Jeff Klukas >Priority: Minor > Labels: triaged > Original Estimate: 168h > Time Spent: 8h 40m > Remaining Estimate: 159h 20m > > Add methods to MapElements, FlatMapElements, and Filter that allow users to > specify expected exceptions and tuple tags to associate with the > collections of the successfully and unsuccessfully processed elements. > See discussion on dev list: > https://lists.apache.org/thread.html/936ed2a5f2c01be066fd903abf70130625e0b8cf4028c11b89b8b23f@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6718) Logical types PR seems to break SQL postcommits
[ https://issues.apache.org/jira/browse/BEAM-6718?focusedWorklogId=201602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201602 ] ASF GitHub Bot logged work on BEAM-6718: Author: ASF GitHub Bot Created on: 20/Feb/19 20:51 Start Date: 20/Feb/19 20:51 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7904: [BEAM-6718] Fix BigQuery SQL postcommit URL: https://github.com/apache/beam/pull/7904#issuecomment-465748738 Run JavaPortabilityApi PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201602) Time Spent: 1h (was: 50m) > Logical types PR seems to break SQL postcommits > --- > > Key: BEAM-6718 > URL: https://issues.apache.org/jira/browse/BEAM-6718 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Kenneth Knowles >Assignee: Reuven Lax >Priority: Critical > Time Spent: 1h > Remaining Estimate: 0h > > Starting here it has been red on BQ integration test: > https://builds.apache.org/job/beam_PostCommit_SQL/634/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6718) Logical types PR seems to break SQL postcommits
[ https://issues.apache.org/jira/browse/BEAM-6718?focusedWorklogId=201601&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201601 ] ASF GitHub Bot logged work on BEAM-6718: Author: ASF GitHub Bot Created on: 20/Feb/19 20:50 Start Date: 20/Feb/19 20:50 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7904: [BEAM-6718] Fix BigQuery SQL postcommit URL: https://github.com/apache/beam/pull/7904#issuecomment-465748693 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201601) Time Spent: 50m (was: 40m) > Logical types PR seems to break SQL postcommits > --- > > Key: BEAM-6718 > URL: https://issues.apache.org/jira/browse/BEAM-6718 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Kenneth Knowles >Assignee: Reuven Lax >Priority: Critical > Time Spent: 50m > Remaining Estimate: 0h > > Starting here it has been red on BQ integration test: > https://builds.apache.org/job/beam_PostCommit_SQL/634/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6650) FlinkRunner fails to checkpoint elements emitted during finishBundle
[ https://issues.apache.org/jira/browse/BEAM-6650?focusedWorklogId=201596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201596 ] ASF GitHub Bot logged work on BEAM-6650: Author: ASF GitHub Bot Created on: 20/Feb/19 20:44 Start Date: 20/Feb/19 20:44 Worklog Time Spent: 10m Work Description: aaltay commented on issue #7874: [release-2.11.0] Backport for BEAM-6650 and BEAM-6678 URL: https://github.com/apache/beam/pull/7874#issuecomment-465746463 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201596) Time Spent: 6h (was: 5h 50m) > FlinkRunner fails to checkpoint elements emitted during finishBundle > > > Key: BEAM-6650 > URL: https://issues.apache.org/jira/browse/BEAM-6650 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.11.0 > > Time Spent: 6h > Remaining Estimate: 0h > > Elements emitted during the finalizeBundle call in snapshotState are lost > after the pipeline is restored. This only happens when the operator is keyed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201585&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201585 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258629196 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -449,6 +450,10 @@ def __init__(self, table, dataset=None, project=None, schema=None, match the expected format. """ +import warnings Review comment: Consider using apache_beam/utils/annotations.py, or if not - make this a DeprecationWarning. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201585) Time Spent: 1h 50m (was: 1h 40m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
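The review comment in the message above suggests emitting a DeprecationWarning. A minimal sketch of that pattern, using only the standard library, might look like the following. `LegacySink` and `build_sink_and_collect_warnings` are hypothetical names chosen for illustration; they are not part of the Beam code under review.

```python
import warnings


class LegacySink(object):
    """Hypothetical stand-in for a class that is being deprecated."""

    def __init__(self, table):
        # stacklevel=2 attributes the warning to the caller's line
        # rather than to this constructor.
        warnings.warn(
            'LegacySink is deprecated; use the replacement transform instead.',
            DeprecationWarning,
            stacklevel=2)
        self.table = table


def build_sink_and_collect_warnings(table):
    """Builds a LegacySink and returns it with the warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # DeprecationWarning is hidden by the default filters in many
        # contexts, so enable everything for this demonstration.
        warnings.simplefilter('always')
        sink = LegacySink(table)
    return sink, caught
```

Passing an explicit category lets callers filter or escalate the warning with the standard `warnings` filters, which a bare message does not allow.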
[jira] [Work logged] (BEAM-6718) Logical types PR seems to break SQL postcommits
[ https://issues.apache.org/jira/browse/BEAM-6718?focusedWorklogId=201594&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201594 ] ASF GitHub Bot logged work on BEAM-6718: Author: ASF GitHub Bot Created on: 20/Feb/19 20:37 Start Date: 20/Feb/19 20:37 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7904: [BEAM-6718] Fix BigQuery SQL postcommit URL: https://github.com/apache/beam/pull/7904#issuecomment-465744324 Run SQL PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201594) Time Spent: 40m (was: 0.5h) > Logical types PR seems to break SQL postcommits > --- > > Key: BEAM-6718 > URL: https://issues.apache.org/jira/browse/BEAM-6718 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Kenneth Knowles >Assignee: Reuven Lax >Priority: Critical > Time Spent: 40m > Remaining Estimate: 0h > > Starting here it has been red on BQ integration test: > https://builds.apache.org/job/beam_PostCommit_SQL/634/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6158) Enable support for save_main_session in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6158?focusedWorklogId=201595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201595 ] ASF GitHub Bot logged work on BEAM-6158: Author: ASF GitHub Bot Created on: 20/Feb/19 20:41 Start Date: 20/Feb/19 20:41 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #7888: [BEAM-6158] Remove test_wordcount_without_save_main_session URL: https://github.com/apache/beam/pull/7888#issuecomment-465745434 @charlesccychen Please take a look and merge. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201595) Time Spent: 3h 10m (was: 3h) > Enable support for save_main_session in Python 3 > > > Key: BEAM-6158 > URL: https://issues.apache.org/jira/browse/BEAM-6158 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: Mark Liu >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Time Spent: 3h 10m > Remaining Estimate: 0h > > This happened when I run wordcount example with portable Dataflow runner in > Python 3.5. 
The failure shows in worker log (unfortunately unformatted) of > [this > job|https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-11-29_11_47_38-6731484595556255542?project=google.com:clouddfe]: > {code:java} > Could not load main session: Traceback (most recent call last): File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 125, in main _load_main_session(semi_persistent_directory) File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 201, in _load_main_session pickler.load_session(session_file) File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 269, in load_session return dill.load_session(file_path) File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in > load_session module = unpickler.load() File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in > find_class return StockUnpickler.find_class(self, module, name) > AttributeError: Can't get attribute 'WordExtractingDoFn' on 'apache_beam.runners.worker.sdk_worker_main' from > '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'> > Traceback (most recent call last): File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 125, in main _load_main_session(semi_persistent_directory) File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 201, in _load_main_session pickler.load_session(session_file) File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 269, in load_session return dill.load_session(file_path) File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in > load_session module = unpickler.load() File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in > find_class return StockUnpickler.find_class(self, module, 
name) > AttributeError: Can't get attribute 'WordExtractingDoFn' on 'apache_beam.runners.worker.sdk_worker_main' from > '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'> > {code} > Looks like saved main session didn't work properly in Python 3. > +cc: [~tvalentyn] [~robertwb] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6706) User reports trouble downloading 2.10.0 Dataflow worker image
[ https://issues.apache.org/jira/browse/BEAM-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773376#comment-16773376 ] Valentyn Tymofieiev commented on BEAM-6706: --- [~arpi], it sounds like you would like to pass a custom container image. On the Dataflow side, this functionality is supported only in the Portable Pipeline execution model (which uses the FnAPI). Python streaming and Go pipelines always use this execution model in the Dataflow runner. Custom containers are currently always pulled at runtime from GCR, since a custom container is not present on the VM image used by the Dataflow worker. If the SDK is using a legacy execution mode, for example a typical Java Batch pipeline, passing a custom worker harness container image is currently disallowed by the Dataflow service. > User reports trouble downloading 2.10.0 Dataflow worker image > - > > Key: BEAM-6706 > URL: https://issues.apache.org/jira/browse/BEAM-6706 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > > DataFlow however is throwing all sorts of errors. For example: > * Handler for GET > /v1.27/images/gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0/json > returned error: No such image: > gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0" > * while reading 'google-dockercfg' metadata: http status code: 404 while > fetching url > http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg"; > * Error syncing pod..." > The job gets stuck after starting a worker and after an hour or so it gives > up with a failure. 2.9.0 runs fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201586&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201586 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258631588 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py ## @@ -50,6 +50,9 @@ def run_bq_pipeline(argv=None): help='Output BQ table to write results to.') parser.add_argument('--kms_key', default=None, help='Use this Cloud KMS key with BigQuery.') + parser.add_argument('--gs_location', Review comment: nit: The meaning of `gcs_location` may be more intuitive than `gs_location`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201586) Time Spent: 2h (was: 1h 50m) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. 
> Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
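The naming nit in the message above can be illustrated with a small parser. This is only a sketch: `build_parser` is a hypothetical helper, `--gcs_location` is the reviewer's suggested spelling rather than the flag that appears in the PR diff, and the help wording is an assumption about what the files are.

```python
import argparse


def build_parser():
    """Builds a parser with the flag spelled as the reviewer suggested."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--kms_key', default=None,
                        help='Use this Cloud KMS key with BigQuery.')
    # Assumed wording: spell out which files the location is for, so that
    # users do not have to guess.
    parser.add_argument('--gcs_location', default=None,
                        help='GCS location used to store the files that '
                             'are loaded into BigQuery.')
    return parser
```

Calling `build_parser().parse_args(['--gcs_location', 'gs://my-bucket/tmp'])` yields a namespace whose `gcs_location` attribute holds the given path; the bucket name here is a placeholder.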
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201589&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201589 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258660940 ## File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py ## @@ -181,13 +181,22 @@ def display_data(self): 'coder': self.coder.__class__.__name__ } + def _get_hashable_destination(self, destination): +if isinstance(destination, bigquery_api.TableReference): + return '%s:%s.%s' % ( + destination.projectId, destination.datasetId, destination.tableId) +else: + return destination + def start_bundle(self): self._destination_to_file_writer = {} def process(self, element, file_prefix): Review comment: Is it possible to add a docstring for `process` to explain what we expect as inputs? Sounds like for `element` we expect a tuple of objects with particular input types. Could we clarify that? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201589) > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. 
> -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
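The `TypeError: unhashable type: 'TableReference'` in the traceback above, and the `_get_hashable_destination` helper in the diff under review, can be reproduced in isolation. The sketch below uses a hypothetical stand-in `TableReference` class, not the real BigQuery API message type; it assumes the object is unhashable because it defines `__eq__` without `__hash__`, which in Python 3 sets `__hash__` to `None`. The real class may be unhashable for a different reason.

```python
class TableReference(object):
    """Hypothetical stand-in for the BigQuery API message class."""

    def __init__(self, projectId, datasetId, tableId):
        self.projectId = projectId
        self.datasetId = datasetId
        self.tableId = tableId

    # Defining __eq__ without __hash__ makes instances unhashable in
    # Python 3 (assumed cause; see the lead-in).
    def __eq__(self, other):
        return (isinstance(other, TableReference) and
                (self.projectId, self.datasetId, self.tableId) ==
                (other.projectId, other.datasetId, other.tableId))


def get_hashable_destination(destination):
    """Mirrors the helper in the diff: table references become strings."""
    if isinstance(destination, TableReference):
        return '%s:%s.%s' % (
            destination.projectId, destination.datasetId, destination.tableId)
    return destination


def is_hashable(value):
    """Returns True if value can be used as a dict key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True
```

With the string form as the key, two references to the same table map to the same dictionary entry, which is what a per-destination writer map needs.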
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201591&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201591 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258662278 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py ## @@ -50,6 +50,9 @@ def run_bq_pipeline(argv=None): help='Output BQ table to write results to.') parser.add_argument('--kms_key', default=None, help='Use this Cloud KMS key with BigQuery.') + parser.add_argument('--gs_location', + default=None, + help='GCS bucket location to use to store files.') Review comment: Is it clear for the users of this module which files we are talking about? Issue Time Tracking --- Worklog Id: (was: 201591)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201587&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201587 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258642329 ## File path: sdks/python/apache_beam/io/gcp/bigquery.py ## @@ -727,6 +733,8 @@ def __init__(self, loads into BigQuery. By default, this will use the pipeline's temp_location, but for pipelines whose temp_location is not appropriate for BQ File Loads, users should pass a specific one. + method: The method to use to write to BigQuery. It may be Review comment: Is the description of differences between these methods, and semantic of DEFAULT documented somewhere? Can we add a link? Issue Time Tracking --- Worklog Id: (was: 201587) Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201588&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201588 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258643970 ## File path: sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes.py ## @@ -73,6 +73,10 @@ def run(argv=None): help= ('Output BigQuery table for results specified as: PROJECT:DATASET.TABLE ' 'or DATASET.TABLE.')) + + method = ('DEFAULT' +if 'Dataflow' in known_args.runner Review comment: I wonder why the modification needs to be on example side, and not on Dataflow runner to set correct method? How would users come up with this if they were writing the example from scratch? Issue Time Tracking --- Worklog Id: (was: 201588) Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201590&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201590 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258660428 ## File path: sdks/python/apache_beam/io/gcp/bigquery_file_loads.py ## @@ -181,13 +181,22 @@ def display_data(self): 'coder': self.coder.__class__.__name__ } + def _get_hashable_destination(self, destination): Review comment: I suggest to slightly rewrite this to avoid mixing `destination` and `key_destination`.
```
def _extract_destination(self, table_reference):
  if isinstance(table_reference, bigquery_api.TableReference):
    return '%s:%s.%s' % (
        table_reference.projectId, table_reference.datasetId,
        table_reference.tableId)
  else:
    # Should we raise an error here?

def process(self, element, file_prefix):
  destination = self._extract_destination(element[0])
  row = element[1]
```
Issue Time Tracking --- Worklog Id: (was: 201590) Time Spent: 2.5h (was: 2h 20m)
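The review suggestions on `bigquery_file_loads.py` (a separate extraction helper, plus a docstring that spells out the `(destination, row)` input to `process`) could be sketched as follows. This is an illustration only: `TableReference` is a hypothetical stand-in for `bigquery_api.TableReference`, `WriteRecordsSketch` is an invented name, rows are buffered in lists instead of being written to files, and raising `ValueError` is just one possible answer to the open question about unexpected destinations, not what the PR does.

```python
import collections

# Hypothetical stand-in for bigquery_api.TableReference.
TableReference = collections.namedtuple(
    'TableReference', ['projectId', 'datasetId', 'tableId'])


class WriteRecordsSketch(object):
    """Illustrative DoFn-like class; it does not depend on apache_beam."""

    def start_bundle(self):
        # Maps 'project:dataset.table' strings to the rows seen so far.
        self._destination_to_rows = {}

    def _extract_destination(self, table_reference):
        """Convert a destination into a hashable string key."""
        if isinstance(table_reference, TableReference):
            return '%s:%s.%s' % (
                table_reference.projectId,
                table_reference.datasetId,
                table_reference.tableId)
        if isinstance(table_reference, str):
            return table_reference
        # Assumption: fail loudly on anything else.
        raise ValueError('Unsupported destination: %r' % (table_reference,))

    def process(self, element, file_prefix):
        """Buffer one row under its destination.

        Args:
          element: a (destination, row) tuple, where destination is a
            TableReference or a 'project:dataset.table' string and row is
            a dict representing one table row.
          file_prefix: location prefix for output files (unused here).
        """
        destination = self._extract_destination(element[0])
        row = element[1]
        self._destination_to_rows.setdefault(destination, []).append(row)
        return destination


fn = WriteRecordsSketch()
fn.start_bundle()
fn.process((TableReference('p', 'd', 't'), {'month': 1}), 'gs://bucket/tmp')
fn.process(('p:d.t', {'month': 2}), 'gs://bucket/tmp')
```

Both calls land under the same key, so a `TableReference` and its string form are treated as the same destination.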
[jira] [Work logged] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?focusedWorklogId=201584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201584 ] ASF GitHub Bot logged work on BEAM-6711: Author: ASF GitHub Bot Created on: 20/Feb/19 20:33 Start Date: 20/Feb/19 20:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7892: [BEAM-6711] [BEAM-6553] A Python SDK sink that supports File Loads into BQ (#7655) URL: https://github.com/apache/beam/pull/7892#discussion_r258628801 ## File path: sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes.py ## @@ -73,8 +73,14 @@ def run(argv=None): help= ('Output BigQuery table for results specified as: PROJECT:DATASET.TABLE ' 'or DATASET.TABLE.')) + known_args, pipeline_args = parser.parse_known_args(argv) + method = ('DEFAULT' +if any('runner' in elm Review comment: What is the meaning of this condition? Can we make it more explicit? Issue Time Tracking --- Worklog Id: (was: 201584) Time Spent: 1h 40m (was: 1.5h)
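One way the runner check in `bigquery_tornadoes.py` might be made more explicit is to parse the `--runner` value with a throwaway `argparse` parser instead of scanning the raw arguments for the substring `'runner'`. The sketch below makes several assumptions: `get_runner` and `choose_method` are hypothetical helpers, `'DirectRunner'` is assumed as the fallback when no runner is given, and `'FILE_LOADS'` is a placeholder for the other branch of the condition, which is not visible in the diff.

```python
import argparse


def get_runner(pipeline_args, default='DirectRunner'):
    """Return the value of --runner from the remaining pipeline args.

    A separate parser is used so that pipeline_args itself is left
    untouched and can still be handed to the pipeline afterwards.
    """
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument('--runner', default=default)
    known, _ = parser.parse_known_args(pipeline_args)
    return known.runner


def choose_method(pipeline_args):
    """Pick a write method name based on the parsed runner (assumed rule)."""
    runner = get_runner(pipeline_args)
    # Placeholder fallback; the real alternative is not shown in the diff.
    return 'DEFAULT' if 'Dataflow' in runner else 'FILE_LOADS'
```

For example, `choose_method(['--runner=DataflowRunner', '--project', 'p'])` selects `'DEFAULT'`, and both the `--runner=X` and `--runner X` spellings are handled by `argparse`.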
[jira] [Commented] (BEAM-6530) Strange character on website (contact page)
[ https://issues.apache.org/jira/browse/BEAM-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773363#comment-16773363 ] Melissa Pashniak commented on BEAM-6530: Those are the footnotes that point to the definition – all of them point to the same definition, so they are all listed on the same line. [https://kramdown.gettalong.org/quickref.html#footnotes] I agree it looks a bit odd and I'm open to ideas on how to improve it. I believe Rafael added the footnotes because there was feedback that it was not clear how to subscribe/unsubscribe. > Strange character on website (contact page) > --- > > Key: BEAM-6530 > URL: https://issues.apache.org/jira/browse/BEAM-6530 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Ruoyun Huang >Priority: Minor > Attachments: Screen Shot 2019-01-28 at 6.22.11 PM.png > > > see screen shot as attached. > > Looks like an html error somewhere, Looking at the code though don't see > strange redundant characters: > [https://github.com/apache/beam/blob/master/website/src/community/contact-us.md] > > Some one know more about how the web pages organized might want to take a > look. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6706) User reports trouble downloading 2.10.0 Dataflow worker image
[ https://issues.apache.org/jira/browse/BEAM-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773356#comment-16773356 ] Robin Palotai commented on BEAM-6706: - [~tvalentyn] should the caching work with custom images stored on gcr.io as well? This is a private image, stored in a bucket in the same project as the Dataflow work itself. I saw that the error message was repeated over the span of 30-60 minutes, then the job was cancelled due to stuckness. > User reports trouble downloading 2.10.0 Dataflow worker image > - > > Key: BEAM-6706 > URL: https://issues.apache.org/jira/browse/BEAM-6706 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Blocker > > DataFlow however is throwing all sorts of errors. For example: > * Handler for GET > /v1.27/images/gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0/json > returned error: No such image: > gcr.io/cloud-dataflow/v1beta3/beam-java-batch:beam-2.10.0" > * while reading 'google-dockercfg' metadata: http status code: 404 while > fetching url > http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg"; > * Error syncing pod..." > The job gets stuck after starting a worker and after an hour or so it gives > up with a failure. 2.9.0 runs fine. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6714) Move runner-agnostic code out of FlinkJobServerDriver
[ https://issues.apache.org/jira/browse/BEAM-6714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver updated BEAM-6714: -- Component/s: runner-samza > Move runner-agnostic code out of FlinkJobServerDriver > - > > Key: BEAM-6714 > URL: https://issues.apache.org/jira/browse/BEAM-6714 > Project: Beam > Issue Type: Task > Components: runner-flink, runner-samza, runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > > [FlinkJobServerDriver|https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java] > contains quite a bit of code that is not actually specific to the Flink > runner. This runner-agnostic code should be shared so that other runners (ie > Spark) developing portability can leverage it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6714) Move runner-agnostic code out of FlinkJobServerDriver
[ https://issues.apache.org/jira/browse/BEAM-6714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773321#comment-16773321 ] Kyle Weaver commented on BEAM-6714: --- It looks like the Samza runner already implemented its own, with slightly less complexity and flexibility than the Flink version: https://github.com/apache/beam/blob/master/runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaJobServerDriver.java
[jira] [Work logged] (BEAM-6324) CassandraIO.Read - Add the ability to provide a filter to the query
[ https://issues.apache.org/jira/browse/BEAM-6324?focusedWorklogId=201549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201549 ] ASF GitHub Bot logged work on BEAM-6324: Author: ASF GitHub Bot Created on: 20/Feb/19 19:26 Start Date: 20/Feb/19 19:26 Worklog Time Spent: 10m Work Description: srfrnk commented on issue #7340: [BEAM-6324] - Cassandra reader with query implemented URL: https://github.com/apache/beam/pull/7340#issuecomment-465719844 @echauchot - I rebased the code. Had to manually re-add the changes but I think it was worth it. Your PR makes the tests much better IMHO. @timrobertson100 - I hope this time I got everything right - the code, single commit, docs and tests. Could you please review? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201549) Time Spent: 8h 20m (was: 8h 10m) > CassandraIO.Read - Add the ability to provide a filter to the query > --- > > Key: BEAM-6324 > URL: https://issues.apache.org/jira/browse/BEAM-6324 > Project: Beam > Issue Type: Improvement > Components: io-java-cassandra >Affects Versions: 2.9.0 >Reporter: Shahar Frank >Assignee: Shahar Frank >Priority: Major > Labels: performance, pull-request-available, triaged > Time Spent: 8h 20m > Remaining Estimate: 0h > > CassandraIO.Read doesn't support using WHERE to filter the input at the > source (In Cassandra) which might provide great performance boost. > Already implemented by: > https://github.com/apache/beam/pull/7340 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6158) Enable support for save_main_session in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6158?focusedWorklogId=201534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201534 ] ASF GitHub Bot logged work on BEAM-6158: Author: ASF GitHub Bot Created on: 20/Feb/19 19:10 Start Date: 20/Feb/19 19:10 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #7888: [BEAM-6158] Remove test_wordcount_without_save_main_session URL: https://github.com/apache/beam/pull/7888#discussion_r258634142 ## File path: sdks/python/container/run_validatescontainer_py3.sh ## @@ -82,7 +83,7 @@ SDK_LOCATION=$(find dist/apache-beam-*.tar.gz) # Run ValidatesRunner tests on Google Cloud Dataflow service echo ">>> RUNNING DATAFLOW RUNNER VALIDATESCONTAINER TEST" python setup.py nosetests \ - --attr Py3IT \ + --attr ValidatesContainer \ Review comment: Ok, I totally missed this change in the earlier revision, now it is clear what keeps WordCount running, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201534) Time Spent: 2h 50m (was: 2h 40m) > Enable support for save_main_session in Python 3 > > > Key: BEAM-6158 > URL: https://issues.apache.org/jira/browse/BEAM-6158 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: Mark Liu >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Time Spent: 2h 50m > Remaining Estimate: 0h > > This happened when I run wordcount example with portable Dataflow runner in > Python 3.5. 
The failure shows in worker log (unfortunately unformatted) of > [this > job|https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-11-29_11_47_38-6731484595556255542?project=google.com:clouddfe]: > {code:java} > Could not load main session: Traceback (most recent call last): File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 125, in main _load_main_session(semi_persistent_directory) File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 201, in _load_main_session pickler.load_session(session_file) File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 269, in load_session return dill.load_session(file_path) File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in > load_session module = unpickler.load() File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in > find_class return StockUnpickler.find_class(self, module, name) > AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from > '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'> > {code} > Looks like saved main session didn't work properly in Python 3. > +cc: [~tvalentyn] [~robertwb] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
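The `AttributeError` in the worker log is the general failure you get when a pickled object refers to a class by module and name and the loading process cannot find that name in that module. The sketch below reproduces only that general mechanism with the standard-library `pickle`. It does not use dill or Beam's pickler and does not claim to show the root cause of this bug; `launcher_main` is a hypothetical module standing in for the launcher's main module.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the main module of the launcher.
launcher = types.ModuleType('launcher_main')
sys.modules['launcher_main'] = launcher

# Define a class that claims to live in that module and register it there,
# so that pickle can store it by reference (module name + class name).
WordExtractingDoFn = type(
    'WordExtractingDoFn', (object,), {'__module__': 'launcher_main'})
launcher.WordExtractingDoFn = WordExtractingDoFn

payload = pickle.dumps(WordExtractingDoFn())

# Simulate a worker where the module exists but the class was never defined.
del launcher.WordExtractingDoFn

try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    # Same family of error as in the worker log: "Can't get attribute ...".
    load_error = str(exc)
```

The payload carries only the names `launcher_main` and `WordExtractingDoFn`, not the class body, which is why the definition must be present on the loading side before unpickling succeeds.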
[jira] [Work logged] (BEAM-6718) Logical types PR seems to break SQL postcommits
[ https://issues.apache.org/jira/browse/BEAM-6718?focusedWorklogId=201543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201543 ] ASF GitHub Bot logged work on BEAM-6718: Author: ASF GitHub Bot Created on: 20/Feb/19 19:18 Start Date: 20/Feb/19 19:18 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7904: [BEAM-6718] Fix BigQuery SQL postcommit URL: https://github.com/apache/beam/pull/7904#issuecomment-465716842 Run SQL PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201543) Time Spent: 0.5h (was: 20m) > Logical types PR seems to break SQL postcommits > --- > > Key: BEAM-6718 > URL: https://issues.apache.org/jira/browse/BEAM-6718 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Kenneth Knowles >Assignee: Reuven Lax >Priority: Critical > Time Spent: 0.5h > Remaining Estimate: 0h > > Starting here it has been red on BQ integration test: > https://builds.apache.org/job/beam_PostCommit_SQL/634/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201538 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 20/Feb/19 19:12 Start Date: 20/Feb/19 19:12 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#discussion_r258634834 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Cast.java ## @@ -300,7 +300,7 @@ public void verifyCompatibility(Schema inputSchema) { @ProcessElement public void process( - @FieldAccess("filterFields") Row input, OutputReceiver r) { + @FieldAccess("filterFields") @Element Row input, OutputReceiver r) { Review comment: With this PR, elements always need Element annotations. This restriction is removed in the next PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201538) Time Spent: 20.5h (was: 20h 20m) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 20.5h > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201537&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201537 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 20/Feb/19 19:12 Start Date: 20/Feb/19 19:12 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#discussion_r258634692 ## File path: runners/apex/src/main/java/org/apache/beam/runners/apex/translation/ParDoTranslator.java ## @@ -77,6 +80,13 @@ public void translate(ParDo.MultiOutput transform, TranslationC PCollection input = context.getInput(); List> sideInputs = transform.getSideInputs(); +DoFnSchemaInformation doFnSchemaInformation; +try { + doFnSchemaInformation = ParDoTranslation.getSchemaInformation(context.getCurrentTransform()); +} catch (IOException e) { Review comment: Done Issue Time Tracking --- Worklog Id: (was: 201537) Time Spent: 20h 20m (was: 20h 10m)
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201539 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 20/Feb/19 19:12 Start Date: 20/Feb/19 19:12 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#issuecomment-465714735 @kennknowles comments addressed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201539) Time Spent: 20h 40m (was: 20.5h) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 20h 40m > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201536 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 20/Feb/19 19:12 Start Date: 20/Feb/19 19:12 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#discussion_r258634638 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ParDoTranslation.java ## @@ -161,9 +158,40 @@ public boolean canTranslate(PTransform pTransform) { } public static ParDoPayload translateParDo( - final ParDo.MultiOutput parDo, Pipeline pipeline, SdkComponents components) - throws IOException { + AppliedPTransform appliedPTransform, SdkComponents components) throws IOException { Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201536) Time Spent: 20h 10m (was: 20h) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 20h 10m > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6158) Enable support for save_main_session in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6158?focusedWorklogId=201535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201535 ] ASF GitHub Bot logged work on BEAM-6158: Author: ASF GitHub Bot Created on: 20/Feb/19 19:11 Start Date: 20/Feb/19 19:11 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #7888: [BEAM-6158] Remove test_wordcount_without_save_main_session URL: https://github.com/apache/beam/pull/7888#issuecomment-465714169 Run Python Dataflow ValidatesContainer This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201535) Time Spent: 3h (was: 2h 50m) > Enable support for save_main_session in Python 3 > > > Key: BEAM-6158 > URL: https://issues.apache.org/jira/browse/BEAM-6158 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: Mark Liu >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Time Spent: 3h > Remaining Estimate: 0h > > This happened when I run wordcount example with portable Dataflow runner in > Python 3.5. 
The failure shows in worker log (unfortunately unformatted) of > [this > job|https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-11-29_11_47_38-6731484595556255542?project=google.com:clouddfe]: > {code:java} > Could not load main session: Traceback (most recent call last): File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 125, in main _load_main_session(semi_persistent_directory) File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 201, in _load_main_session pickler.load_session(session_file) File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 269, in load_session return dill.load_session(file_path) File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in > load_session module = unpickler.load() File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in > find_class return StockUnpickler.find_class(self, module, name) > AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from > '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'> > Traceback (most recent call last): File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 125, in main _load_main_session(semi_persistent_directory) File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py", > line 201, in _load_main_session pickler.load_session(session_file) File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 269, in load_session return dill.load_session(file_path) File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 402, in > load_session module = unpickler.load() File > "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 465, in > find_class return StockUnpickler.find_class(self, module, name) > AttributeError: Can't get attribute 'WordExtractingDoFn' on <module 'apache_beam.runners.worker.sdk_worker_main' from > '/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker_main.py'> > {code} > Looks like saved main session didn't work properly in Python 3. > +cc: [~tvalentyn] [~robertwb] [~altay] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
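The AttributeError in the traceback above is the usual symptom of a class being pickled by reference (module name plus class name) and then looked up in a module that does not define it. The following is a minimal sketch of that mechanism only, using the standard-library pickle module rather than dill, and not Beam's actual session-loading code. The module name fake_main and the body of WordExtractingDoFn are hypothetical stand-ins.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module the class was pickled from.
fake_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = fake_main


class WordExtractingDoFn:
    """Placeholder for the DoFn named in the traceback."""

    def process(self, element):
        return element.split()


# Make the class look as if it had been defined at the top level of fake_main.
WordExtractingDoFn.__module__ = "fake_main"
WordExtractingDoFn.__qualname__ = "WordExtractingDoFn"
fake_main.WordExtractingDoFn = WordExtractingDoFn

# The pickle stores only the reference "fake_main.WordExtractingDoFn",
# not the class body.
payload = pickle.dumps(WordExtractingDoFn())


def try_load(data):
    """Unpickle and report either the resulting type name or the error."""
    try:
        return type(pickle.loads(data)).__name__
    except AttributeError as exc:
        return "AttributeError: " + str(exc)


# Works while the module still exposes the class.
result_with_class = try_load(payload)

# Simulate a worker whose module of that name lacks the class.
del fake_main.WordExtractingDoFn
result_without_class = try_load(payload)
```

In this sketch the second load fails because the unpickler resolves the class by attribute lookup on the named module, which is the same kind of lookup that fails in find_class in the log.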
[jira] [Work logged] (BEAM-4775) JobService should support returning metrics
[ https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=201522&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201522 ] ASF GitHub Bot logged work on BEAM-4775: Author: ASF GitHub Bot Created on: 20/Feb/19 18:54 Start Date: 20/Feb/19 18:54 Worklog Time Spent: 10m Work Description: Ardagan commented on pull request #7868: [BEAM-4775] MonitoringInfo URN tweaks URL: https://github.com/apache/beam/pull/7868#discussion_r258622638 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -337,24 +337,19 @@ message Annotation { // MonitoringInfo protos. message MonitoringInfoSpecs { enum Enum { -// TODO(ajamato): Add the PTRANSFORM name as a required label after Review comment: My take on this: 1. It is better to keep this spec present. It defines which fields a user counter needs. 2. We can rename this element to something like USER_COUNTER_PREFIX for clarity. 3. It is OK to move it out of this list, but we should still keep it somewhere for generic validation and specification. In that case, however, we might want to move it out of MonitoringInfoUrns as well. 4. Currently, we have URNs defined both in this list and in MonitoringInfoUrns. I was thinking of removing MonitoringInfoUrns completely and keeping only MonitoringInfoSpecs as the explicit source of truth. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201522) Time Spent: 17h 40m (was: 17.5h) > JobService should support returning metrics > --- > > Key: BEAM-4775 > URL: https://issues.apache.org/jira/browse/BEAM-4775 > Project: Beam > Issue Type: Bug > Components: beam-model >Reporter: Eugene Kirpichov >Assignee: Ryan Williams >Priority: Major > Labels: triaged > Time Spent: 17h 40m > Remaining Estimate: 0h > > [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto] > currently doesn't appear to have a way for JobService to return metrics to a > user, even though > [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto] > includes support for reporting SDK metrics to the runner harness. > > Metrics are apparently necessary to run any ValidatesRunner tests because > PAssert needs to validate that the assertions succeeded. However, this > statement should be double-checked: perhaps it's possible to somehow work > with PAssert without metrics support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
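The review comment above argues for keeping a single list of specs as the source of truth, so that metrics can be validated generically against it. A rough sketch of that idea, written in Python for brevity, might look like the following. The URNs, type names, and label sets are invented for illustration and are not the values in beam_fn_api.proto; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class MonitoringInfoSpec:
    """One registry entry: a metric URN, its value type, and its required labels."""

    urn: str
    type_urn: str
    required_labels: FrozenSet[str]


# Hypothetical URNs and label names, chosen only for this illustration.
_SPEC_LIST = [
    MonitoringInfoSpec(
        urn="example:metric:user_counter",
        type_urn="example:type:sum_int64",
        required_labels=frozenset({"PTRANSFORM", "NAMESPACE", "NAME"}),
    ),
    MonitoringInfoSpec(
        urn="example:metric:element_count",
        type_urn="example:type:sum_int64",
        required_labels=frozenset({"PCOLLECTION"}),
    ),
]

# The single lookup table that every validation goes through.
SPECS: Dict[str, MonitoringInfoSpec] = {spec.urn: spec for spec in _SPEC_LIST}


def validate(urn: str, type_urn: str, labels: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the metric matches its spec."""
    spec = SPECS.get(urn)
    if spec is None:
        return ["unknown urn: " + urn]
    problems = []
    if type_urn != spec.type_urn:
        problems.append("expected type " + spec.type_urn + ", got " + type_urn)
    missing = sorted(spec.required_labels - set(labels))
    if missing:
        problems.append("missing labels: " + ", ".join(missing))
    return problems
```

With one registry like this, a separate list of bare URNs becomes redundant, since the keys of SPECS already enumerate them.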
[jira] [Work logged] (BEAM-6158) Enable support for save_main_session in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6158?focusedWorklogId=201526&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201526 ] ASF GitHub Bot logged work on BEAM-6158: Author: ASF GitHub Bot Created on: 20/Feb/19 19:00 Start Date: 20/Feb/19 19:00 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #7888: [BEAM-6158] Remove test_wordcount_without_save_main_session URL: https://github.com/apache/beam/pull/7888#issuecomment-465710347 update done. PTAL @tvalentyn This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201526) Time Spent: 2h 40m (was: 2.5h) > Enable support for save_main_session in Python 3 > > > Key: BEAM-6158 > URL: https://issues.apache.org/jira/browse/BEAM-6158 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: Mark Liu >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Time Spent: 2h 40m > Remaining Estimate: 0h > > This happened when I run wordcount example with portable Dataflow runner in > Python 3.5. 
[jira] [Work logged] (BEAM-6158) Enable support for save_main_session in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6158?focusedWorklogId=201525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201525 ] ASF GitHub Bot logged work on BEAM-6158: Author: ASF GitHub Bot Created on: 20/Feb/19 19:00 Start Date: 20/Feb/19 19:00 Worklog Time Spent: 10m Work Description: markflyhigh commented on pull request #7888: [BEAM-6158] Remove test_wordcount_without_save_main_session URL: https://github.com/apache/beam/pull/7888#discussion_r258629986 ## File path: sdks/python/apache_beam/examples/wordcount_it_test.py ## @@ -103,13 +50,6 @@ def test_wordcount_it(self): def test_wordcount_fnapi_it(self): self._run_wordcount_it(wordcount.run, experiment='beam_fn_api') - @attr('Py3IT') Review comment: yes. I just updated the code. Changes in `run_validatescontainer_py3.sh` should be clear. Basically `ValidatesContainer` will be run. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201525) Time Spent: 2.5h (was: 2h 20m) > Enable support for save_main_session in Python 3 > > > Key: BEAM-6158 > URL: https://issues.apache.org/jira/browse/BEAM-6158 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: Mark Liu >Assignee: Valentyn Tymofieiev >Priority: Major > Labels: triaged > Time Spent: 2.5h > Remaining Estimate: 0h > > This happened when I run wordcount example with portable Dataflow runner in > Python 3.5. 
[jira] [Work logged] (BEAM-4076) Schema followups
[ https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=201517&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201517 ] ASF GitHub Bot logged work on BEAM-4076: Author: ASF GitHub Bot Created on: 20/Feb/19 18:49 Start Date: 20/Feb/19 18:49 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #7635: [BEAM-4076] Generalize schema inputs to ParDo URL: https://github.com/apache/beam/pull/7635#discussion_r258625424 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Cast.java ## @@ -300,7 +300,7 @@ public void verifyCompatibility(Schema inputSchema) { @ProcessElement public void process( - @FieldAccess("filterFields") Row input, OutputReceiver r) { + @FieldAccess("filterFields") @Element Row input, OutputReceiver r) { Review comment: After this PR, FieldAccess no longer replaces Element. However the next PR will remove the need for Element (requires more refactoring) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 201517) Time Spent: 20h (was: 19h 50m) > Schema followups > > > Key: BEAM-4076 > URL: https://issues.apache.org/jira/browse/BEAM-4076 > Project: Beam > Issue Type: Improvement > Components: beam-model, dsl-sql, sdk-java-core >Reporter: Kenneth Knowles >Priority: Major > Time Spent: 20h > Remaining Estimate: 0h > > This umbrella bug contains subtasks with followups for Beam schemas, which > were moved from SQL to the core Java SDK and made to be type-name-based > rather than coder based. -- This message was sent by Atlassian JIRA (v7.6.3#76005)