[jira] [Work logged] (BEAM-6165) Send metrics to Flink in portable Flink runner
[ https://issues.apache.org/jira/browse/BEAM-6165?focusedWorklogId=216583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216583 ] ASF GitHub Bot logged work on BEAM-6165: Author: ASF GitHub Bot Created on: 21/Mar/19 04:06 Start Date: 21/Mar/19 04:06 Worklog Time Spent: 10m Work Description: tweise commented on pull request #7971: [BEAM-6165] Flink portable metrics: get ptransform from MonitoringInfo, not stage name URL: https://github.com/apache/beam/pull/7971#discussion_r267620804 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/metrics/FlinkMetricContainer.java ## @@ -99,117 +103,145 @@ public MetricsContainer getMetricsContainer(String stepName) { * Update this container with metrics from the passed {@link MonitoringInfo}s, and send updates * along to Flink's internal metrics framework. */ - public void updateMetrics(String stepName, List monitoringInfos) { -MetricsContainer metricsContainer = getMetricsContainer(stepName); + public void updateMetrics(List monitoringInfos) { monitoringInfos.forEach( monitoringInfo -> { - if (monitoringInfo.hasMetric()) { -String urn = monitoringInfo.getUrn(); -MetricName metricName = parseUrn(urn); -Metric metric = monitoringInfo.getMetric(); -if (metric.hasCounterData()) { - CounterData counterData = metric.getCounterData(); - if (counterData.getValueCase() == CounterData.ValueCase.INT64_VALUE) { -org.apache.beam.sdk.metrics.Counter counter = -metricsContainer.getCounter(metricName); -counter.inc(counterData.getInt64Value()); - } else { -LOG.warn("Unsupported CounterData type: {}", counterData); - } -} else if (metric.hasDistributionData()) { - DistributionData distributionData = metric.getDistributionData(); - if (distributionData.hasIntDistributionData()) { -Distribution distribution = metricsContainer.getDistribution(metricName); -IntDistributionData intDistributionData = distributionData.getIntDistributionData(); -distribution.update( -intDistributionData.getSum(), -intDistributionData.getCount(), -intDistributionData.getMin(), -intDistributionData.getMax()); - } else { -LOG.warn("Unsupported DistributionData type: {}", distributionData); - } -} else if (metric.hasExtremaData()) { - ExtremaData extremaData = metric.getExtremaData(); - LOG.warn("Extrema metric unsupported: {}", extremaData); + if (!monitoringInfo.hasMetric()) { +LOG.info("Skipping metric-less MonitoringInfo: {}", monitoringInfo); +return; + } + Metric metric = monitoringInfo.getMetric(); + + String urn = monitoringInfo.getUrn(); + MetricName metricName = parseUserUrn(urn); + if (metricName == null) { +LOG.info("Dropping system metric: {}", monitoringInfo); +return; + } + + Map labels = monitoringInfo.getLabelsMap(); + String ptransform = labels.get(PTRANSFORM_LABEL); + + MetricsContainer metricsContainer = getMetricsContainer(ptransform); + + MetricKey key = MetricKey.create(ptransform, metricName); + + if (metric.hasCounterData()) { +CounterData counterData = metric.getCounterData(); +if (counterData.getValueCase() == CounterData.ValueCase.INT64_VALUE) { + long value = counterData.getInt64Value(); + org.apache.beam.sdk.metrics.Counter counter = metricsContainer.getCounter(metricName); + counter.inc(value); + + // Update flink + updateCounter(key, value); +} else { + LOG.warn("Unsupported CounterData type: {}", counterData); +} + } else if (metric.hasDistributionData()) { +DistributionData distributionData = metric.getDistributionData(); +if (distributionData.hasIntDistributionData()) { + Distribution distribution = metricsContainer.getDistribution(metricName); + IntDistributionData intDistributionData = distributionData.getIntDistributionData(); + distribution.update( + intDistributionData.getSum(), + intDistributionData.getCount(), + intDistributionData.getMin(), + intDistributionData.getMax()); + + // Update flink + updateDistribution( + key, + DistributionResult.create( +
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216560 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 21/Mar/19 00:36 Start Date: 21/Mar/19 00:36 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256 "./gradlew publish" failed since python36 and 37 are missing on Beam12. This is an infra issue and I started [a discussion](https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E) on dev@. Gradle scan shows that python load test was built successfully. PTAL @aaltay @tvalentyn @adude3141 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216560) Time Spent: 9h (was: 8h 50m) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 9h > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216561 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 21/Mar/19 00:37 Start Date: 21/Mar/19 00:37 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256 "./gradlew publish" failed since python36 and 37 are missing on Beam12. This is an infra issue and I started [a discussion](https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E) on dev@. Gradle scan shows that python load test (failed previously and led to rollback) was built successfully. PTAL @aaltay @tvalentyn @adude3141 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216561) Time Spent: 9h 10m (was: 9h) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 9h 10m > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216562&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216562 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 21/Mar/19 00:37 Start Date: 21/Mar/19 00:37 Worklog Time Spent: 10m Work Description: adude3141 commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475079580 Run Gradle Publish This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216562) Time Spent: 9h 20m (was: 9h 10m) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 9h 20m > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216558 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 21/Mar/19 00:35 Start Date: 21/Mar/19 00:35 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256 "./gradlew publish" failed since python36 and 37 are missing on Beam12. This is a infra issue and I start [a discussion] on dev@(https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E). Gradle scan shows that python load test was built successfully. PTAL @aaltay @tvalentyn @adude3141 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216558) Time Spent: 8h 40m (was: 8.5h) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 8h 40m > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216559&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216559 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 21/Mar/19 00:35 Start Date: 21/Mar/19 00:35 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475079256 "./gradlew publish" failed since python36 and 37 are missing on Beam12. This is an infra issue and I start [a discussion] on dev@(https://lists.apache.org/thread.html/529a3626eb9e8f5795b271b9d06ba34242fcaf771fdc8f596eaec229@%3Cdev.beam.apache.org%3E). Gradle scan shows that python load test was built successfully. PTAL @aaltay @tvalentyn @adude3141 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216559) Time Spent: 8h 50m (was: 8h 40m) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 8h 50m > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216555&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216555 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 21/Mar/19 00:16 Start Date: 21/Mar/19 00:16 Worklog Time Spent: 10m Work Description: youngoli commented on issue #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#issuecomment-475076149 There definitely is a lot of repetition with the recursion, even before my changes. Though I'm not sure how to avoid that really. While it's a lot of information for users, it's important for debugging. Only alternative I can think of is adding some debug logs instead of some errors, so users can see the path being taken by examining logs. Here's a real-world example I've been using to test (I replaced the proto name though) > failed to encode custom coder: bad underlying type: bad element: bad field type: bad element: bad element: bad field type: bad element: bad field type: bad element: bad element: bad field type: bad field type: bad element: bad field type: unencodable type: protoapi.extensionMap And the new error: > Failed to encode custom coder for type protoapi.Message. Make sure the type was registered before calling beam.Init. For example: beam.RegisterType(reflect.TypeOf((*TypeName)(nil)).Elem()) > > Full error: failed to encode custom coder *my_package.MyProto[protoapi.Message], bad underlying type: failed to encode pointer *my_package.MyProto, bad base type: failed to encode struct my_package.MyProto, bad field type: failed to encode pointer *my_package.MyProto_MyField, bad base type: failed to encode struct my_package.MyProto_MyField, bad field type: unencodable type map[int32]protoapi.Message This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216555) Time Spent: 7h 20m (was: 7h 10m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 7h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml
[ https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216554&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216554 ] ASF GitHub Bot logged work on BEAM-6446: Author: ASF GitHub Bot Created on: 21/Mar/19 00:11 Start Date: 21/Mar/19 00:11 Worklog Time Spent: 10m Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up checkstyle suppressions. URL: https://github.com/apache/beam/pull/8091#issuecomment-475075320 R: @kennknowles This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216554) Time Spent: 1h (was: 50m) > Clean up suppression rules in checkstyle suppressions.xml > - > > Key: BEAM-6446 > URL: https://issues.apache.org/jira/browse/BEAM-6446 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Labels: triaged > Time Spent: 1h > Remaining Estimate: 0h > > When violations are addressed, clean up suppression rules in checkstyle > suppressions.xml -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml
[ https://issues.apache.org/jira/browse/BEAM-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-6446 started by Ruoyun Huang. -- > Clean up suppression rules in checkstyle suppressions.xml > - > > Key: BEAM-6446 > URL: https://issues.apache.org/jira/browse/BEAM-6446 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Labels: triaged > Time Spent: 50m > Remaining Estimate: 0h > > When violations are addressed, clean up suppression rules in checkstyle > suppressions.xml -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216542&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216542 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:53 Start Date: 20/Mar/19 23:53 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590495 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) Review comment: Done with a little added detail: ("failed to decode DoFn %v, failed to decode userfn: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216542) Time Spent: 5h 40m (was: 5.5h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 5h 40m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216550&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216550 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590596 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -195,7 +195,7 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { gen := reflectx.FunctionName(u.DynFn.Gen) t, err := encodeType(u.DynFn.T) if err != nil { - return nil, fmt.Errorf("bad function type: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad function type: %v", u, err) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216550) Time Spent: 7h (was: 6h 50m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 7h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216545&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216545 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590522 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) } data, err := json.Marshal(u.Recv) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) Review comment: Done, with some changes: ("failed to encode structural DoFn %v, failed to marshal receiver %v: %v", u, u.Recv, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216545) Time Spent: 6h 10m (was: 6h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 6h 10m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216539 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:53 Start Date: 20/Mar/19 23:53 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590458 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } fx, err := funcx.New(reflectx.MakeFunc(fn)) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } return &graph.Fn{Fn: fx}, nil } t, err := decodeType(u.Type) if err != nil { - return nil, fmt.Errorf("bad type: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad type: %v", u, err) } fn, err := reflectx.UnmarshalJSON(t, u.Opt) if err != nil { - return nil, fmt.Errorf("bad struct encoding: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad struct encoding: %v", u, err) Review comment: Is this a copy/paste error, or did you want "bad struct encoding" changed to "bad userfn"? Done for the change from "function proto" -> "structural DoFn". This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216539) Time Spent: 5h 10m (was: 5h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 5h 10m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216551&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216551 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590604 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -145,20 +145,20 @@ func encodeCustomCoder(c *coder.CustomCoder) (*v1.CustomCoder, error) { func decodeCustomCoder(c *v1.CustomCoder) (*coder.CustomCoder, error) { t, err := decodeType(c.Type) if err != nil { - return nil, fmt.Errorf("decodeCustomCoder bad type: %v", err) + return nil, fmt.Errorf("failed to decode custom coder proto %v for type %v: %v", c, c.Type, err) Review comment: Done, removed mentions of "proto" in error messages. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216551) Time Spent: 7h 10m (was: 7h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 7h 10m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216549 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590587 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216549) Time Spent: 6h 50m (was: 6h 40m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 6h 50m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216548 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590579 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) Review comment: Done, with a little more detail: ("failed to encode structural DoFn %v, failed to create TypeKey for receiver type %T", u, u.Recv) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216548) Time Spent: 6h 40m (was: 6.5h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 6h 40m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216547 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590539 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216547) Time Spent: 6.5h (was: 6h 20m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 6.5h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216546 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:54 Start Date: 20/Mar/19 23:54 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590532 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) Review comment: Done, with a little more detail: ("Failed to encode structural DoFn %v, failed to encode receiver type %T: %v", u, u.Recv, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216546) Time Spent: 6h 20m (was: 6h 10m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 6h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216543 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:53 Start Date: 20/Mar/19 23:53 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590502 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) } data, err := json.Marshal(u.Recv) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Type: typ, Opt: string(data)}, nil default: - panic("empty Fn") + panic(fmt.Sprintf("Failed to encode function %v, empty fn", u)) } } func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Dynfn != nil { gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType) if err != nil { - return nil, fmt.Errorf("bad symbol %v: %v", u.Dynfn.Gen, err) + return nil, fmt.Errorf("failed to decode function proto %v, bad symbol %v: %v", u, u.Dynfn.Gen, err) } t, err := decodeType(u.Dynfn.Type) if err != nil { - return nil, fmt.Errorf("bad type: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad type: %v", u, err) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216543) Time Spent: 5h 50m (was: 5h 40m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 5h 50m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216544&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216544 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:53 Start Date: 20/Mar/19 23:53 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590515 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) } data, err := json.Marshal(u.Recv) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Type: typ, Opt: string(data)}, nil default: - panic("empty Fn") + panic(fmt.Sprintf("Failed to encode function %v, empty fn", u)) } } func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Dynfn != nil { gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType) if err != nil { - return nil, fmt.Errorf("bad symbol %v: %v", u.Dynfn.Gen, err) + return nil, fmt.Errorf("failed to decode function proto %v, bad symbol %v: %v", u, u.Dynfn.Gen, err) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216544) Time Spent: 6h (was: 5h 50m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 6h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216540 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:53 Start Date: 20/Mar/19 23:53 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590472 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } fx, err := funcx.New(reflectx.MakeFunc(fn)) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } return &graph.Fn{Fn: fx}, nil } t, err := decodeType(u.Type) if err != nil { - return nil, fmt.Errorf("bad type: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad type: %v", u, err) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216540) Time Spent: 5h 20m (was: 5h 10m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 5h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216541 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 23:53 Start Date: 20/Mar/19 23:53 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267590477 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } fx, err := funcx.New(reflectx.MakeFunc(fn)) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) Review comment: Done, with a little added detail: ("failed to decode DoFn %v, failed to construct userfn: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216541) Time Spent: 5.5h (was: 5h 20m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 5.5h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=216538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216538 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 20/Mar/19 23:52 Start Date: 20/Mar/19 23:52 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104#issuecomment-475071702 @lukecwik are you a good person to review this PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216538) Time Spent: 20m (was: 10m) > Add hook for user-defined JVM initialization in workers > --- > > Key: BEAM-6872 > URL: https://issues.apache.org/jira/browse/BEAM-6872 > Project: Beam > Issue Type: New Feature > Components: runner-dataflow >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > Expose an interface for users to run some one-time initialization code when a > worker starts up. > This can be useful for things like overriding the Default ZoneRulesProvider, > or setting up custom SSL providers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?focusedWorklogId=216521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216521 ] ASF GitHub Bot logged work on BEAM-6872: Author: ASF GitHub Bot Created on: 20/Mar/19 23:10 Start Date: 20/Mar/19 23:10 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #8104: [BEAM-6872] Add hook for user-defined JVM initialization in workers URL: https://github.com/apache/beam/pull/8104 Adds DataflowWorkerInitializer interface. Workers execute implementations of the interface in user code when they start up via ServiceLoader. Currently implementations are run immediately after logging is configured. Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | --- | --- | --- | --- Java | [](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/) Python | [](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/) | --- | [](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/) [](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/) | --- | --- | --- See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216521) Time Spent: 10m Remaining Estimate: 0h > Add hook for user-defined JVM initialization in workers > --- > >
[jira] [Created] (BEAM-6872) Add hook for user-defined JVM initialization
Brian Hulette created BEAM-6872: --- Summary: Add hook for user-defined JVM initialization Key: BEAM-6872 URL: https://issues.apache.org/jira/browse/BEAM-6872 Project: Beam Issue Type: New Feature Components: runner-dataflow Reporter: Brian Hulette Assignee: Brian Hulette Expose an interface for users to run some one-time initialization code when a worker starts up. This can be useful for things like overriding the Default ZoneRulesProvider, or setting up custom SSL providers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6872) Add hook for user-defined JVM initialization in workers
[ https://issues.apache.org/jira/browse/BEAM-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette updated BEAM-6872: Summary: Add hook for user-defined JVM initialization in workers (was: Add hook for user-defined JVM initialization) > Add hook for user-defined JVM initialization in workers > --- > > Key: BEAM-6872 > URL: https://issues.apache.org/jira/browse/BEAM-6872 > Project: Beam > Issue Type: New Feature > Components: runner-dataflow >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Minor > > Expose an interface for users to run some one-time initialization code when a > worker starts up. > This can be useful for things like overriding the Default ZoneRulesProvider, > or setting up custom SSL providers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797599#comment-16797599 ] Pablo Estrada commented on BEAM-6711: - The issue occurs when proto-typed elements go into a combine, because they are strictly not-hashable in Python 3. There was a combine in the transform: https://github.com/apache/beam/pull/8093/files#diff-c397a2a1f47619267d7c57494de7d835R593 This pr should fix that: https://github.com/apache/beam/pull/8093 - I'll also apply changes discussed with [~tvalentyn] regarding default behavior with DirectRunner. > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.12.0 > > Time Spent: 6h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216486 ] ASF GitHub Bot logged work on BEAM-6812: Author: ASF GitHub Bot Created on: 20/Mar/19 21:26 Start Date: 20/Mar/19 21:26 Worklog Time Spent: 10m Work Description: dmvk commented on pull request #8042: [BEAM-6812]: Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent URL: https://github.com/apache/beam/pull/8042 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216486) Time Spent: 2h 50m (was: 2h 40m) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h 50m > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Moravek resolved BEAM-6812. - Resolution: Fixed Fix Version/s: 2.12.0 > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Fix For: 2.12.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216476 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 20/Mar/19 21:02 Start Date: 20/Mar/19 21:02 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475026993 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216476) Time Spent: 8.5h (was: 8h 20m) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 8.5h > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6844) Metrics tests for Portable metrics
[ https://issues.apache.org/jira/browse/BEAM-6844?focusedWorklogId=216475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216475 ] ASF GitHub Bot logged work on BEAM-6844: Author: ASF GitHub Bot Created on: 20/Mar/19 21:01 Start Date: 20/Mar/19 21:01 Worklog Time Spent: 10m Work Description: Ardagan commented on issue #8038: [BEAM-6844] Add MetricResultMatcher verification for user metrics, GBK element count, and process time to an integration test. URL: https://github.com/apache/beam/pull/8038#issuecomment-475026699 run java postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216475) Time Spent: 0.5h (was: 20m) > Metrics tests for Portable metrics > -- > > Key: BEAM-6844 > URL: https://issues.apache.org/jira/browse/BEAM-6844 > Project: Beam > Issue Type: Improvement > Components: sdk-java-harness, sdk-py-harness >Reporter: Pablo Estrada >Assignee: Alex Amato >Priority: Major > Fix For: Not applicable > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6866) SamzaRunner: support timers in ParDo
[ https://issues.apache.org/jira/browse/BEAM-6866?focusedWorklogId=216473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216473 ] ASF GitHub Bot logged work on BEAM-6866: Author: ASF GitHub Bot Created on: 20/Mar/19 20:58 Start Date: 20/Mar/19 20:58 Worklog Time Spent: 10m Work Description: xinyuiscool commented on pull request #8092: [BEAM-6866]: SamzaRunner: support timers in ParDo URL: https://github.com/apache/beam/pull/8092 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216473) Time Spent: 20m (was: 10m) > SamzaRunner: support timers in ParDo > > > Key: BEAM-6866 > URL: https://issues.apache.org/jira/browse/BEAM-6866 > Project: Beam > Issue Type: Improvement > Components: runner-samza >Reporter: Xinyu Liu >Assignee: Xinyu Liu >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > This patch adds support of timers in ParDo. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216461 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267530489 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) Review comment: This one is trickier: ("failed to encode structural DoFn %v, bad receiver type: %T", u, u.Recv) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216461) Time Spent: 3h 50m (was: 3h 40m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h 50m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-6862) [beam_PostCommit_Py_ValCont] Test fail with import error
[ https://issues.apache.org/jira/browse/BEAM-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin reassigned BEAM-6862: --- Assignee: Mikhail Gryzykhin (was: Alex Amato) > [beam_PostCommit_Py_ValCont] Test fail with import error > > > Key: BEAM-6862 > URL: https://issues.apache.org/jira/browse/BEAM-6862 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Ahmet Altay >Assignee: Mikhail Gryzykhin >Priority: Major > Labels: currently-failing > > PR likely root cause: https://github.com/apache/beam/pull/8038 > Started at: https://builds.apache.org/job/beam_PostCommit_Py_ValCont/2672/ > 14:30:12 ImportError: No module named > hamcrest.library.number.ordering_comparison > 14:30:12 > 14:30:12 at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > 14:30:12 at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > 14:30:12 at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:281) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:411) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:380) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:306) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123) > 14:30:12 Suppressed: java.lang.IllegalStateException: Already closed. > 14:30:12 at > org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:215) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:91) > 14:30:12 ... 6 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216458 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267529785 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -195,7 +195,7 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { gen := reflectx.FunctionName(u.DynFn.Gen) t, err := encodeType(u.DynFn.T) if err != nil { - return nil, fmt.Errorf("bad function type: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad function type: %v", u, err) Review comment: ("failed to encode dynamic DoFn %v, bad function type: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216458) Time Spent: 3.5h (was: 3h 20m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3.5h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216466 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267535027 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } fx, err := funcx.New(reflectx.MakeFunc(fn)) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } return &graph.Fn{Fn: fx}, nil } t, err := decodeType(u.Type) if err != nil { - return nil, fmt.Errorf("bad type: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad type: %v", u, err) Review comment: ("failed to decode structural DoFn %v, bad type: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216466) Time Spent: 4h 40m (was: 4.5h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 4h 40m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216468 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267534573 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) Review comment: ("failed to decode DoFn %v, bad userfn: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216468) Time Spent: 5h (was: 4h 50m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 5h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216464 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267534727 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } fx, err := funcx.New(reflectx.MakeFunc(fn)) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) Review comment: ("failed to decode DoFn %v, bad userfn: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216464) Time Spent: 4h 20m (was: 4h 10m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 4h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216467 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267535219 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -257,22 +257,22 @@ func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Fn != nil { fn, err := decodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } fx, err := funcx.New(reflectx.MakeFunc(fn)) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad userfn: %v", u, err) } return &graph.Fn{Fn: fx}, nil } t, err := decodeType(u.Type) if err != nil { - return nil, fmt.Errorf("bad type: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad type: %v", u, err) } fn, err := reflectx.UnmarshalJSON(t, u.Opt) if err != nil { - return nil, fmt.Errorf("bad struct encoding: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad struct encoding: %v", u, err) Review comment: ("failed to decode structural DoFn %v, bad userfn: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216467) Time Spent: 4h 50m (was: 4h 40m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 4h 50m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216456 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267530172 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) Review comment: ("failed to encode DoFn %v, bad userfn: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216456) Time Spent: 3h 20m (was: 3h 10m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216462 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267532889 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) Review comment: ("Failed to encode structural DoFn %v, bad receiver type %T: %v", u, u.Recv, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216462) Time Spent: 4h (was: 3h 50m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 4h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216459&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216459 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267531124 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) Review comment: ("failed to encode structural DoFn %v, receiver type %v must be registered", u, t) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216459) Time Spent: 3h 40m (was: 3.5h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h 40m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216465 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267534401 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) } data, err := json.Marshal(u.Recv) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Type: typ, Opt: string(data)}, nil default: - panic("empty Fn") + panic(fmt.Sprintf("Failed to encode function %v, empty fn", u)) } } func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Dynfn != nil { gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType) if err != nil { - return nil, fmt.Errorf("bad symbol %v: %v", u.Dynfn.Gen, err) + return nil, fmt.Errorf("failed to decode function proto %v, bad symbol %v: %v", u, u.Dynfn.Gen, err) } t, err := decodeType(u.Dynfn.Type) if err != nil { - return nil, fmt.Errorf("bad type: %v", err) + return nil, fmt.Errorf("failed to decode function proto %v, bad type: %v", u, err) Review comment: ("failed to decode dynamic DoFn %v, bad type: %v", u, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216465) Time Spent: 4.5h (was: 4h 20m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 4.5h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216463 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267534230 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) } data, err := json.Marshal(u.Recv) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Type: typ, Opt: string(data)}, nil default: - panic("empty Fn") + panic(fmt.Sprintf("Failed to encode function %v, empty fn", u)) } } func decodeFn(u *v1.Fn) (*graph.Fn, error) { if u.Dynfn != nil { gen, err := runtime.ResolveFunction(u.Dynfn.Gen, genFnType) if err != nil { - return nil, fmt.Errorf("bad symbol %v: %v", u.Dynfn.Gen, err) + return nil, fmt.Errorf("failed to decode function proto %v, bad symbol %v: %v", u, u.Dynfn.Gen, err) Review comment: ("failed to decode dynamic DoFn %v, bad symbol %v: %v", u, u.Dynfn.Gen, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216463) Time Spent: 4h 10m (was: 4h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 4h 10m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216460 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267533474 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -207,45 +207,45 @@ func encodeFn(u *graph.Fn) (*v1.Fn, error) { case u.Fn != nil: fn, err := encodeUserFn(u.Fn) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) } return &v1.Fn{Fn: fn}, nil case u.Recv != nil: t := reflect.TypeOf(u.Recv) k, ok := runtime.TypeKey(reflectx.SkipPtr(t)) if !ok { - return nil, fmt.Errorf("bad recv: %v", u.Recv) + return nil, fmt.Errorf("failed to encode function %v, bad receiver %v", u, u.Recv) } if _, ok := runtime.LookupType(k); !ok { - return nil, fmt.Errorf("recv type must be registered: %v", t) + return nil, fmt.Errorf("failed to encode function %v, receiver type %v must be registered", u, t) } typ, err := encodeType(t) if err != nil { - panic(fmt.Sprintf("bad recv type: %v", u.Recv)) + panic(fmt.Sprintf("Failed to encode function %v, bad receiver type: %v", u, err)) } data, err := json.Marshal(u.Recv) if err != nil { - return nil, fmt.Errorf("bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode function %v, bad userfn: %v", u, err) Review comment: ("Failed to encode structural DoFn %v, %T: %v", u, u.Recv, err) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216460) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h 40m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216457 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 20:47 Start Date: 20/Mar/19 20:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267529336 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -145,20 +145,20 @@ func encodeCustomCoder(c *coder.CustomCoder) (*v1.CustomCoder, error) { func decodeCustomCoder(c *v1.CustomCoder) (*coder.CustomCoder, error) { t, err := decodeType(c.Type) if err != nil { - return nil, fmt.Errorf("decodeCustomCoder bad type: %v", err) + return nil, fmt.Errorf("failed to decode custom coder proto %v for type %v: %v", c, c.Type, err) Review comment: Missed removing the "proto " bits throughout this file. I think it's because it's in the second file, understandable oversight. That's why we review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216457) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216446&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216446 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 20/Mar/19 20:20 Start Date: 20/Mar/19 20:20 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475011873 Run Gradle Publish This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216446) Time Spent: 8h 20m (was: 8h 10m) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 8h 20m > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216443 ] ASF GitHub Bot logged work on BEAM-6812: Author: ASF GitHub Bot Created on: 20/Mar/19 20:11 Start Date: 20/Mar/19 20:11 Worklog Time Spent: 10m Work Description: kyle-winkelman commented on pull request #8042: [BEAM-6812]: Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent URL: https://github.com/apache/beam/pull/8042#discussion_r267525204 ## File path: runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java ## @@ -569,8 +569,8 @@ private static Partitioner getPartitioner(EvaluationContext context) { Long bundleSize = context.getSerializableOptions().get().as(SparkPipelineOptions.class).getBundleSize(); return (bundleSize > 0) -? null -: new HashPartitioner(context.getSparkContext().defaultParallelism()); +? new HashPartitioner(context.getSparkContext().defaultParallelism()) +: null; Review comment: I agree that the old functionality seems strange, but I remember (when I had the logic backwards) that the performance tests for the spark runner were impacted. I think the impact was in streaming mode because if you don't use this HashPartitioner then it actually does a double shuffle of the data. I tried to clean this up but my I never finished getting PR #6511 merged. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216443) Time Spent: 2h 40m (was: 2.5h) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h 40m > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6527) Parallel tox (unit) tests run on Jenkins
[ https://issues.apache.org/jira/browse/BEAM-6527?focusedWorklogId=216442&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216442 ] ASF GitHub Bot logged work on BEAM-6527: Author: ASF GitHub Bot Created on: 20/Mar/19 20:09 Start Date: 20/Mar/19 20:09 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #8067: [BEAM-6527] Use Gradle to parallel Python tox tests URL: https://github.com/apache/beam/pull/8067#issuecomment-475007822 Run Seed Job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216442) Time Spent: 8h 10m (was: 8h) > Parallel tox (unit) tests run on Jenkins > > > Key: BEAM-6527 > URL: https://issues.apache.org/jira/browse/BEAM-6527 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Time Spent: 8h 10m > Remaining Estimate: 0h > > Existing tox unit test suite (basic, gcp and cython) will be enabled in > multiple version of Python 3, which will significantly increase runtime of > Pre/PostCommit build. A parallel is wanted in tox or Gradle invocation to > control the time in a reasonable range (<30mins for PreCommit is desired). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml
[ https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216441 ] ASF GitHub Bot logged work on BEAM-6446: Author: ASF GitHub Bot Created on: 20/Mar/19 19:58 Start Date: 20/Mar/19 19:58 Worklog Time Spent: 10m Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up checkstyle suppressions. URL: https://github.com/apache/beam/pull/8091#issuecomment-474613132 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216441) Time Spent: 50m (was: 40m) > Clean up suppression rules in checkstyle suppressions.xml > - > > Key: BEAM-6446 > URL: https://issues.apache.org/jira/browse/BEAM-6446 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Labels: triaged > Time Spent: 50m > Remaining Estimate: 0h > > When violations are addressed, clean up suppression rules in checkstyle > suppressions.xml -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml
[ https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216440 ] ASF GitHub Bot logged work on BEAM-6446: Author: ASF GitHub Bot Created on: 20/Mar/19 19:58 Start Date: 20/Mar/19 19:58 Worklog Time Spent: 10m Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up checkstyle suppressions. URL: https://github.com/apache/beam/pull/8091#issuecomment-474932260 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216440) Time Spent: 40m (was: 0.5h) > Clean up suppression rules in checkstyle suppressions.xml > - > > Key: BEAM-6446 > URL: https://issues.apache.org/jira/browse/BEAM-6446 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Labels: triaged > Time Spent: 40m > Remaining Estimate: 0h > > When violations are addressed, clean up suppression rules in checkstyle > suppressions.xml -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216436&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216436 ] ASF GitHub Bot logged work on BEAM-6812: Author: ASF GitHub Bot Created on: 20/Mar/19 19:53 Start Date: 20/Mar/19 19:53 Worklog Time Spent: 10m Work Description: dmvk commented on issue #8042: [BEAM-6812]: Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent URL: https://github.com/apache/beam/pull/8042#issuecomment-475002477 Run Spark ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216436) Time Spent: 2.5h (was: 2h 20m) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2.5h > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216433 ] ASF GitHub Bot logged work on BEAM-6812: Author: ASF GitHub Bot Created on: 20/Mar/19 19:49 Start Date: 20/Mar/19 19:49 Worklog Time Spent: 10m Work Description: jhalaria commented on pull request #8042: [BEAM-6812]: Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent URL: https://github.com/apache/beam/pull/8042#discussion_r267517341 ## File path: runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java ## @@ -569,8 +569,8 @@ private static Partitioner getPartitioner(EvaluationContext context) { Long bundleSize = context.getSerializableOptions().get().as(SparkPipelineOptions.class).getBundleSize(); return (bundleSize > 0) -? null -: new HashPartitioner(context.getSparkContext().defaultParallelism()); +? new HashPartitioner(context.getSparkContext().defaultParallelism()) +: null; Review comment: I see. I reverted back the changes made to `getPartition` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216433) Time Spent: 2h 20m (was: 2h 10m) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h 20m > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216429 ] ASF GitHub Bot logged work on BEAM-6812: Author: ASF GitHub Bot Created on: 20/Mar/19 19:45 Start Date: 20/Mar/19 19:45 Worklog Time Spent: 10m Work Description: dmvk commented on pull request #8042: [BEAM-6812]: Convert keys to ByteArray in Combine.perKey to make sure hashCode is consistent URL: https://github.com/apache/beam/pull/8042#discussion_r267514847 ## File path: runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java ## @@ -569,8 +569,8 @@ private static Partitioner getPartitioner(EvaluationContext context) { Long bundleSize = context.getSerializableOptions().get().as(SparkPipelineOptions.class).getBundleSize(); return (bundleSize > 0) -? null -: new HashPartitioner(context.getSparkContext().defaultParallelism()); +? new HashPartitioner(context.getSparkContext().defaultParallelism()) +: null; Review comment: Second thoughts, it was correct. https://github.com/apache/beam/pull/6884/files#r246077919 This was in order to maintain old functionality (bundleSize == 0, which basically means to use predefined parallelism). I think the old functionality doesn't make much sense as it doesn't scale with input data. I guess someone may use this in order to re-scale "downstream" stage, but there should be a better mechanism to do this. Any thoughts? @timrobertson100 @kyle-winkelman This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216429) Time Spent: 2h 10m (was: 2h) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h 10m > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6862) [beam_PostCommit_Py_ValCont] Test fail with import error
[ https://issues.apache.org/jira/browse/BEAM-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797461#comment-16797461 ] Boyuan Zhang commented on BEAM-6862: Error: File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_exercise_metrics_pipeline.py", line 24, in from hamcrest.library.number.ordering_comparison import greater_than ImportError: No module named hamcrest.library.number.ordering_comparison Seems like import hamcrest.library.number.ordering_comparison failed. Maybe it's because circular dependency. > [beam_PostCommit_Py_ValCont] Test fail with import error > > > Key: BEAM-6862 > URL: https://issues.apache.org/jira/browse/BEAM-6862 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Ahmet Altay >Assignee: Alex Amato >Priority: Major > Labels: currently-failing > > PR likely root cause: https://github.com/apache/beam/pull/8038 > Started at: https://builds.apache.org/job/beam_PostCommit_Py_ValCont/2672/ > 14:30:12 ImportError: No module named > hamcrest.library.number.ordering_comparison > 14:30:12 > 14:30:12 at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > 14:30:12 at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > 14:30:12 at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:281) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:411) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:380) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:306) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:195) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:123) > 14:30:12 Suppressed: java.lang.IllegalStateException: Already closed. > 14:30:12 at > org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:95) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:215) > 14:30:12 at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:91) > 14:30:12 ... 6 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Jhalaria updated BEAM-6812: - Description: * During calls to Combine.perKey, we want they keys used to have consistent hashCode when invoked from different JVM's. * However, while testing this in our company we found out that when using protobuf as keys during combine, the hashCodes can be different for the same key when invoked from different JVMs. This results in duplicates. * `ByteArray` class in Spark has a stable has code when dealing with arrays as well. * GroupByKey correctly converts keys to `ByteArray` and uses coders for serialization. * The fix does something similar when dealing with combines. > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h > Remaining Estimate: 0h > > * During calls to Combine.perKey, we want they keys used to have consistent > hashCode when invoked from different JVM's. > * However, while testing this in our company we found out that when using > protobuf as keys during combine, the hashCodes can be different for the same > key when invoked from different JVMs. This results in duplicates. > * `ByteArray` class in Spark has a stable has code when dealing with arrays > as well. > * GroupByKey correctly converts keys to `ByteArray` and uses coders for > serialization. > * The fix does something similar when dealing with combines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Jhalaria updated BEAM-6812: - Comment: was deleted (was: * When running a Combine.perKey transform on Spark, we noticed duplicate results in the output for the same key. * On further investigation, it turns out that `StreamingTransformTranslator` uses `Spark's` HashPartitioner which does not work correctly when attempting to partition on a key which is an array. * Note from the code: * /** * A [[org.apache.spark.Partitioner]] that implements hash-based partitioning using * Java's `Object.hashCode`. * * Java arrays have hashCodes that are based on the arrays' identities rather than their contents, * so attempting to partition an RDD[Array[_]] or RDD[(Array[_], _)] using a HashPartitioner will * produce an unexpected or incorrect result. */) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6812) Convert keys to ByteArray in Combine.perKey for Spark
[ https://issues.apache.org/jira/browse/BEAM-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankit Jhalaria updated BEAM-6812: - Summary: Convert keys to ByteArray in Combine.perKey for Spark (was: HashPartitioning on Spark with arrays as keys produces unpredictable results) > Convert keys to ByteArray in Combine.perKey for Spark > - > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6812) HashPartitioning on Spark with arrays as keys produces unpredictable results
[ https://issues.apache.org/jira/browse/BEAM-6812?focusedWorklogId=216409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216409 ] ASF GitHub Bot logged work on BEAM-6812: Author: ASF GitHub Bot Created on: 20/Mar/19 19:02 Start Date: 20/Mar/19 19:02 Worklog Time Spent: 10m Work Description: jhalaria commented on issue #8042: [BEAM-6812]: Create a custom hash partitioner that deals with arrays during spark combines URL: https://github.com/apache/beam/pull/8042#issuecomment-474985096 @dmvk - Thank you for looking at the PR. I removed the custom partitioner logic for now. I made the CPK similar to GBK where we convert the key to a `ByteArray`. Will create another PR that provides the ability to add a custom partitioner. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216409) Time Spent: 2h (was: 1h 50m) > HashPartitioning on Spark with arrays as keys produces unpredictable results > > > Key: BEAM-6812 > URL: https://issues.apache.org/jira/browse/BEAM-6812 > Project: Beam > Issue Type: Bug > Components: runner-spark >Reporter: Ankit Jhalaria >Assignee: Ankit Jhalaria >Priority: Critical > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216393 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:42 Start Date: 20/Mar/19 18:42 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267492152 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -217,7 +217,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) (*coder.Coder, error) { // the portable pipeline model directly (BEAM-2885) if elm.GetSpec().GetSpec().GetUrn() != "" && elm.GetSpec().GetSpec().GetUrn() != urnCustomCoder { // TODO(herohde) 11/17/2017: revisit this restriction - return nil, fmt.Errorf("expected length prefix of custom coder only: %v", elm) + return nil, fmt.Errorf("could not make coder from proto %v, expected length prefix of custom coder only: %v", *c, elm) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216393) Time Spent: 2h 10m (was: 2h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216406&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216406 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:48 Start Date: 20/Mar/19 18:48 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267494692 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -157,7 +157,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) (*coder.Coder, error) { case urnKVCoder: if len(components) != 2 { - return nil, fmt.Errorf("bad pair: %v", c) + return nil, fmt.Errorf("could not make coder from proto %v, incorrect number of components for pair", *c) Review comment: That's really useful to know actually. I checked the existing String functions and removed all the dereferences I could (which was all of them). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216406) Time Spent: 3h 10m (was: 3h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h 10m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216405 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:47 Start Date: 20/Mar/19 18:47 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267494071 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -122,15 +122,15 @@ func DecodeMultiEdge(edge *v1.MultiEdge) (graph.Opcode, *graph.Fn, *window.Fn, [ func encodeCustomCoder(c *coder.CustomCoder) (*v1.CustomCoder, error) { t, err := encodeType(c.Type) if err != nil { - return nil, fmt.Errorf("bad underlying type: %v", err) + return nil, fmt.Errorf("failed to encode custom coder %v, bad underlying type: %v", *c, err) Review comment: Yeah, it's a bit tricky. Your suggestion is better than anything I thought of, so I went with it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216405) Time Spent: 3h (was: 2h 50m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 3h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216402 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:46 Start Date: 20/Mar/19 18:46 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267493684 ## File path: sdks/go/pkg/beam/core/runtime/graphx/serialize.go ## @@ -43,14 +43,14 @@ func EncodeMultiEdge(edge *graph.MultiEdge) (*v1.MultiEdge, error) { if edge.DoFn != nil { ref, err := encodeFn((*graph.Fn)(edge.DoFn)) if err != nil { - return nil, fmt.Errorf("encode: bad userfn: %v", err) + return nil, fmt.Errorf("failed to encode multi edge %v, bad userfn: %v", *edge, err) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216402) Time Spent: 2h 50m (was: 2h 40m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 2h 50m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216401 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:44 Start Date: 20/Mar/19 18:44 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267492965 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -366,7 +370,7 @@ func (b *CoderMarshaller) Add(c *coder.Coder) string { return b.internBuiltInCoder(urnVarIntCoder) default: - panic(fmt.Sprintf("Unexpected coder kind: %v", c.Kind)) + panic(fmt.Sprintf("failed to marshal custom coder %v, unexpected coder kind: %v", *c, c.Kind)) Review comment: Capitalized them all for consistency. I was aware of errors being lowercased, but wasn't sure about panic messages. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216401) Time Spent: 2h 40m (was: 2.5h) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 2h 40m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216396 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:43 Start Date: 20/Mar/19 18:43 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267492258 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -248,7 +248,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) (*coder.Coder, error) { return &coder.Coder{Kind: coder.WindowedValue, T: t, Components: []*coder.Coder{elm}, Window: w}, nil case streamType: - return nil, fmt.Errorf("stream must be pair value: %v", c) + return nil, fmt.Errorf("could not make coder from proto %v, stream must be pair value", *c) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216396) Time Spent: 2.5h (was: 2h 20m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 2.5h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216391 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:42 Start Date: 20/Mar/19 18:42 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267492046 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -157,7 +157,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) (*coder.Coder, error) { case urnKVCoder: if len(components) != 2 { - return nil, fmt.Errorf("bad pair: %v", c) + return nil, fmt.Errorf("could not make coder from proto %v, incorrect number of components for pair", *c) Review comment: Useful advice, modified all the error msgs in this function to match. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216391) Time Spent: 1h 50m (was: 1h 40m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216392 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:42 Start Date: 20/Mar/19 18:42 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267492086 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -206,7 +206,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) (*coder.Coder, error) { case urnLengthPrefixCoder: if len(components) != 1 { - return nil, fmt.Errorf("bad length prefix: %v", c) + return nil, fmt.Errorf("could not make coder from proto %v, bad length prefix", *c) Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216392) Time Spent: 2h (was: 1h 50m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6838) Improve error messages for unregistered protos
[ https://issues.apache.org/jira/browse/BEAM-6838?focusedWorklogId=216394&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216394 ] ASF GitHub Bot logged work on BEAM-6838: Author: ASF GitHub Bot Created on: 20/Mar/19 18:42 Start Date: 20/Mar/19 18:42 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #8064: [BEAM-6838] Go SDK: Improving error msgs in pipeline serialization. URL: https://github.com/apache/beam/pull/8064#discussion_r267492213 ## File path: sdks/go/pkg/beam/core/runtime/graphx/coder.go ## @@ -233,7 +233,7 @@ func (b *CoderUnmarshaller) makeCoder(c *pb.Coder) (*coder.Coder, error) { case urnWindowedValueCoder: if len(components) != 2 { - return nil, fmt.Errorf("bad windowed value: %v", c) + return nil, fmt.Errorf("could not make coder from proto %v, bad windowed value", *c) Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216394) Time Spent: 2h 20m (was: 2h 10m) > Improve error messages for unregistered protos > -- > > Key: BEAM-6838 > URL: https://issues.apache.org/jira/browse/BEAM-6838 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Time Spent: 2h 20m > Remaining Estimate: 0h > > When users have a type that can't be serialized (it has a function Field or > similar) or more complex protocol buffers, the error should be improved to > display the type being registered, and better yet, the beam.RegisterType(...) > code they can use to enable use of their type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6778) Enable Bundle Finalization in Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6778?focusedWorklogId=216379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216379 ] ASF GitHub Bot logged work on BEAM-6778: Author: ASF GitHub Bot Created on: 20/Mar/19 18:15 Start Date: 20/Mar/19 18:15 Worklog Time Spent: 10m Work Description: boyuanzz commented on issue #7937: [BEAM-6778] Enable Bundle Finalization in Python SDK harness over FnApi URL: https://github.com/apache/beam/pull/7937#issuecomment-474967111 All tests have passed. Thanks for your time and help! @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216379) Time Spent: 9h 20m (was: 9h 10m) > Enable Bundle Finalization in Python SDK > > > Key: BEAM-6778 > URL: https://issues.apache.org/jira/browse/BEAM-6778 > Project: Beam > Issue Type: New Feature > Components: sdk-py-harness >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216378 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 18:14 Start Date: 20/Mar/19 18:14 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267479836 ## File path: sdks/python/apache_beam/examples/complete/game/leader_board_it_test.py ## @@ -53,7 +53,7 @@ class LeaderBoardIT(unittest.TestCase): # Input event containing user, team, score, processing time, window start. - INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224' + INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224' Review comment: So the question is then, to the humans who read and edit this code, should INPUT_EVENT be a textual date or encoded data? I think it's easier to consider it text, up until it's time to feed it to pubsub, then when we can encode it to bytes. I'd keep INPUT_EVENT as is change publishing to: ``` event = self.INPUT_EVENT % self._test_timestamp self.pub_client.publish(event.encode('utf-8')) ``` I would expect users to follow a similar pattern in their pipelines, and they might refer to beam examples for guidance, so I suggest to change this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216378) Time Spent: 20h 50m (was: 20h 40m) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 20h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6772) Select transform has non-intuitive semantics
[ https://issues.apache.org/jira/browse/BEAM-6772?focusedWorklogId=216376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216376 ] ASF GitHub Bot logged work on BEAM-6772: Author: ASF GitHub Bot Created on: 20/Mar/19 18:13 Start Date: 20/Mar/19 18:13 Worklog Time Spent: 10m Work Description: kanterov commented on issue #8006: [BEAM-6772] Change Select semantics to match what a user expects URL: https://github.com/apache/beam/pull/8006#issuecomment-474966459 Sounds good. I will take a look tomorrow This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216376) Time Spent: 5.5h (was: 5h 20m) > Select transform has non-intuitive semantics > > > Key: BEAM-6772 > URL: https://issues.apache.org/jira/browse/BEAM-6772 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 5.5h > Remaining Estimate: 0h > > Consider the following schema: > User: > name: STRING > location: Location > > Location: > latitude: DOUBLE > longitude: DOUBLE > > If you apply Select.fieldNames("location"), most users expect to get back a > row matching the Location schema. Instead you get back an outer schema with a > single location field in it. Select should instead unnest the output up to > the point where multiple fields are selected. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6861) Select fields and computed fields in CassandraIO.Read
[ https://issues.apache.org/jira/browse/BEAM-6861?focusedWorklogId=216369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216369 ] ASF GitHub Bot logged work on BEAM-6861: Author: ASF GitHub Bot Created on: 20/Mar/19 18:04 Start Date: 20/Mar/19 18:04 Worklog Time Spent: 10m Work Description: stankiewicz commented on issue #8090: [BEAM-6861] Select fields and computed fields in CassandraIO.Read URL: https://github.com/apache/beam/pull/8090#issuecomment-474962082 Hi @srfrnk! I don't like the idea of forking driver as we will end up maintaining it. I haven't thought about it but I like the idea of programmatic generation of query instead of doing String joins and formats. Challenge here is what we want to expose instead of withWhere() or withQuery() - withQueryBuilder()? If yes it would be nice alternative because we can then just add key ranges dynamically. But then the challenge is how to support value provider for it. If we figure out serialisation then we can clean this up a little bit. In the meantime I would try to merge this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216369) Time Spent: 4h (was: 3h 50m) > Select fields and computed fields in CassandraIO.Read > -- > > Key: BEAM-6861 > URL: https://issues.apache.org/jira/browse/BEAM-6861 > Project: Beam > Issue Type: Improvement > Components: io-java-cassandra >Reporter: Radosław Stankiewicz >Assignee: Radosław Stankiewicz >Priority: Minor > Labels: features > Time Spent: 4h > Remaining Estimate: 0h > > CassandraIO.Read currently selects all fields and maps them to POJO. > To make this component more flexible, it should be possible to select only > subset of fields or computed fields to allow reading things like write > Timestamp or using other functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216367 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 17:59 Start Date: 20/Mar/19 17:59 Worklog Time Spent: 10m Work Description: Juta commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267473739 ## File path: sdks/python/apache_beam/examples/complete/game/leader_board_it_test.py ## @@ -53,7 +53,7 @@ class LeaderBoardIT(unittest.TestCase): # Input event containing user, team, score, processing time, window start. - INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224' + INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224' Review comment: yes `self.pub_client.publish` requires bytes. In this case I think specifying the input event as bytes is what is expected because it is directly passed to the client. What do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216367) Time Spent: 20h 40m (was: 20.5h) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 20h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216366 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 17:59 Start Date: 20/Mar/19 17:59 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267473468 ## File path: sdks/python/apache_beam/examples/complete/game/leader_board.py ## @@ -120,6 +120,8 @@ def __init__(self): def process(self, elem): try: + if isinstance(elem, bytes): Review comment: I am not sure I follow - could you please elaborate - what exactly in either pubsub or unit test makes it so that the input type is not consistent here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216366) Time Spent: 20.5h (was: 20h 20m) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 20.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216363 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 17:54 Start Date: 20/Mar/19 17:54 Worklog Time Spent: 10m Work Description: Juta commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267471449 ## File path: sdks/python/apache_beam/examples/complete/game/leader_board.py ## @@ -120,6 +120,8 @@ def __init__(self): def process(self, elem): try: + if isinstance(elem, bytes): Review comment: The IT test uses pubsub as input source while the unit test currently passes strings. That's why we cannot always decode This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216363) Time Spent: 20h 20m (was: 20h 10m) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 20h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6868) Flink runner supports Bundle Finalization
[ https://issues.apache.org/jira/browse/BEAM-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797386#comment-16797386 ] Maximilian Michels commented on BEAM-6868: -- Thanks for clarifying. Indeed, the Flink Runner does not send {{FinalizeBundleRequest}}. I thought you were referring to the bundle processing of the legacy Runner. > Flink runner supports Bundle Finalization > - > > Key: BEAM-6868 > URL: https://issues.apache.org/jira/browse/BEAM-6868 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Boyuan Zhang >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6870) python 3 test_hourly_team_score_it fails with bigquery job id already exists
[ https://issues.apache.org/jira/browse/BEAM-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797387#comment-16797387 ] Juta Staes commented on BEAM-6870: -- The test also fails in isolation because the retry is triggered somehow. I did not yet look into why this happens. > python 3 test_hourly_team_score_it fails with bigquery job id already exists > > > Key: BEAM-6870 > URL: https://issues.apache.org/jira/browse/BEAM-6870 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Juta Staes >Priority: Major > > This IT test on python 3 fails with "Already exists job ..." > {code:java} > 14:42:03 > == > 14:42:03 ERROR: test_hourly_team_score_it > (apache_beam.examples.complete.game.hourly_team_score_it_test.HourlyTeamScoreIT) > 14:42:03 > -- > 14:42:03 Traceback (most recent call last): > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score_it_test.py", > line 89, in test_hourly_team_score_it > 14:42:03 self.test_pipeline.get_full_options_as_args(**extra_opts)) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score.py", > line 295, in run > 14:42:03 }, options.view_as(GoogleCloudOptions).project)) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 426, in __exit__ > 14:42:03 self.run().wait_until_finish() > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 14:42:03 self._options).run(False) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 14:42:03 return self.runner.run_pipeline(self, self._options) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", > line 64, in run_pipeline > 14:42:03 self.result.wait_until_finish(duration=wait_duration) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", > line 1217, in wait_until_finish > 14:42:03 (self.state, getattr(self._runner, 'last_error_msg', None)), > self) > 14:42:03 > apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: > Dataflow pipeline failed. State: FAILED, Error: > 14:42:03 Traceback (most recent call last): > 14:42:03 File "apache_beam/runners/common.py", line 732, in > apache_beam.runners.common.DoFnRunner.process > 14:42:03 File "apache_beam/runners/common.py", line 555, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > 14:42:03 File "apache_beam/runners/common.py", line 624, in > apache_beam.runners.common.PerWindowInvoker._invoke_per_window > 14:42:03 File "apache_beam/runners/common.py", line 816, in > apache_beam.runners.common._OutputProcessor.process_outputs > 14:42:03 File "apache_beam/runners/common.py", line 831, in > apache_beam.runners.common._OutputProcessor.process_outputs > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_file_loads.py", > line 385, in process > 14:42:03 create_disposition=self.create_disposition) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 560, in perform_load_job > 14:42:03 write_disposition=write_disposition) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/utils/retry.py", line > 195, in wrapper > 14:42:03 return fun(*args, **kwargs) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 327, in _insert_load_job > 14:42:03 response = self.client.jobs.Insert(request) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", > line 342, in Insert > 14:42:03 upload=upload, upload_config=upload_config) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line > 731, in _RunMethod > 14:42:03 return self.ProcessHttpResponse(method_config, http_response, > request) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line > 737, in ProcessHttpResponse > 1
[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797385#comment-16797385 ] Pablo Estrada commented on BEAM-6711: - Yes. I'll work on this. > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.12.0 > > Time Spent: 6h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6865) Refactor common portable runner infrastructure for better reuse
[ https://issues.apache.org/jira/browse/BEAM-6865?focusedWorklogId=216341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216341 ] ASF GitHub Bot logged work on BEAM-6865: Author: ASF GitHub Bot Created on: 20/Mar/19 17:22 Start Date: 20/Mar/19 17:22 Worklog Time Spent: 10m Work Description: ibzib commented on issue #8085: [BEAM-6865] share non-Flink-specific pipeline helper utils URL: https://github.com/apache/beam/pull/8085#issuecomment-474172661 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216341) Time Spent: 1h (was: 50m) > Refactor common portable runner infrastructure for better reuse > --- > > Key: BEAM-6865 > URL: https://issues.apache.org/jira/browse/BEAM-6865 > Project: Beam > Issue Type: Improvement > Components: runner-flink, runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > The Flink runner is currently Beam's most mature portable OSS runner. Much of > the Flink portable runner's implementation details are not unique to Flink, > and yet are confined to the Flink runner code. In order to ease development > on other portable runners such as the Spark runner, this reusable code should > be moved into some common location. > I've set this up to track my progress on these ongoing improvements. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6711) Bigquery Tornadoes IT is broken in Python3 PostCommit test suite.
[ https://issues.apache.org/jira/browse/BEAM-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797376#comment-16797376 ] Valentyn Tymofieiev commented on BEAM-6711: --- Thanks Juta. Pablo, will you have cycles to look into this, since this is likely related to your recent changes? Also, I think these changes happened after Beam 2.11, and I'd like us to fix them before Beam 2.12, which is to be cut a week from now. > Bigquery Tornadoes IT is broken in Python3 PostCommit test suite. > -- > > Key: BEAM-6711 > URL: https://issues.apache.org/jira/browse/BEAM-6711 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.12.0 > > Time Spent: 6h 50m > Remaining Estimate: 0h > > First failure was observed in > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/54 , after > https://github.com/apache/beam/commit/cdea885872b3be7de9ba22f22700be89f7d53766 > was merged. > [~pabloem], could you please take a look? I suggest we do a rollback + > rollforward with a fix. > {noformat} > root: ERROR: Exception at bundle > , > due to an exception. > Traceback (most recent call last): > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 727, in process > return self.do_fn_invoker.invoke_process(windowed_value) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 556, in invoke_process > windowed_value, additional_args, additional_kwargs, output_processor) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 622, in _invoke_per_window > self.process_method(*args_for_process, **kwargs_for_process)) > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/common.py", > line 823, in process_outputs > for result in results: > File "/home/jenkins/jenkins-slave/works > pace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py", > line 191, in process > if destination in self._destination_to_file_writer: > TypeError: unhashable type: 'TableReference' > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6870) python 3 test_hourly_team_score_it fails with bigquery job id already exists
[ https://issues.apache.org/jira/browse/BEAM-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev updated BEAM-6870: -- Summary: python 3 test_hourly_team_score_it fails with bigquery job id already exists (was: python 3 test_hourly_team_score_it fails with dataflow job id already exists) > python 3 test_hourly_team_score_it fails with bigquery job id already exists > > > Key: BEAM-6870 > URL: https://issues.apache.org/jira/browse/BEAM-6870 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Juta Staes >Priority: Major > > This IT test on python 3 fails with "Already exists job ..." > {code:java} > 14:42:03 > == > 14:42:03 ERROR: test_hourly_team_score_it > (apache_beam.examples.complete.game.hourly_team_score_it_test.HourlyTeamScoreIT) > 14:42:03 > -- > 14:42:03 Traceback (most recent call last): > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score_it_test.py", > line 89, in test_hourly_team_score_it > 14:42:03 self.test_pipeline.get_full_options_as_args(**extra_opts)) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score.py", > line 295, in run > 14:42:03 }, options.view_as(GoogleCloudOptions).project)) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 426, in __exit__ > 14:42:03 self.run().wait_until_finish() > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 14:42:03 self._options).run(False) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 14:42:03 return self.runner.run_pipeline(self, self._options) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", > line 64, in run_pipeline > 14:42:03 self.result.wait_until_finish(duration=wait_duration) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", > line 1217, in wait_until_finish > 14:42:03 (self.state, getattr(self._runner, 'last_error_msg', None)), > self) > 14:42:03 > apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: > Dataflow pipeline failed. State: FAILED, Error: > 14:42:03 Traceback (most recent call last): > 14:42:03 File "apache_beam/runners/common.py", line 732, in > apache_beam.runners.common.DoFnRunner.process > 14:42:03 File "apache_beam/runners/common.py", line 555, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > 14:42:03 File "apache_beam/runners/common.py", line 624, in > apache_beam.runners.common.PerWindowInvoker._invoke_per_window > 14:42:03 File "apache_beam/runners/common.py", line 816, in > apache_beam.runners.common._OutputProcessor.process_outputs > 14:42:03 File "apache_beam/runners/common.py", line 831, in > apache_beam.runners.common._OutputProcessor.process_outputs > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_file_loads.py", > line 385, in process > 14:42:03 create_disposition=self.create_disposition) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 560, in perform_load_job > 14:42:03 write_disposition=write_disposition) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/utils/retry.py", line > 195, in wrapper > 14:42:03 return fun(*args, **kwargs) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 327, in _insert_load_job > 14:42:03 response = self.client.jobs.Insert(request) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", > line 342, in Insert > 14:42:03 upload=upload, upload_config=upload_config) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line > 731, in _RunMethod > 14:42:03 return self.ProcessHttpResponse(method_config, http_response, > request) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apitools/base/py/base_api.py", line > 737, in Pr
[jira] [Commented] (BEAM-6870) python 3 test_hourly_team_score_it fails with dataflow job id already exists
[ https://issues.apache.org/jira/browse/BEAM-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797372#comment-16797372 ] Valentyn Tymofieiev commented on BEAM-6870: --- Interesting, looks like the error is coming from BQ, and is referring to some BQ export job. We need to understand whether the test fails because the same test running in parallel in a different suite, and there is some naming conflict in BQ. Does this test pass when we run it in isolation? If the test passes in isolation, we need to understand how BQ job name is generated. > python 3 test_hourly_team_score_it fails with dataflow job id already exists > > > Key: BEAM-6870 > URL: https://issues.apache.org/jira/browse/BEAM-6870 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Juta Staes >Priority: Major > > This IT test on python 3 fails with "Already exists job ..." > {code:java} > 14:42:03 > == > 14:42:03 ERROR: test_hourly_team_score_it > (apache_beam.examples.complete.game.hourly_team_score_it_test.HourlyTeamScoreIT) > 14:42:03 > -- > 14:42:03 Traceback (most recent call last): > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score_it_test.py", > line 89, in test_hourly_team_score_it > 14:42:03 self.test_pipeline.get_full_options_as_args(**extra_opts)) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/examples/complete/game/hourly_team_score.py", > line 295, in run > 14:42:03 }, options.view_as(GoogleCloudOptions).project)) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 426, in __exit__ > 14:42:03 self.run().wait_until_finish() > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 14:42:03 self._options).run(False) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 14:42:03 return self.runner.run_pipeline(self, self._options) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py", > line 64, in run_pipeline > 14:42:03 self.result.wait_until_finish(duration=wait_duration) > 14:42:03 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify_PR/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", > line 1217, in wait_until_finish > 14:42:03 (self.state, getattr(self._runner, 'last_error_msg', None)), > self) > 14:42:03 > apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: > Dataflow pipeline failed. State: FAILED, Error: > 14:42:03 Traceback (most recent call last): > 14:42:03 File "apache_beam/runners/common.py", line 732, in > apache_beam.runners.common.DoFnRunner.process > 14:42:03 File "apache_beam/runners/common.py", line 555, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > 14:42:03 File "apache_beam/runners/common.py", line 624, in > apache_beam.runners.common.PerWindowInvoker._invoke_per_window > 14:42:03 File "apache_beam/runners/common.py", line 816, in > apache_beam.runners.common._OutputProcessor.process_outputs > 14:42:03 File "apache_beam/runners/common.py", line 831, in > apache_beam.runners.common._OutputProcessor.process_outputs > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_file_loads.py", > line 385, in process > 14:42:03 create_disposition=self.create_disposition) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 560, in perform_load_job > 14:42:03 write_disposition=write_disposition) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/utils/retry.py", line > 195, in wrapper > 14:42:03 return fun(*args, **kwargs) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 327, in _insert_load_job > 14:42:03 response = self.client.jobs.Insert(request) > 14:42:03 File > "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", > line 342, in Insert > 14:42:03 upload=upload, upload_config=upload_config) > 14:42:03 File > "/usr/local/lib/python3.5/site-pa
[jira] [Work logged] (BEAM-6861) Select fields and computed fields in CassandraIO.Read
[ https://issues.apache.org/jira/browse/BEAM-6861?focusedWorklogId=216342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216342 ] ASF GitHub Bot logged work on BEAM-6861: Author: ASF GitHub Bot Created on: 20/Mar/19 17:23 Start Date: 20/Mar/19 17:23 Worklog Time Spent: 10m Work Description: srfrnk commented on issue #8090: [BEAM-6861] Select fields and computed fields in CassandraIO.Read URL: https://github.com/apache/beam/pull/8090#issuecomment-474942509 Sorry for barging in on this. If there is a need for a more robust query building mechanism we might consider using [Datastux QueryBuilder](https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/querybuilder/QueryBuilder.html). I tryed to go for that approach when I added `withWhere` but since their classes are not `Serializable` I could not. I attempted to fix this problem [here](https://github.com/datastax/java-driver/pull/1163) but it was rejected. Maybe with enough influence we can manage to make the change? If not maybe this repo can be forked into Beam and so a working QueryBuilder can be used without having to rebuild it? WDYT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216342) Time Spent: 3h 50m (was: 3h 40m) > Select fields and computed fields in CassandraIO.Read > -- > > Key: BEAM-6861 > URL: https://issues.apache.org/jira/browse/BEAM-6861 > Project: Beam > Issue Type: Improvement > Components: io-java-cassandra >Reporter: Radosław Stankiewicz >Assignee: Radosław Stankiewicz >Priority: Minor > Labels: features > Time Spent: 3h 50m > Remaining Estimate: 0h > > CassandraIO.Read currently selects all fields and maps them to POJO. > To make this component more flexible, it should be possible to select only > subset of fields or computed fields to allow reading things like write > Timestamp or using other functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6865) Refactor common portable runner infrastructure for better reuse
[ https://issues.apache.org/jira/browse/BEAM-6865?focusedWorklogId=216340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216340 ] ASF GitHub Bot logged work on BEAM-6865: Author: ASF GitHub Bot Created on: 20/Mar/19 17:22 Start Date: 20/Mar/19 17:22 Worklog Time Spent: 10m Work Description: ibzib commented on issue #8085: [BEAM-6865] share non-Flink-specific pipeline helper utils URL: https://github.com/apache/beam/pull/8085#issuecomment-474496559 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216340) Time Spent: 50m (was: 40m) > Refactor common portable runner infrastructure for better reuse > --- > > Key: BEAM-6865 > URL: https://issues.apache.org/jira/browse/BEAM-6865 > Project: Beam > Issue Type: Improvement > Components: runner-flink, runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > The Flink runner is currently Beam's most mature portable OSS runner. Much of > the Flink portable runner's implementation details are not unique to Flink, > and yet are confined to the Flink runner code. In order to ease development > on other portable runners such as the Spark runner, this reusable code should > be moved into some common location. > I've set this up to track my progress on these ongoing improvements. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6695) Latest transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797368#comment-16797368 ] Tanay Tummalapalli commented on BEAM-6695: -- Thank you [~altay]! I've been sick for the past week. So, progress on this has been slow. Will get back to you with any more questions I have. > Latest transform for Python SDK > --- > > Key: BEAM-6695 > URL: https://issues.apache.org/jira/browse/BEAM-6695 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Tanay Tummalapalli >Priority: Minor > > Add a PTransform} and Combine.CombineFn for computing the latest element in a > PCollection. > It should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-6871) apache_beam.io.parquetio_test.TestParquet is very flaky
Valentyn Tymofieiev created BEAM-6871: - Summary: apache_beam.io.parquetio_test.TestParquet is very flaky Key: BEAM-6871 URL: https://issues.apache.org/jira/browse/BEAM-6871 Project: Beam Issue Type: Bug Components: test-failures Reporter: Valentyn Tymofieiev Assignee: Heejong Lee Hi [~heejong], I think you added these tests - could you please take a look? Symptoms: {noformat} 07:47:19 == 07:47:19 ERROR: test_sink_transform_compressed_0 (apache_beam.io.parquetio_test.TestParquet) 07:47:19 -- 07:47:19 Traceback (most recent call last): 07:47:19 File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/io/parquetio_test.py", line 105, in tearDown 07:47:19 shutil.rmtree(self.temp_dir) 07:47:19 File "/usr/lib/python2.7/shutil.py", line 239, in rmtree 07:47:19 onerror(os.listdir, path, sys.exc_info()) 07:47:19 File "/usr/lib/python2.7/shutil.py", line 237, in rmtree 07:47:19 names = os.listdir(path) 07:47:19 OSError: [Errno 2] No such file or directory: '/tmp/tmpEXNCgA' {noformat} Examples: [https://builds.apache.org/job/beam_PreCommit_Python_Commit/5186/consoleFull] [https://builds.apache.org/job/beam_PreCommit_Python_Commit/5185/consoleFull] CC: [~Juta], [~boyuanz], [~udim] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6446) Clean up suppression rules in checkstyle suppressions.xml
[ https://issues.apache.org/jira/browse/BEAM-6446?focusedWorklogId=216326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216326 ] ASF GitHub Bot logged work on BEAM-6446: Author: ASF GitHub Bot Created on: 20/Mar/19 17:01 Start Date: 20/Mar/19 17:01 Worklog Time Spent: 10m Work Description: HuangLED commented on issue #8091: [BEAM-6446] Clean up checkstyle suppressions. URL: https://github.com/apache/beam/pull/8091#issuecomment-474932260 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216326) Time Spent: 0.5h (was: 20m) > Clean up suppression rules in checkstyle suppressions.xml > - > > Key: BEAM-6446 > URL: https://issues.apache.org/jira/browse/BEAM-6446 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Ruoyun Huang >Assignee: Ruoyun Huang >Priority: Minor > Labels: triaged > Time Spent: 0.5h > Remaining Estimate: 0h > > When violations are addressed, clean up suppression rules in checkstyle > suppressions.xml -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6824) Plumb FnApi ElementCount metrics in Dataflow Runner.
[ https://issues.apache.org/jira/browse/BEAM-6824?focusedWorklogId=216329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216329 ] ASF GitHub Bot logged work on BEAM-6824: Author: ASF GitHub Bot Created on: 20/Mar/19 17:07 Start Date: 20/Mar/19 17:07 Worklog Time Spent: 10m Work Description: Ardagan commented on issue #8095: [BEAM-6824] Add Element count transformer for Java worker FnApi URL: https://github.com/apache/beam/pull/8095#issuecomment-474934867 @pabloem This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216329) Time Spent: 1.5h (was: 1h 20m) > Plumb FnApi ElementCount metrics in Dataflow Runner. > > > Key: BEAM-6824 > URL: https://issues.apache.org/jira/browse/BEAM-6824 > Project: Beam > Issue Type: New Feature > Components: runner-dataflow >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > We want to provide FnApi metrics to Dataflow Users. > This bug covers ElementCount metric. > Current approach utilizes mapping of PCollectionID based on WorkItem and > ProcessBundle graphs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6824) Plumb FnApi ElementCount metrics in Dataflow Runner.
[ https://issues.apache.org/jira/browse/BEAM-6824?focusedWorklogId=216331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216331 ] ASF GitHub Bot logged work on BEAM-6824: Author: ASF GitHub Bot Created on: 20/Mar/19 17:07 Start Date: 20/Mar/19 17:07 Worklog Time Spent: 10m Work Description: Ardagan commented on issue #8095: [BEAM-6824] Add Element count transformer for Java worker FnApi URL: https://github.com/apache/beam/pull/8095#issuecomment-474934929 @lukecwik This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216331) Time Spent: 1h 40m (was: 1.5h) > Plumb FnApi ElementCount metrics in Dataflow Runner. > > > Key: BEAM-6824 > URL: https://issues.apache.org/jira/browse/BEAM-6824 > Project: Beam > Issue Type: New Feature > Components: runner-dataflow >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > We want to provide FnApi metrics to Dataflow Users. > This bug covers ElementCount metric. > Current approach utilizes mapping of PCollectionID based on WorkItem and > ProcessBundle graphs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6868) Flink runner supports Bundle Finalization
[ https://issues.apache.org/jira/browse/BEAM-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797356#comment-16797356 ] Boyuan Zhang commented on BEAM-6868: Here is the design doc: https://s.apache.org/beam-finalizing-bundles and Python SDK implementation: https://issues.apache.org/jira/browse/BEAM-6778. I checked the code base, flink runner doesn't send the [FinalizeBundleRequest](https://github.com/apache/beam/search?q=FinalizeBundleRequest&unscoped_q=FinalizeBundleRequest) to SDK. But it's not a required feature. > Flink runner supports Bundle Finalization > - > > Key: BEAM-6868 > URL: https://issues.apache.org/jira/browse/BEAM-6868 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Boyuan Zhang >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (BEAM-6324) CassandraIO.Read - Add the ability to provide a filter to the query
[ https://issues.apache.org/jira/browse/BEAM-6324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía reopened BEAM-6324: I am reopening this one following the API discussions on BEAM-6861 > CassandraIO.Read - Add the ability to provide a filter to the query > --- > > Key: BEAM-6324 > URL: https://issues.apache.org/jira/browse/BEAM-6324 > Project: Beam > Issue Type: Improvement > Components: io-java-cassandra >Affects Versions: 2.9.0 >Reporter: Shahar Frank >Assignee: Shahar Frank >Priority: Major > Labels: performance, pull-request-available, triaged > Fix For: 2.12.0 > > Time Spent: 11.5h > Remaining Estimate: 0h > > CassandraIO.Read doesn't support using WHERE to filter the input at the > source (In Cassandra) which might provide great performance boost. > Already implemented by: > https://github.com/apache/beam/pull/7340 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216310 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 16:34 Start Date: 20/Mar/19 16:34 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267431019 ## File path: sdks/python/apache_beam/examples/complete/game/leader_board_it_test.py ## @@ -53,7 +53,7 @@ class LeaderBoardIT(unittest.TestCase): # Input event containing user, team, score, processing time, window start. - INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224' + INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224' Review comment: I am not sure if this should be bytes. Does `self.pub_client.publish` require bytes? If so, we should encode before passing data to that method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216310) Time Spent: 20h (was: 19h 50m) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 20h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6869) Beam dependency check report failed to generate
[ https://issues.apache.org/jira/browse/BEAM-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797338#comment-16797338 ] yifan zou commented on BEAM-6869: - An unit test failed and interrupted the report. > Beam dependency check report failed to generate > --- > > Key: BEAM-6869 > URL: https://issues.apache.org/jira/browse/BEAM-6869 > Project: Beam > Issue Type: Bug > Components: dependencies, test-failures >Reporter: Lukasz Gajowy >Assignee: yifan zou >Priority: Major > > I think the tests in dependency_check_report_generator_test.py have failed > preventing the script to generate the report. > Link for the logs: > https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Dependency_Check/182/console -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519
[ https://issues.apache.org/jira/browse/BEAM-6854?focusedWorklogId=216297&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216297 ] ASF GitHub Bot logged work on BEAM-6854: Author: ASF GitHub Bot Created on: 20/Mar/19 16:13 Start Date: 20/Mar/19 16:13 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #8083: [BEAM-6854] Update amazon-web-services aws-sdk dependency to version 1.11.519 URL: https://github.com/apache/beam/pull/8083#discussion_r267423370 ## File path: sdks/java/io/amazon-web-services/src/test/java/org/apache/beam/sdk/io/aws/sqs/SqsIOTest.java ## @@ -117,7 +117,6 @@ protected void before() { new AWSStaticCredentialsProvider(new BasicAWSCredentials(accessKey, secretKey))) .withEndpointConfiguration( new AwsClientBuilder.EndpointConfiguration(endpoint, region)) - .withRegion(region) Review comment: Yes and now the amazon sdk validates this so the test failed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216297) Time Spent: 40m (was: 0.5h) > Update amazon-web-services aws-sdk dependency to version 1.11.519 > - > > Key: BEAM-6854 > URL: https://issues.apache.org/jira/browse/BEAM-6854 > Project: Beam > Issue Type: Improvement > Components: io-java-aws >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > An update to the AWS dependency to include new regions. > Additionally this unifies the AWS SDK use among modules as a follow up of > BEAM-6330 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216311 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 16:34 Start Date: 20/Mar/19 16:34 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267431454 ## File path: sdks/python/apache_beam/examples/complete/game/game_stats_it_test.py ## @@ -52,7 +52,7 @@ class GameStatsIT(unittest.TestCase): # Input events containing user, team, score, processing time, window start. - INPUT_EVENT = 'user1,teamA,10,%d,2015-11-02 09:09:28.224' + INPUT_EVENT = b'user1,teamA,10,%d,2015-11-02 09:09:28.224' Review comment: I am not sure if this should be bytes. Does self.pub_client.publish require bytes? If so, we should encode before passing data to that method, To quote https://docs.python.org/3/howto/pyporting.html , `Decode binary data to text as soon as possible, encode text as binary data as late as possible` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216311) Time Spent: 20h 10m (was: 20h) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 20h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216309 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 16:34 Start Date: 20/Mar/19 16:34 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267428713 ## File path: sdks/python/apache_beam/examples/complete/game/leader_board.py ## @@ -120,6 +120,8 @@ def __init__(self): def process(self, elem): try: + if isinstance(elem, bytes): Review comment: Can we always decode here? It is better to have a clear expectation of input arguments as much as possible on Python 3: either always encoded bytes, or always strings, but not mixing the two. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216309) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 19h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6619) Add PostCommit suite for integration tests on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-6619?focusedWorklogId=216308&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216308 ] ASF GitHub Bot logged work on BEAM-6619: Author: ASF GitHub Bot Created on: 20/Mar/19 16:34 Start Date: 20/Mar/19 16:34 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8076: [BEAM-6619] [BEAM-6593] Add bigquery integration tests to postcommit URL: https://github.com/apache/beam/pull/8076#discussion_r267428124 ## File path: sdks/python/apache_beam/examples/complete/game/game_stats.py ## @@ -111,6 +111,8 @@ def __init__(self): def process(self, elem): try: + if isinstance(elem, bytes): Review comment: Can we always decode here? It is better to have a clear expectation of input arguments as much as possible on Python 3: either always encoded bytes, or always strings, but not mixing the two. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216308) Time Spent: 19h 50m (was: 19h 40m) > Add PostCommit suite for integration tests on DataflowRunner > > > Key: BEAM-6619 > URL: https://issues.apache.org/jira/browse/BEAM-6619 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Major > Labels: triaged > Fix For: Not applicable > > Time Spent: 19h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519
[ https://issues.apache.org/jira/browse/BEAM-6854?focusedWorklogId=216304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216304 ] ASF GitHub Bot logged work on BEAM-6854: Author: ASF GitHub Bot Created on: 20/Mar/19 16:27 Start Date: 20/Mar/19 16:27 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on pull request #8083: [BEAM-6854] Update amazon-web-services aws-sdk dependency to version 1.11.519 URL: https://github.com/apache/beam/pull/8083 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216304) Time Spent: 1h (was: 50m) > Update amazon-web-services aws-sdk dependency to version 1.11.519 > - > > Key: BEAM-6854 > URL: https://issues.apache.org/jira/browse/BEAM-6854 > Project: Beam > Issue Type: Improvement > Components: io-java-aws >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Fix For: 2.12.0 > > Time Spent: 1h > Remaining Estimate: 0h > > An update to the AWS dependency to include new regions. > Additionally this unifies the AWS SDK use among modules as a follow up of > BEAM-6330 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519
[ https://issues.apache.org/jira/browse/BEAM-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Romanenko resolved BEAM-6854. Resolution: Fixed Fix Version/s: 2.12.0 > Update amazon-web-services aws-sdk dependency to version 1.11.519 > - > > Key: BEAM-6854 > URL: https://issues.apache.org/jira/browse/BEAM-6854 > Project: Beam > Issue Type: Improvement > Components: io-java-aws >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Fix For: 2.12.0 > > Time Spent: 1h > Remaining Estimate: 0h > > An update to the AWS dependency to include new regions. > Additionally this unifies the AWS SDK use among modules as a follow up of > BEAM-6330 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6854) Update amazon-web-services aws-sdk dependency to version 1.11.519
[ https://issues.apache.org/jira/browse/BEAM-6854?focusedWorklogId=216303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-216303 ] ASF GitHub Bot logged work on BEAM-6854: Author: ASF GitHub Bot Created on: 20/Mar/19 16:26 Start Date: 20/Mar/19 16:26 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #8083: [BEAM-6854] Update amazon-web-services aws-sdk dependency to version 1.11.519 URL: https://github.com/apache/beam/pull/8083#issuecomment-474913945 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 216303) Time Spent: 50m (was: 40m) > Update amazon-web-services aws-sdk dependency to version 1.11.519 > - > > Key: BEAM-6854 > URL: https://issues.apache.org/jira/browse/BEAM-6854 > Project: Beam > Issue Type: Improvement > Components: io-java-aws >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > An update to the AWS dependency to include new regions. > Additionally this unifies the AWS SDK use among modules as a follow up of > BEAM-6330 -- This message was sent by Atlassian JIRA (v7.6.3#76005)