[jira] [Commented] (BEAM-10027) Support for Kotlin-based Beam Katas
[ https://issues.apache.org/jira/browse/BEAM-10027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17125203#comment-17125203 ] Pablo Estrada commented on BEAM-10027: -- It's technically not a part of the release, so it's not that necessary to specify, but changes merged now will be in Beam 2.23.0 - so I've added it. Thanks! > Support for Kotlin-based Beam Katas > --- > > Key: BEAM-10027 > URL: https://issues.apache.org/jira/browse/BEAM-10027 > Project: Beam > Issue Type: Improvement > Components: katas >Reporter: Rion Williams >Assignee: Rion Williams >Priority: P2 > Fix For: 2.23.0 > > Original Estimate: 8h > Time Spent: 15h 40m > Remaining Estimate: 0h > > Currently, there is a series of examples available demonstrating the use of > Apache Beam with Kotlin. It would be nice for the same Beam > Katas that exist for Python, Go, and Java to also support Kotlin. > The port itself shouldn't be that involved since it can still target the JVM, > so it would likely just require the inclusion of Kotlin dependencies and a > conversion of all of the existing Java examples. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-10027) Support for Kotlin-based Beam Katas
[ https://issues.apache.org/jira/browse/BEAM-10027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-10027: - Fix Version/s: 2.23.0
[jira] [Commented] (BEAM-10068) Modify behavior of Dynamic Destinations
[ https://issues.apache.org/jira/browse/BEAM-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17121408#comment-17121408 ] Pablo Estrada commented on BEAM-10068: -- Made this into a new feature-type issue. [~mborkar] can you tell if this corresponds to a feature allowing `specifying per-destination numShards`? > Modify behavior of Dynamic Destinations > --- > > Key: BEAM-10068 > URL: https://issues.apache.org/jira/browse/BEAM-10068 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Mihir Borkar >Assignee: Reuven Lax >Priority: P2 > > The writeDynamic() method, implementing Dynamic Destinations, writes files per > destination per window per pane. > This leads to an increase in the number of files generated. > The request is as follows: > A way to make it possible for the user to modify the behavior of Dynamic > Destinations to control the number of output files being produced. > a.) We can consider adding user-configurable parameters like writers per > bundle, increasing the number of records processed per bundle > and/or > b.) Introduce a method implementing Dynamic Destinations but more dependent > on the data passing through the pipeline, instead of windows/panes. > So instead of splitting every output file into roughly the number of > destinations being written to, we let the user configure how output files > should be divided across destinations. > Links: > [1] > [https://beam.apache.org/releases/javadoc/2.19.0/index.html?org/apache/beam/sdk/io/FileIO.html] > [2] > [https://github.com/apache/beam/blob/da9e17288e8473925674a4691d9e86252e67d7d7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java] > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
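The per-destination sharding floated in this thread (option b plus `specifying per-destination numShards`) can be illustrated outside of Beam with a small plain-Python sketch. Everything here is hypothetical, not Beam API: records are grouped by destination, and each destination gets its own configured shard count instead of one file per window/pane/bundle.

```python
import hashlib
from collections import defaultdict

def assign_shards(records, destination_fn, num_shards_fn):
    """Group records into (destination, shard) buckets.

    destination_fn maps a record to its destination; num_shards_fn maps a
    destination to its configured shard count, so each destination can have
    its own numShards.  Both callables are illustrative stand-ins for the
    user-supplied configuration this issue asks for.
    """
    buckets = defaultdict(list)
    for record in records:
        dest = destination_fn(record)
        shards = num_shards_fn(dest)
        # Stable hash so shard assignment does not vary between runs.
        digest = hashlib.sha1(record.encode("utf-8")).digest()
        shard = int.from_bytes(digest[:4], "big") % shards
        buckets[(dest, shard)].append(record)
    return dict(buckets)

records = ["us:a", "us:b", "eu:c", "us:d"]
grouped = assign_shards(
    records,
    destination_fn=lambda r: r.split(":")[0],
    num_shards_fn=lambda dest: 2 if dest == "us" else 1)
# "eu" records always land in shard 0 because "eu" is configured with 1 shard.
```

The point of the sketch is that the number of output buckets per destination is fixed by configuration rather than by how the runner happened to bundle the data.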
[jira] [Assigned] (BEAM-10068) Modify behavior of Dynamic Destinations
[ https://issues.apache.org/jira/browse/BEAM-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada reassigned BEAM-10068: Assignee: Reuven Lax (was: Pablo Estrada)
[jira] [Updated] (BEAM-10068) Modify behavior of Dynamic Destinations
[ https://issues.apache.org/jira/browse/BEAM-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-10068: - Issue Type: New Feature (was: Improvement)
[jira] [Assigned] (BEAM-10068) Modify behavior of Dynamic Destinations
[ https://issues.apache.org/jira/browse/BEAM-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada reassigned BEAM-10068: Assignee: Pablo Estrada
[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner
[ https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118985#comment-17118985 ] Pablo Estrada commented on BEAM-1438: - Ah it looks like it's just a matter of removing the check. [https://github.com/apache/beam/pull/11850] is out to fix this. > The default behavior for the Write transform doesn't work well with the > Dataflow streaming runner > - > > Key: BEAM-1438 > URL: https://issues.apache.org/jira/browse/BEAM-1438 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: P2 > Fix For: 2.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > If a Write specifies 0 output shards, that implies the runner should pick an > appropriate sharding. The default behavior is to write one shard per input > bundle. This works well with the Dataflow batch runner, but not with the > streaming runner which produces large numbers of small bundles. -- This message was sent by Atlassian Jira (v8.3.4#803005)
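The behavior described in this issue (numShards of 0 means "let the runner decide", and defaulting to one shard per input bundle breaks down on streaming runners that produce many tiny bundles) can be sketched in plain Python. `plan_shards` and its fallback policy are hypothetical, not the actual Dataflow implementation:

```python
def plan_shards(num_records, requested_shards, runner_default_shards):
    """Pick an output shard count the way a runner might.

    requested_shards == 0 means "let the runner decide".  Here the
    hypothetical runner falls back to a fixed default instead of one shard
    per bundle, which on a streaming runner would yield many small files.
    """
    if requested_shards > 0:
        return requested_shards
    # Never plan more shards than there are records to write.
    return max(1, min(runner_default_shards, num_records))
```

For example, an explicit request of 5 shards is honored, while a request of 0 falls back to the runner default capped by the record count.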
[jira] [Updated] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner
[ https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-1438: Status: Open (was: Triage Needed)
[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner
[ https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118980#comment-17118980 ] Pablo Estrada commented on BEAM-1438: - [~reuvenlax] are you able to take a look at this?
[jira] [Commented] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner
[ https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118979#comment-17118979 ] Pablo Estrada commented on BEAM-1438: - Reopening this issue, as this will not work on Dataflow, as appropriately pointed out by others.
[jira] [Reopened] (BEAM-1438) The default behavior for the Write transform doesn't work well with the Dataflow streaming runner
[ https://issues.apache.org/jira/browse/BEAM-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada reopened BEAM-1438
[jira] [Assigned] (BEAM-10098) Javadoc export deactivated for RabbitMqIO and KuduIO
[ https://issues.apache.org/jira/browse/BEAM-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada reassigned BEAM-10098: Assignee: Pablo Estrada (was: Alexey Romanenko) > Javadoc export deactivated for RabbitMqIO and KuduIO > > > Key: BEAM-10098 > URL: https://issues.apache.org/jira/browse/BEAM-10098 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Ashwin Ramaswami >Assignee: Pablo Estrada >Priority: P2 > > Javadoc export is deactivated for RabbitMqIO and KuduIO. We should enable > this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-10098) Javadoc export deactivated for RabbitMqIO and KuduIO
[ https://issues.apache.org/jira/browse/BEAM-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-10098: - Status: Open (was: Triage Needed)
[jira] [Updated] (BEAM-10099) Add FhirIO and HL7v2IO to I/O matrix
[ https://issues.apache.org/jira/browse/BEAM-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-10099: - Status: Open (was: Triage Needed) > Add FhirIO and HL7v2IO to I/O matrix > > > Key: BEAM-10099 > URL: https://issues.apache.org/jira/browse/BEAM-10099 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Ashwin Ramaswami >Assignee: Jacob Ferriero >Priority: P2 > > We should do this after the next release is out. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-3759) Add support for PaneInfo descriptor in Python SDK
[ https://issues.apache.org/jira/browse/BEAM-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116969#comment-17116969 ] Pablo Estrada commented on BEAM-3759: - Unfortunately this is not fixed. PR 4763 only adds encoding support, but as Charles pointed out, it does not add support for adding the PaneInfo after GBK firings. > Add support for PaneInfo descriptor in Python SDK > - > > Key: BEAM-3759 > URL: https://issues.apache.org/jira/browse/BEAM-3759 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.3.0 >Reporter: Charles Chen >Priority: P2 > Fix For: 2.19.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > The PaneInfo descriptor allows a user to determine which particular > triggering emitted a value. This allows the user to differentiate between > speculative (early), on-time (at end of window) and late value emissions > coming out of a GroupByKey. We should add support for this feature in the > Python SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
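The pane-timing semantics described in this issue (speculative early firings, the on-time firing at the end of the window, and late firings after it) can be modeled in a few lines of plain Python. This is a simplified conceptual model only; the real Beam PaneInfo also carries fields such as is_first, is_last, and pane indices:

```python
from enum import Enum

class PaneTiming(Enum):
    EARLY = 0    # speculative firing before the end of the window
    ON_TIME = 1  # the single firing at the end of the window
    LATE = 2     # firings for data arriving after the on-time pane

def classify_firing(firing_time, window_end, already_fired_on_time):
    """Classify a trigger firing relative to the window end.

    A simplified model: firings before window_end are EARLY, the first
    firing at or after window_end is ON_TIME, and everything later is LATE.
    """
    if firing_time < window_end:
        return PaneTiming.EARLY
    if not already_fired_on_time:
        return PaneTiming.ON_TIME
    return PaneTiming.LATE
```

This is the distinction the Python SDK would need to surface after GroupByKey firings for the feature to be considered complete.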
[jira] [Reopened] (BEAM-3759) Add support for PaneInfo descriptor in Python SDK
[ https://issues.apache.org/jira/browse/BEAM-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada reopened BEAM-3759: - Assignee: (was: Tanay Tummalapalli)
[jira] [Updated] (BEAM-3767) A Complex Event Processing (CEP) library/extension for Apache Beam
[ https://issues.apache.org/jira/browse/BEAM-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-3767: Labels: gsoc gsoc2021 (was: ) > A Complex Event Processing (CEP) library/extension for Apache Beam > -- > > Key: BEAM-3767 > URL: https://issues.apache.org/jira/browse/BEAM-3767 > Project: Beam > Issue Type: New Feature > Components: sdk-ideas >Reporter: Ismaël Mejía >Priority: P3 > Labels: gsoc, gsoc2021 > > Apache Beam [1] is a unified and portable programming model for data > processing jobs. The Beam model [2, 3, 4] has rich mechanisms to process > endless streams of events. > Complex Event Processing [5] lets you match patterns of events in streams to > detect important patterns in data and react to them. > Some examples of uses of CEP are fraud detection, for example by detecting > unusual behavior (patterns of activity), e.g. network intrusion, suspicious > banking transactions, etc. Also, trend detection is another interesting use > case in the context of sensors and IoT. > The goal of this issue is to implement an efficient pattern matching library > inspired by [6] and existing libraries like Apache Flink CEP [7] using the > Apache Beam Java SDK and the Beam style guides [8]. Because of the time > constraints of GSoC we will probably try to cover first simple patterns of > the ‘a followed by b followed by c’ kind, and then if there is still time try > to cover more advanced ones e.g. optional, atLeastOne, oneOrMore, etc.
> [1] [https://beam.apache.org/] > [2] [https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101] > [3] [https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102] > [4] > [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf] > [5] [https://en.wikipedia.org/wiki/Complex_event_processing] > [6] [https://people.cs.umass.edu/~yanlei/publications/sase-sigmod08.pdf] > [7] > [https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/cep.html] > [8] [https://beam.apache.org/contribute/ptransform-style-guide/] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9148) test flakiness: BigQueryQueryToTableIT.test_big_query_standard_sql
[ https://issues.apache.org/jira/browse/BEAM-9148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9148. - Fix Version/s: Not applicable Resolution: Cannot Reproduce Closing as obsolete. > test flakiness: BigQueryQueryToTableIT.test_big_query_standard_sql > -- > > Key: BEAM-9148 > URL: https://issues.apache.org/jira/browse/BEAM-9148 > Project: Beam > Issue Type: Bug > Components: io-py-gcp, sdk-py-core, test-failures >Reporter: Udi Meiri >Assignee: Pablo Estrada >Priority: P2 > Fix For: Not applicable > > > There might be other flaky test cases from the same class, but I'm focusing > on test_big_query_standard_sql here. > {code} > 19:39:12 > == > 19:39:12 FAIL: test_big_query_standard_sql > (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) > 19:39:12 > -- > 19:39:12 Traceback (most recent call last): > 19:39:12File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py", > line 172, in test_big_query_standard_sql > 19:39:12 big_query_query_to_table_pipeline.run_bq_pipeline(options) > 19:39:12File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py", > line 84, in run_bq_pipeline > 19:39:12 result = p.run() > 19:39:12File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 112, in run > 19:39:12 else test_runner_api)) > 19:39:12File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/pipeline.py", > line 461, in run > 19:39:12 self._options).run(False) > 19:39:12File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/pipeline.py", > line 474, in run > 19:39:12 return self.runner.run_pipeline(self, self._options) > 19:39:12File > 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > line 53, in run_pipeline > 19:39:12 hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 19:39:12 AssertionError: > 19:39:12 Expected: (Test pipeline expected terminated in state: DONE and > Expected checksum is 158a8ea1c254fcf40d4ed3e7c0242c3ea0a29e72) > 19:39:12 but: Expected checksum is > 158a8ea1c254fcf40d4ed3e7c0242c3ea0a29e72 Actual checksum is > da39a3ee5e6b4b0d3255bfef95601890afd80709 > 19:39:12 > 19:39:12 >> begin captured logging << > > 19:39:12 root: DEBUG: Unhandled type_constraint: Union[] > 19:39:12 root: DEBUG: Unhandled type_constraint: Union[] > 19:39:12 apache_beam.runners.direct.direct_runner: INFO: Running pipeline > with DirectRunner. > 19:39:12 apache_beam.io.gcp.bigquery_tools: DEBUG: Query SELECT * FROM > (SELECT "apple" as fruit) UNION ALL (SELECT "orange" as fruit) does not > reference any tables. > 19:39:12 apache_beam.io.gcp.bigquery_tools: WARNING: Dataset > apache-beam-testing:temp_dataset_90f5797bdb5f4137af750399f91a8e66 does not > exist so we will create it as temporary with location=None > 19:39:12 apache_beam.io.gcp.bigquery: DEBUG: Creating or getting table > 19:39:12 datasetId: 'python_query_to_table_15792323245106' > 19:39:12 projectId: 'apache-beam-testing' > 19:39:12 tableId: 'output_table'> with schema {'fields': [{'name': 'fruit', > 'type': 'STRING', 'mode': 'NULLABLE'}]}. > 19:39:12 apache_beam.io.gcp.bigquery_tools: DEBUG: Created the table with id > output_table > 19:39:12 apache_beam.io.gcp.bigquery_tools: INFO: Created table > apache-beam-testing.python_query_to_table_15792323245106.output_table with > schema 19:39:12 fields: [ 19:39:12 fields: [] > 19:39:12 mode: 'NULLABLE' > 19:39:12 name: 'fruit' > 19:39:12 type: 'STRING'>]>. 
Result: 19:39:12 creationTime: 1579232328576 > 19:39:12 etag: 'WYysl6UIvc8IWMmTiiKhbg==' > 19:39:12 id: > 'apache-beam-testing:python_query_to_table_15792323245106.output_table' > 19:39:12 kind: 'bigquery#table' > 19:39:12 lastModifiedTime: 1579232328629 > 19:39:12 location: 'US' > 19:39:12 numBytes: 0 > 19:39:12 numLongTermBytes: 0 > 19:39:12 numRows: 0 > 19:39:12 schema: 19:39:12 fields: [ 19:39:12 fields: [] > 19:39:12 mode: 'NULLABLE' > 19:39:12 name: 'fruit' > 19:39:12 type: 'STRING'>]> > 19:39:12 selfLink: >
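One detail worth noting in the failure above: the reported actual checksum `da39a3ee5e6b4b0d3255bfef95601890afd80709` is the SHA-1 digest of empty input. Assuming the on-success matcher checksums row content with SHA-1, that value is consistent with the verifier hashing zero rows, i.e. the output table was still empty when the matcher ran:

```python
import hashlib

# SHA-1 over zero bytes: the value a row-content verifier would report
# when it reads no rows at all.
empty_digest = hashlib.sha1(b"").hexdigest()
print(empty_digest)  # da39a3ee5e6b4b0d3255bfef95601890afd80709
```

That points at a read-after-write timing problem (the verification query ran before the written rows were visible) rather than wrong data being written.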
[jira] [Resolved] (BEAM-8197) test_big_query_write_without_schema might be flaky?
[ https://issues.apache.org/jira/browse/BEAM-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-8197. - Fix Version/s: Not applicable Resolution: Cannot Reproduce Closing as obsolete. > test_big_query_write_without_schema might be flaky? > --- > > Key: BEAM-8197 > URL: https://issues.apache.org/jira/browse/BEAM-8197 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core, testing >Reporter: Ahmet Altay >Assignee: Pablo Estrada >Priority: P1 > Fix For: Not applicable > > > This failed in py 3.6 post commit test: > https://builds.apache.org/job/beam_PostCommit_Python36/434/console > 18:14:58 > == > 18:14:58 FAIL: test_big_query_write_without_schema > (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests) > 18:14:58 > -- > 18:14:58 Traceback (most recent call last): > 18:14:58 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py", > line 269, in test_big_query_write_without_schema > 18:14:58 write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)) > 18:14:58 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/pipeline.py", > line 427, in __exit__ > 18:14:58 self.run().wait_until_finish() > 18:14:58 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/pipeline.py", > line 407, in run > 18:14:58 self._options).run(False) > 18:14:58 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/pipeline.py", > line 420, in run > 18:14:58 return self.runner.run_pipeline(self, self._options) > 18:14:58 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > line 51, in run_pipeline > 18:14:58 hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 18:14:58 AssertionError: > 18:14:58 Expected: 
(Expected data is [(b'xyw', datetime.date(2011, 1, 1), > datetime.time(23, 59, 59, 99)), (b'abc', datetime.date(2000, 1, 1), > datetime.time(0, 0)), (b'\xe4\xbd\xa0\xe5\xa5\xbd', datetime.date(3000, 12, > 31), datetime.time(23, 59, 59)), (b'\xab\xac\xad', datetime.date(2000, 1, 1), > datetime.time(0, 0))]) > 18:14:58 but: Expected data is [(b'xyw', datetime.date(2011, 1, 1), > datetime.time(23, 59, 59, 99)), (b'abc', datetime.date(2000, 1, 1), > datetime.time(0, 0)), (b'\xe4\xbd\xa0\xe5\xa5\xbd', -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7514) Support streaming on the Python fn_api_runner
[ https://issues.apache.org/jira/browse/BEAM-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108614#comment-17108614 ] Pablo Estrada commented on BEAM-7514: - Hi Clint! I've been working, albeit slowly, on this for a while. The BundleBasedDirectRunner, which is automatically selected when using TestStream+DirectRunner, should support this fine. What issue are you running into? > Support streaming on the Python fn_api_runner > - > > Key: BEAM-7514 > URL: https://issues.apache.org/jira/browse/BEAM-7514 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9983) bigquery_read_it_test.ReadNewTypesTests.test_iobase_source failing
[ https://issues.apache.org/jira/browse/BEAM-9983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9983. - Fix Version/s: Not applicable Resolution: Fixed This PR was reverted and rolled forward later with fixes. > bigquery_read_it_test.ReadNewTypesTests.test_iobase_source failing > -- > > Key: BEAM-9983 > URL: https://issues.apache.org/jira/browse/BEAM-9983 > Project: Beam > Issue Type: Bug > Components: io-py-gcp, test-failures >Reporter: Kyle Weaver >Assignee: Pablo Estrada >Priority: Major > Fix For: Not applicable > > Time Spent: 10m > Remaining Estimate: 0h > > This failure seems to afflict all Python postcommits. > apache_beam.io.gcp.bigquery_read_it_test.ReadNewTypesTests.test_iobase_source > (from nosetests) > Failing for the past 1 build (Since Failed#2429 ) > Took 9 min 57 sec. > Error Message > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", > line 246, in wrapper > sleep_interval = next(retry_intervals) > StopIteration > During handling of the above exception, another exception occurred: > Traceback (most recent call last): > File > "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line > 647, in do_work > work_executor.execute() > File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", > line 226, in execute > self._split_task) > File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", > line 234, in _perform_source_split_considering_api_limits > desired_bundle_size) > File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", > line 271, in _perform_source_split > for split in source.split(desired_bundle_size): > File > "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery.py", line > 698, in split > self.table_reference = self._execute_query(bq) > File > 
"/usr/local/lib/python3.6/site-packages/apache_beam/options/value_provider.py", > line 135, in _f > return fnc(self, *args, **kwargs) > File > "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery.py", line > 744, in _execute_query > job_labels=self.bigquery_job_labels) > File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", > line 249, in wrapper > raise_with_traceback(exn, exn_traceback) > File "/usr/local/lib/python3.6/site-packages/future/utils/__init__.py", > line 446, in raise_with_traceback > raise exc.with_traceback(traceback) > File "/usr/local/lib/python3.6/site-packages/apache_beam/utils/retry.py", > line 236, in wrapper > return fun(*args, **kwargs) > File > "/usr/local/lib/python3.6/site-packages/apache_beam/io/gcp/bigquery_tools.py", > line 415, in _start_query_job > labels=job_labels or {}, > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 791, in __init__ > setattr(self, name, value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 973, in __setattr__ > object.__setattr__(self, name, value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 1651, in __set__ > value = t(**value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 791, in __init__ > setattr(self, name, value) > File > "/usr/local/lib/python3.6/site-packages/apitools/base/protorpclite/messages.py", > line 976, in __setattr__ > "to message %s" % (name, type(self).__name__)) > AttributeError: May not assign arbitrary value owner to message LabelsValue -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9967) Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms
Pablo Estrada created BEAM-9967: --- Summary: Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms Key: BEAM-9967 URL: https://issues.apache.org/jira/browse/BEAM-9967 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-6514) Dataflow Batch Job Failure is leaving Datasets/Tables behind in BigQuery
[ https://issues.apache.org/jira/browse/BEAM-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104994#comment-17104994 ] Pablo Estrada commented on BEAM-6514: - This seems to have been noticed by others here: [https://stackoverflow.com/questions/61658242/dataprep-is-leaving-datasets-tables-behind-in-bigquery] > Dataflow Batch Job Failure is leaving Datasets/Tables behind in BigQuery > > > Key: BEAM-6514 > URL: https://issues.apache.org/jira/browse/BEAM-6514 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Rumeshkrishnan Mohan >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > > Dataflow is leaving Datasets/Tables behind in BigQuery when the pipeline is > cancelled or when it fails. I cancelled a job or it failed at run time, and > it left behind a dataset and table in BigQuery. > # The `cleanupTempResource` method cleans up tables and datasets after a > batch job succeeds. > # If the job fails midway or is cancelled explicitly, the temporary dataset > and tables remain. I do see a table expiry period of 1 day, per the code > in the `getTableToExtract` function in BigQueryQuerySource.java. > # I can understand keeping temp tables and datasets on failure for > debugging. > # Can we have optional pipeline or job parameters that clean up temporary > datasets and tables on cancellation or failure? -- This message was sent by Atlassian Jira (v8.3.4#803005)
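Until Beam cleans up on cancellation or failure, leftover resources can be garbage-collected out of band. A hedged sketch of the selection logic only — the `temp_dataset_` prefix and one-day cutoff are assumptions for illustration, and a real cleanup job would list and delete datasets through the google-cloud-bigquery client:

```python
from datetime import datetime, timedelta, timezone

def stale_temp_datasets(datasets, prefix="temp_dataset_", max_age_days=1, now=None):
    """Pick dataset names that look like leftover temp datasets.

    datasets: iterable of (name, created) pairs, created as aware datetimes.
    The prefix and age cutoff are illustrative assumptions, not values
    taken from Beam's source.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        name
        for name, created in datasets
        if name.startswith(prefix) and created < cutoff
    ]
```

The returned names would then be deleted (with `delete_contents=True`) after a manual sanity check, since the prefix match alone cannot prove a dataset is safe to drop.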
[jira] [Resolved] (BEAM-9957) BigQueryIO does not clean up temporary tables / dataset when a pipeline fails
[ https://issues.apache.org/jira/browse/BEAM-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9957. - Fix Version/s: Not applicable Resolution: Duplicate BEAM-6514 > BigQueryIO does not clean up temporary tables / dataset when a pipeline fails > - > > Key: BEAM-9957 > URL: https://issues.apache.org/jira/browse/BEAM-9957 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Pablo Estrada >Priority: Major > Fix For: Not applicable > > > Some examples of this happening: > [https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/609] - old > SDK, but with a bit more info > [https://stackoverflow.com/questions/61658242/dataprep-is-leaving-datasets-tables-behind-in-bigquery] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9957) BigQueryIO does not clean up temporary tables / dataset when a pipeline fails
[ https://issues.apache.org/jira/browse/BEAM-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9957: Status: Open (was: Triage Needed) > BigQueryIO does not clean up temporary tables / dataset when a pipeline fails > - > > Key: BEAM-9957 > URL: https://issues.apache.org/jira/browse/BEAM-9957 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Pablo Estrada >Priority: Major > > Some examples of this happening: > [https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/609] - old > SDK, but with a bit more info > [https://stackoverflow.com/questions/61658242/dataprep-is-leaving-datasets-tables-behind-in-bigquery] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9957) BigQueryIO does not clean up temporary tables / dataset when a pipeline fails
Pablo Estrada created BEAM-9957: --- Summary: BigQueryIO does not clean up temporary tables / dataset when a pipeline fails Key: BEAM-9957 URL: https://issues.apache.org/jira/browse/BEAM-9957 Project: Beam Issue Type: Bug Components: io-java-gcp Reporter: Pablo Estrada Some examples of this happening: [https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/609] - old SDK, but with a bit more info [https://stackoverflow.com/questions/61658242/dataprep-is-leaving-datasets-tables-behind-in-bigquery] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8219) crossLanguagePortableWordCount seems to be flaky for beam_PostCommit_Python2
[ https://issues.apache.org/jira/browse/BEAM-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104774#comment-17104774 ] Pablo Estrada commented on BEAM-8219: - Thanks Kyle! I'll close this then - it is either solved or obsolete. > crossLanguagePortableWordCount seems to be flaky for beam_PostCommit_Python2 > - > > Key: BEAM-8219 > URL: https://issues.apache.org/jira/browse/BEAM-8219 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Chamikara Madhusanka Jayalath >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > > For example, > [https://builds.apache.org/job/beam_PostCommit_Python2/451/console] > [https://builds.apache.org/job/beam_PostCommit_Python2/454/console] > *10:37:22* * What went wrong:*10:37:22* Execution failed for task > ':sdks:python:test-suites:portable:py2:crossLanguagePortableWordCount'.*10:37:22* > > Process 'command 'sh'' finished with non-zero exit value 1*10:37:22* > > cc: [~heejong] [~mxm] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-8219) crossLanguagePortableWordCount seems to be flaky for beam_PostCommit_Python2
[ https://issues.apache.org/jira/browse/BEAM-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-8219. - Fix Version/s: Not applicable Resolution: Fixed > crossLanguagePortableWordCount seems to be flaky for beam_PostCommit_Python2 > - > > Key: BEAM-8219 > URL: https://issues.apache.org/jira/browse/BEAM-8219 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Chamikara Madhusanka Jayalath >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > Fix For: Not applicable > > > For example, > [https://builds.apache.org/job/beam_PostCommit_Python2/451/console] > [https://builds.apache.org/job/beam_PostCommit_Python2/454/console] > *10:37:22* * What went wrong:*10:37:22* Execution failed for task > ':sdks:python:test-suites:portable:py2:crossLanguagePortableWordCount'.*10:37:22* > > Process 'command 'sh'' finished with non-zero exit value 1*10:37:22* > > cc: [~heejong] [~mxm] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8219) crossLanguagePortableWordCount seems to be flaky for beam_PostCommit_Python2
[ https://issues.apache.org/jira/browse/BEAM-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104753#comment-17104753 ] Pablo Estrada commented on BEAM-8219: - I see this is a bit of an old issue, but I see it making Py2 postcommit permared: [https://scans.gradle.com/s/dm24jlr4plnbk] > crossLanguagePortableWordCount seems to be flaky for beam_PostCommit_Python2 > - > > Key: BEAM-8219 > URL: https://issues.apache.org/jira/browse/BEAM-8219 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Chamikara Madhusanka Jayalath >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > > For example, > [https://builds.apache.org/job/beam_PostCommit_Python2/451/console] > [https://builds.apache.org/job/beam_PostCommit_Python2/454/console] > *10:37:22* * What went wrong:*10:37:22* Execution failed for task > ':sdks:python:test-suites:portable:py2:crossLanguagePortableWordCount'.*10:37:22* > > Process 'command 'sh'' finished with non-zero exit value 1*10:37:22* > > cc: [~heejong] [~mxm] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9886) ReadFromBigQuery should auto-infer which project to bill for BQ exports
Pablo Estrada created BEAM-9886: --- Summary: ReadFromBigQuery should auto-infer which project to bill for BQ exports Key: BEAM-9886 URL: https://issues.apache.org/jira/browse/BEAM-9886 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
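The auto-inference requested in BEAM-9886 boils down to a fallback chain: bill the explicitly supplied project if one was given, otherwise fall back to the pipeline's own --project. A minimal sketch with hypothetical names (not Beam's actual internals):

```python
def billing_project(explicit_project=None, pipeline_project=None):
    """Pick the project billed for BigQuery export jobs.

    An explicitly configured project wins; otherwise the pipeline's
    --project is reused. Function and parameter names are illustrative,
    not taken from the Beam codebase.
    """
    project = explicit_project or pipeline_project
    if project is None:
        raise ValueError("no project available to bill for BigQuery exports")
    return project
```

The point of the issue is that most users never set a separate billing project, so the second branch should kick in automatically instead of failing.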
[jira] [Resolved] (BEAM-7885) DoFn.setup() don't run for streaming jobs on DirectRunner.
[ https://issues.apache.org/jira/browse/BEAM-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-7885. - Fix Version/s: 2.22.0 Resolution: Fixed > DoFn.setup() don't run for streaming jobs on DirectRunner. > --- > > Key: BEAM-7885 > URL: https://issues.apache.org/jira/browse/BEAM-7885 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.14.0 > Environment: Python >Reporter: niklas Hansson >Assignee: Pablo Estrada >Priority: Minor > Fix For: 2.22.0 > > > From version 2.14.0 the Python SDK has introduced setup and teardown for DoFn, > described as "Called to prepare an instance for processing bundles of > elements. This is a good place to initialize transient in-memory resources, > such as network connections." > However, when trying to use it for an unbounded job (Pub/Sub source) it seems > that DoFn.setup() is never called and the resources are never initialized. > [UPDATE] It is working for the Dataflow runner but not for the DirectRunner. For the > Dataflow runner DoFn.setup() seems to be called multiple times but then > never again when the pipeline is processing elements. [UPDATE] For the > direct runner I get: > > AttributeError: 'NoneType' object has no attribute 'predict' [while running > 'transform the data'] > > My source code: [https://github.com/NikeNano/DataflowSklearnStreaming] > > I am happy to contribute example code for how to use setup as soon as I > get it running :) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-7885) DoFn.setup() don't run for streaming jobs on DirectRunner.
[ https://issues.apache.org/jira/browse/BEAM-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada reassigned BEAM-7885: --- Assignee: Pablo Estrada > DoFn.setup() don't run for streaming jobs on DirectRunner. > --- > > Key: BEAM-7885 > URL: https://issues.apache.org/jira/browse/BEAM-7885 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.14.0 > Environment: Python >Reporter: niklas Hansson >Assignee: Pablo Estrada >Priority: Minor > > From version 2.14.0 the Python SDK has introduced setup and teardown for DoFn, > described as "Called to prepare an instance for processing bundles of > elements. This is a good place to initialize transient in-memory resources, > such as network connections." > However, when trying to use it for an unbounded job (Pub/Sub source) it seems > that DoFn.setup() is never called and the resources are never initialized. > [UPDATE] It is working for the Dataflow runner but not for the DirectRunner. For the > Dataflow runner DoFn.setup() seems to be called multiple times but then > never again when the pipeline is processing elements. [UPDATE] For the > direct runner I get: > > AttributeError: 'NoneType' object has no attribute 'predict' [while running > 'transform the data'] > > My source code: [https://github.com/NikeNano/DataflowSklearnStreaming] > > I am happy to contribute example code for how to use setup as soon as I > get it running :) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7885) DoFn.setup() don't run for streaming jobs on DirectRunner.
[ https://issues.apache.org/jira/browse/BEAM-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095009#comment-17095009 ] Pablo Estrada commented on BEAM-7885: - This is fixed in PR 11547 > DoFn.setup() don't run for streaming jobs on DirectRunner. > --- > > Key: BEAM-7885 > URL: https://issues.apache.org/jira/browse/BEAM-7885 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.14.0 > Environment: Python >Reporter: niklas Hansson >Priority: Minor > > From version 2.14.0 the Python SDK has introduced setup and teardown for DoFn, > described as "Called to prepare an instance for processing bundles of > elements. This is a good place to initialize transient in-memory resources, > such as network connections." > However, when trying to use it for an unbounded job (Pub/Sub source) it seems > that DoFn.setup() is never called and the resources are never initialized. > [UPDATE] It is working for the Dataflow runner but not for the DirectRunner. For the > Dataflow runner DoFn.setup() seems to be called multiple times but then > never again when the pipeline is processing elements. [UPDATE] For the > direct runner I get: > > AttributeError: 'NoneType' object has no attribute 'predict' [while running > 'transform the data'] > > My source code: [https://github.com/NikeNano/DataflowSklearnStreaming] > > I am happy to contribute example code for how to use setup as soon as I > get it running :) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
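For readers hitting this on an affected SDK, the usual workaround is to make process() defensive about a runner that never invoked setup(). A lifecycle sketch, deliberately free of Beam imports so it stands alone — the class name and the stand-in "model" are illustrative, not real Beam code:

```python
class ModelDoFn:
    """Sketch of the DoFn lifecycle contract for lazy resource init.

    The contract: setup() runs at most once per DoFn instance before any
    bundle is processed, so expensive resources belong there rather than
    in __init__ (which runs at pipeline-construction time and must stay
    picklable). The guard in process() is the workaround for a runner
    that skips setup(), as the DirectRunner did in this bug.
    """

    def __init__(self):
        self.model = None  # keep __init__ cheap and serializable

    def setup(self):
        # Stand-in for an expensive load (e.g. deserializing an sklearn model).
        self.model = lambda x: x * 2

    def process(self, element):
        if self.model is None:  # defensive: runner never called setup()
            self.setup()
        yield self.model(element)
```

Without the guard, a runner that skips setup() leaves `self.model` as None, which is exactly the `'NoneType' object has no attribute 'predict'` failure reported above.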
[jira] [Commented] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094957#comment-17094957 ] Pablo Estrada commented on BEAM-9745: - I am trying to figure out whether this is a regression or not. I'll post an update by tomorrow morning. So far the tests aren't passing on 2.20.0, but they throw a different error, so I guess it is hard to just dismiss as a regression: {code:java} java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -177: java.lang.UnsupportedOperationException: BigQuery source must be split before being read at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase.createReader(BigQuerySourceBase.java:173) at org.apache.beam.fn.harness.BoundedSourceRunner.runReadLoop(BoundedSourceRunner.java:159) at org.apache.beam.fn.harness.BoundedSourceRunner.start(BoundedSourceRunner.java:146) at org.apache.beam.fn.harness.data.PTransformFunctionRegistry.lambda$register$0(PTransformFunctionRegistry.java:108) at org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:282) at org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:160) at org.apache.beam.fn.harness.control.BeamFnControlClient.lambda$processInstructionRequests$0(BeamFnControlClient.java:144) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57) at org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:332) at 
org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85) at org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125) at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:412) at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:381) at org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:306) at org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:230) at org.apache.beam.runners.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:138) {code} > [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to > deserialize Custom DoFns and Custom Coders. > - > > Key: BEAM-9745 > URL: https://issues.apache.org/jira/browse/BEAM-9745 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, java-fn-execution, sdk-java-harness, > test-failures >Reporter: Daniel Oliveira >Assignee: Pablo Estrada >Priority: Blocker > Labels: currently-failing > Fix For: 2.21.0 > > > _Use this form to file an issue for test failure:_ > * [Jenkins > Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/] > * [Gradle Build > Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project] > Initial investigation: > The bug appears to be popping up on BigQuery tests mostly, but also a > BigTable and a Datastore test. > Here's an example stacktrace of the two errors, showing _only_ the error > messages themselves. Source: > [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe] > {noformat} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -191: > java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With > Execution Info > ... 
> Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > ... > Caused by: java.lang.RuntimeException: Error received from SDK harness for > instruction -191: java.lang.IllegalArgumentException: unable to deserialize > Custom DoFn With Execution Info > ... > Caused by: java.lang.ClassNotFoundException: > org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3 > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -206: >
[jira] [Resolved] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9832. - Fix Version/s: 2.22.0 Resolution: Fixed > KeyError: 'No such coder: ' in fn_runner_test > - > > Key: BEAM-9832 > URL: https://issues.apache.org/jira/browse/BEAM-9832 > Project: Beam > Issue Type: Test > Components: sdk-py-core, test-failures >Reporter: Ning Kang >Assignee: Pablo Estrada >Priority: Critical > Fix For: 2.22.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Failed test results can be found > [here|[https://builds.apache.org/job/beam_PreCommit_Python_Commit/12525/]] > > A stack trace: > {code:java} > self = > testMethod=test_read> > def test_read(self): > # Can't use NamedTemporaryFile as a context > # due to https://bugs.python.org/issue14243 > temp_file = tempfile.NamedTemporaryFile(delete=False) > try: > temp_file.write(b'a\nb\nc') > temp_file.close() > with self.create_pipeline() as p: > assert_that( > > p | beam.io.ReadFromText(temp_file.name), equal_to(['a', 'b', > > 'c'])) > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:625: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:529: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:502: in run > self._options).run(False) > apache_beam/pipeline.py:515: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:173: in > run_pipeline > pipeline.to_runner_api(default_environment=self._default_environment)) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:183: in > run_via_runner_api > return self.run_stages(stage_context, stages) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:331: in run_stages > bundle_context_manager, > apache_beam/runners/portability/fn_api_runner/fn_runner.py:508: in _run_stage > bundle_manager) > 
apache_beam/runners/portability/fn_api_runner/fn_runner.py:546: in _run_bundle > data_input, data_output, input_timers, expected_timer_output) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:927: in > process_bundle > timer_inputs)): > /usr/lib/python3.5/concurrent/futures/_base.py:556: in result_iterator > yield future.result() > /usr/lib/python3.5/concurrent/futures/_base.py:405: in result > return self.__get_result() > /usr/lib/python3.5/concurrent/futures/_base.py:357: in __get_result > raise self._exception > apache_beam/utils/thread_pool_executor.py:44: in run > self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs)) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:923: in execute > dry_run) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:826: in > process_bundle > result_future = self._worker_handler.control_conn.push(process_bundle_req) > apache_beam/runners/portability/fn_api_runner/worker_handlers.py:353: in push > response = self.worker.do_instruction(request) > apache_beam/runners/worker/sdk_worker.py:471: in do_instruction > getattr(request, request_type), request.instruction_id) > apache_beam/runners/worker/sdk_worker.py:500: in process_bundle > instruction_id, request.process_bundle_descriptor_id) > apache_beam/runners/worker/sdk_worker.py:374: in get > self.data_channel_factory) > apache_beam/runners/worker/bundle_processor.py:782: in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > apache_beam/runners/worker/bundle_processor.py:837: in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True) > apache_beam/runners/worker/bundle_processor.py:836: in > (transform_id, get_operation(transform_id)) for transform_id in sorted( > apache_beam/runners/worker/bundle_processor.py:726: in wrapper > result = cache[args] = func(*args) > apache_beam/runners/worker/bundle_processor.py:820: in get_operation > pcoll_id in 
descriptor.transforms[transform_id].outputs.items() > apache_beam/runners/worker/bundle_processor.py:820: in > pcoll_id in descriptor.transforms[transform_id].outputs.items() > apache_beam/runners/worker/bundle_processor.py:818: in > tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]] > apache_beam/runners/worker/bundle_processor.py:726: in wrapper > result = cache[args] = func(*args) > apache_beam/runners/worker/bundle_processor.py:823: in get_operation > transform_id, transform_consumers) > apache_beam/runners/worker/bundle_processor.py:1108: in create_operation > return creator(self, transform_id, transform_proto, payload, consumers) >
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094700#comment-17094700 ] Pablo Estrada commented on BEAM-9832: - Okay I'll mark this as resolved then. Thanks! > KeyError: 'No such coder: ' in fn_runner_test > - > > Key: BEAM-9832 > URL: https://issues.apache.org/jira/browse/BEAM-9832 > Project: Beam > Issue Type: Test > Components: sdk-py-core, test-failures >Reporter: Ning Kang >Assignee: Pablo Estrada >Priority: Critical > Time Spent: 50m > Remaining Estimate: 0h > > Failed test results can be found > [here|[https://builds.apache.org/job/beam_PreCommit_Python_Commit/12525/]] > > A stack trace: > {code:java} > self = > testMethod=test_read> > def test_read(self): > # Can't use NamedTemporaryFile as a context > # due to https://bugs.python.org/issue14243 > temp_file = tempfile.NamedTemporaryFile(delete=False) > try: > temp_file.write(b'a\nb\nc') > temp_file.close() > with self.create_pipeline() as p: > assert_that( > > p | beam.io.ReadFromText(temp_file.name), equal_to(['a', 'b', > > 'c'])) > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:625: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:529: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:502: in run > self._options).run(False) > apache_beam/pipeline.py:515: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:173: in > run_pipeline > pipeline.to_runner_api(default_environment=self._default_environment)) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:183: in > run_via_runner_api > return self.run_stages(stage_context, stages) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:331: in run_stages > bundle_context_manager, > apache_beam/runners/portability/fn_api_runner/fn_runner.py:508: in _run_stage > bundle_manager) > 
apache_beam/runners/portability/fn_api_runner/fn_runner.py:546: in _run_bundle > data_input, data_output, input_timers, expected_timer_output) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:927: in > process_bundle > timer_inputs)): > /usr/lib/python3.5/concurrent/futures/_base.py:556: in result_iterator > yield future.result() > /usr/lib/python3.5/concurrent/futures/_base.py:405: in result > return self.__get_result() > /usr/lib/python3.5/concurrent/futures/_base.py:357: in __get_result > raise self._exception > apache_beam/utils/thread_pool_executor.py:44: in run > self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs)) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:923: in execute > dry_run) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:826: in > process_bundle > result_future = self._worker_handler.control_conn.push(process_bundle_req) > apache_beam/runners/portability/fn_api_runner/worker_handlers.py:353: in push > response = self.worker.do_instruction(request) > apache_beam/runners/worker/sdk_worker.py:471: in do_instruction > getattr(request, request_type), request.instruction_id) > apache_beam/runners/worker/sdk_worker.py:500: in process_bundle > instruction_id, request.process_bundle_descriptor_id) > apache_beam/runners/worker/sdk_worker.py:374: in get > self.data_channel_factory) > apache_beam/runners/worker/bundle_processor.py:782: in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > apache_beam/runners/worker/bundle_processor.py:837: in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True) > apache_beam/runners/worker/bundle_processor.py:836: in > (transform_id, get_operation(transform_id)) for transform_id in sorted( > apache_beam/runners/worker/bundle_processor.py:726: in wrapper > result = cache[args] = func(*args) > apache_beam/runners/worker/bundle_processor.py:820: in get_operation > pcoll_id in 
descriptor.transforms[transform_id].outputs.items() > apache_beam/runners/worker/bundle_processor.py:820: in > pcoll_id in descriptor.transforms[transform_id].outputs.items() > apache_beam/runners/worker/bundle_processor.py:818: in > tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]] > apache_beam/runners/worker/bundle_processor.py:726: in wrapper > result = cache[args] = func(*args) > apache_beam/runners/worker/bundle_processor.py:823: in get_operation > transform_id, transform_consumers) > apache_beam/runners/worker/bundle_processor.py:1108: in create_operation > return creator(self, transform_id, transform_proto, payload, consumers)
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094101#comment-17094101 ] Pablo Estrada commented on BEAM-9832: - [https://builds.apache.org/job/beam_PreCommit_Python_Cron/] seems to indicate even more recent breakage > KeyError: 'No such coder: ' in fn_runner_test > - > > Key: BEAM-9832 > URL: https://issues.apache.org/jira/browse/BEAM-9832 > Project: Beam > Issue Type: Test > Components: sdk-py-core, test-failures >Reporter: Ning Kang >Assignee: Pablo Estrada >Priority: Critical > Time Spent: 0.5h > Remaining Estimate: 0h > > Failed test results can be found > [here|[https://builds.apache.org/job/beam_PreCommit_Python_Commit/12525/]] > > A stack trace: > {code:java} > self = > testMethod=test_read> > def test_read(self): > # Can't use NamedTemporaryFile as a context > # due to https://bugs.python.org/issue14243 > temp_file = tempfile.NamedTemporaryFile(delete=False) > try: > temp_file.write(b'a\nb\nc') > temp_file.close() > with self.create_pipeline() as p: > assert_that( > > p | beam.io.ReadFromText(temp_file.name), equal_to(['a', 'b', > > 'c'])) > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:625: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:529: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:502: in run > self._options).run(False) > apache_beam/pipeline.py:515: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:173: in > run_pipeline > pipeline.to_runner_api(default_environment=self._default_environment)) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:183: in > run_via_runner_api > return self.run_stages(stage_context, stages) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:331: in run_stages > bundle_context_manager, > apache_beam/runners/portability/fn_api_runner/fn_runner.py:508: in 
_run_stage > bundle_manager) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:546: in _run_bundle > data_input, data_output, input_timers, expected_timer_output) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:927: in > process_bundle > timer_inputs)): > /usr/lib/python3.5/concurrent/futures/_base.py:556: in result_iterator > yield future.result() > /usr/lib/python3.5/concurrent/futures/_base.py:405: in result > return self.__get_result() > /usr/lib/python3.5/concurrent/futures/_base.py:357: in __get_result > raise self._exception > apache_beam/utils/thread_pool_executor.py:44: in run > self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs)) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:923: in execute > dry_run) > apache_beam/runners/portability/fn_api_runner/fn_runner.py:826: in > process_bundle > result_future = self._worker_handler.control_conn.push(process_bundle_req) > apache_beam/runners/portability/fn_api_runner/worker_handlers.py:353: in push > response = self.worker.do_instruction(request) > apache_beam/runners/worker/sdk_worker.py:471: in do_instruction > getattr(request, request_type), request.instruction_id) > apache_beam/runners/worker/sdk_worker.py:500: in process_bundle > instruction_id, request.process_bundle_descriptor_id) > apache_beam/runners/worker/sdk_worker.py:374: in get > self.data_channel_factory) > apache_beam/runners/worker/bundle_processor.py:782: in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > apache_beam/runners/worker/bundle_processor.py:837: in create_execution_tree > descriptor.transforms, key=topological_height, reverse=True) > apache_beam/runners/worker/bundle_processor.py:836: in > (transform_id, get_operation(transform_id)) for transform_id in sorted( > apache_beam/runners/worker/bundle_processor.py:726: in wrapper > result = cache[args] = func(*args) > apache_beam/runners/worker/bundle_processor.py:820: in get_operation > pcoll_id in 
descriptor.transforms[transform_id].outputs.items() > apache_beam/runners/worker/bundle_processor.py:820: in > pcoll_id in descriptor.transforms[transform_id].outputs.items() > apache_beam/runners/worker/bundle_processor.py:818: in > tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]] > apache_beam/runners/worker/bundle_processor.py:726: in wrapper > result = cache[args] = func(*args) > apache_beam/runners/worker/bundle_processor.py:823: in get_operation > transform_id, transform_consumers) > apache_beam/runners/worker/bundle_processor.py:1108: in create_operation > return
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094091#comment-17094091 ] Pablo Estrada commented on BEAM-9832: - Nevermind. that wasn't it : )

> KeyError: 'No such coder: ' in fn_runner_test
> ---------------------------------------------
>
> Key: BEAM-9832
> URL: https://issues.apache.org/jira/browse/BEAM-9832
> Project: Beam
> Issue Type: Test
> Components: sdk-py-core, test-failures
> Reporter: Ning Kang
> Assignee: Pablo Estrada
> Priority: Critical
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Failed test results can be found [here|https://builds.apache.org/job/beam_PreCommit_Python_Commit/12525/]
>
> A stack trace:
> {code:java}
> self = <... testMethod=test_read>
>
>   def test_read(self):
>     # Can't use NamedTemporaryFile as a context
>     # due to https://bugs.python.org/issue14243
>     temp_file = tempfile.NamedTemporaryFile(delete=False)
>     try:
>       temp_file.write(b'a\nb\nc')
>       temp_file.close()
>       with self.create_pipeline() as p:
>         assert_that(
>             p | beam.io.ReadFromText(temp_file.name), equal_to(['a', 'b', 'c']))
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:625:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> apache_beam/pipeline.py:529: in __exit__
>     self.run().wait_until_finish()
> apache_beam/pipeline.py:502: in run
>     self._options).run(False)
> apache_beam/pipeline.py:515: in run
>     return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:173: in run_pipeline
>     pipeline.to_runner_api(default_environment=self._default_environment))
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:183: in run_via_runner_api
>     return self.run_stages(stage_context, stages)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:331: in run_stages
>     bundle_context_manager,
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:508: in _run_stage
>     bundle_manager)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:546: in _run_bundle
>     data_input, data_output, input_timers, expected_timer_output)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:927: in process_bundle
>     timer_inputs)):
> /usr/lib/python3.5/concurrent/futures/_base.py:556: in result_iterator
>     yield future.result()
> /usr/lib/python3.5/concurrent/futures/_base.py:405: in result
>     return self.__get_result()
> /usr/lib/python3.5/concurrent/futures/_base.py:357: in __get_result
>     raise self._exception
> apache_beam/utils/thread_pool_executor.py:44: in run
>     self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs))
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:923: in execute
>     dry_run)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:826: in process_bundle
>     result_future = self._worker_handler.control_conn.push(process_bundle_req)
> apache_beam/runners/portability/fn_api_runner/worker_handlers.py:353: in push
>     response = self.worker.do_instruction(request)
> apache_beam/runners/worker/sdk_worker.py:471: in do_instruction
>     getattr(request, request_type), request.instruction_id)
> apache_beam/runners/worker/sdk_worker.py:500: in process_bundle
>     instruction_id, request.process_bundle_descriptor_id)
> apache_beam/runners/worker/sdk_worker.py:374: in get
>     self.data_channel_factory)
> apache_beam/runners/worker/bundle_processor.py:782: in __init__
>     self.ops = self.create_execution_tree(self.process_bundle_descriptor)
> apache_beam/runners/worker/bundle_processor.py:837: in create_execution_tree
>     descriptor.transforms, key=topological_height, reverse=True)
> apache_beam/runners/worker/bundle_processor.py:836: in <...>
>     (transform_id, get_operation(transform_id)) for transform_id in sorted(
> apache_beam/runners/worker/bundle_processor.py:726: in wrapper
>     result = cache[args] = func(*args)
> apache_beam/runners/worker/bundle_processor.py:820: in get_operation
>     pcoll_id in descriptor.transforms[transform_id].outputs.items()
> apache_beam/runners/worker/bundle_processor.py:820: in <...>
>     pcoll_id in descriptor.transforms[transform_id].outputs.items()
> apache_beam/runners/worker/bundle_processor.py:818: in <...>
>     tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]]
> apache_beam/runners/worker/bundle_processor.py:726: in wrapper
>     result = cache[args] = func(*args)
> apache_beam/runners/worker/bundle_processor.py:823: in get_operation
>     transform_id, transform_consumers)
> apache_beam/runners/worker/bundle_processor.py:1108: in create_operation
>     return creator(self, transform_id, transform_proto, payload, consumers)
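The test at the top of the trace avoids using NamedTemporaryFile as a context manager because of https://bugs.python.org/issue14243 (on some platforms the file cannot be reopened by a second reader while still open for writing). A standalone sketch of that pattern, with a plain `open()` standing in for the Beam `ReadFromText` transform used in the real test:

```python
import os
import tempfile

# delete=False so the file survives close() and can be reopened by name;
# this is the workaround for https://bugs.python.org/issue14243.
temp_file = tempfile.NamedTemporaryFile(delete=False)
try:
    temp_file.write(b'a\nb\nc')
    temp_file.close()
    # A second reader (ReadFromText in the actual test) opens the file by name.
    with open(temp_file.name, 'rb') as f:
        lines = f.read().split(b'\n')
    assert lines == [b'a', b'b', b'c']
finally:
    # delete=False means cleanup is our responsibility.
    os.unlink(temp_file.name)
```

The tradeoff is explicit cleanup in the `finally` block, which the real test also carries.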
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094084#comment-17094084 ] Pablo Estrada commented on BEAM-9832: - I am not 100% sure, but this may be a fix: [https://github.com/apache/beam/pull/11546] : P
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094078#comment-17094078 ] Pablo Estrada commented on BEAM-9832: - hmm PR 11514 seems to only deal with monitoring infos... so I'm not sure
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094074#comment-17094074 ] Pablo Estrada commented on BEAM-9832: - I'm suspecting this one: [https://github.com/apache/beam/pull/11514]
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094075#comment-17094075 ] Pablo Estrada commented on BEAM-9832: - (Same test failure is visible from revert [https://github.com/apache/beam/pull/11544] of [https://github.com/apache/beam/pull/11270] )
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094073#comment-17094073 ] Pablo Estrada commented on BEAM-9832: - Breakage seems to have started about 3 days ago: [https://builds.apache.org/job/beam_PreCommit_Python_Commit/]
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094043#comment-17094043 ] Pablo Estrada commented on BEAM-9832: - Reverting PR: [https://github.com/apache/beam/pull/11544]
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094037#comment-17094037 ] Pablo Estrada commented on BEAM-9832: - Okay, I guess it makes sense to roll back. Working on that.

> KeyError: 'No such coder: ' in fn_runner_test
> ---------------------------------------------
>
>                 Key: BEAM-9832
>                 URL: https://issues.apache.org/jira/browse/BEAM-9832
>             Project: Beam
>          Issue Type: Test
>          Components: sdk-py-core, test-failures
>            Reporter: Ning Kang
>            Assignee: Pablo Estrada
>            Priority: Critical
>
> Failed test results can be found [here|https://builds.apache.org/job/beam_PreCommit_Python_Commit/12525/]
>
> A stack trace:
> {code:java}
> self = <... testMethod=test_read>
>
>     def test_read(self):
>       # Can't use NamedTemporaryFile as a context
>       # due to https://bugs.python.org/issue14243
>       temp_file = tempfile.NamedTemporaryFile(delete=False)
>       try:
>         temp_file.write(b'a\nb\nc')
>         temp_file.close()
>         with self.create_pipeline() as p:
>           assert_that(
>               p | beam.io.ReadFromText(temp_file.name), equal_to(['a', 'b', 'c']))
>
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:625:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> apache_beam/pipeline.py:529: in __exit__
>     self.run().wait_until_finish()
> apache_beam/pipeline.py:502: in run
>     self._options).run(False)
> apache_beam/pipeline.py:515: in run
>     return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:173: in run_pipeline
>     pipeline.to_runner_api(default_environment=self._default_environment))
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:183: in run_via_runner_api
>     return self.run_stages(stage_context, stages)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:331: in run_stages
>     bundle_context_manager,
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:508: in _run_stage
>     bundle_manager)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:546: in _run_bundle
>     data_input, data_output, input_timers, expected_timer_output)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:927: in process_bundle
>     timer_inputs)):
> /usr/lib/python3.5/concurrent/futures/_base.py:556: in result_iterator
>     yield future.result()
> /usr/lib/python3.5/concurrent/futures/_base.py:405: in result
>     return self.__get_result()
> /usr/lib/python3.5/concurrent/futures/_base.py:357: in __get_result
>     raise self._exception
> apache_beam/utils/thread_pool_executor.py:44: in run
>     self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs))
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:923: in execute
>     dry_run)
> apache_beam/runners/portability/fn_api_runner/fn_runner.py:826: in process_bundle
>     result_future = self._worker_handler.control_conn.push(process_bundle_req)
> apache_beam/runners/portability/fn_api_runner/worker_handlers.py:353: in push
>     response = self.worker.do_instruction(request)
> apache_beam/runners/worker/sdk_worker.py:471: in do_instruction
>     getattr(request, request_type), request.instruction_id)
> apache_beam/runners/worker/sdk_worker.py:500: in process_bundle
>     instruction_id, request.process_bundle_descriptor_id)
> apache_beam/runners/worker/sdk_worker.py:374: in get
>     self.data_channel_factory)
> apache_beam/runners/worker/bundle_processor.py:782: in __init__
>     self.ops = self.create_execution_tree(self.process_bundle_descriptor)
> apache_beam/runners/worker/bundle_processor.py:837: in create_execution_tree
>     descriptor.transforms, key=topological_height, reverse=True)
> apache_beam/runners/worker/bundle_processor.py:836: in <genexpr>
>     (transform_id, get_operation(transform_id)) for transform_id in sorted(
> apache_beam/runners/worker/bundle_processor.py:726: in wrapper
>     result = cache[args] = func(*args)
> apache_beam/runners/worker/bundle_processor.py:820: in get_operation
>     pcoll_id in descriptor.transforms[transform_id].outputs.items()
> apache_beam/runners/worker/bundle_processor.py:820: in <dictcomp>
>     pcoll_id in descriptor.transforms[transform_id].outputs.items()
> apache_beam/runners/worker/bundle_processor.py:818: in <dictcomp>
>     tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]]
> apache_beam/runners/worker/bundle_processor.py:726: in wrapper
>     result = cache[args] = func(*args)
> apache_beam/runners/worker/bundle_processor.py:823: in get_operation
>     transform_id, transform_consumers)
> apache_beam/runners/worker/bundle_processor.py:1108: in create_operation
>     return creator(self, transform_id, transform_proto, payload, consumers)
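The test quoted in the trace above avoids using NamedTemporaryFile as a context manager because of https://bugs.python.org/issue14243. A minimal sketch of that delete=False pattern, outside of Beam (the function name and details here are illustrative, not from the Beam code):

```python
import os
import tempfile

def write_and_read_lines(lines):
    # delete=False keeps the file on disk after close(), so it can be
    # reopened by name (the same workaround used in test_read above).
    temp_file = tempfile.NamedTemporaryFile(delete=False)
    try:
        temp_file.write(b'\n'.join(line.encode('utf-8') for line in lines))
        temp_file.close()  # close first so other readers can open it by name
        with open(temp_file.name, 'rb') as f:
            return f.read().split(b'\n')
    finally:
        # delete=False means cleanup is our responsibility.
        os.unlink(temp_file.name)
```

For example, `write_and_read_lines(['a', 'b', 'c'])` returns `[b'a', b'b', b'c']`, mirroring the `equal_to(['a', 'b', 'c'])` assertion in the quoted test.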
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094030#comment-17094030 ] Pablo Estrada commented on BEAM-9832: - Hm, I haven't been able to figure out how an erroneous output is appearing here...

> KeyError: 'No such coder: ' in fn_runner_test
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17093973#comment-17093973 ] Pablo Estrada commented on BEAM-9832: - Working on this. Interestingly enough, it's very hard to reproduce this on my machine. It takes about 15 runs per failure.

> KeyError: 'No such coder: ' in fn_runner_test
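A failure rate of roughly one failure per 15 runs, as described in the comment above, can be estimated by rerunning the check in a loop. This is a generic sketch, not Beam tooling; `check` is a hypothetical stand-in for whatever actually invokes the flaky test (for example, a pytest subprocess):

```python
def failure_rate(check, runs=100):
    """Run a zero-argument callable `runs` times; treat any exception as a
    failure and return the observed failure fraction."""
    failures = 0
    for _ in range(runs):
        try:
            check()
        except Exception:
            failures += 1
    return failures / runs
```

With a failure rate near 1/15, around 100 runs are needed before the observed rate becomes meaningful, which matches how tedious local reproduction was here.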
[jira] [Commented] (BEAM-9832) KeyError: 'No such coder: ' in fn_runner_test
[ https://issues.apache.org/jira/browse/BEAM-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17093947#comment-17093947 ] Pablo Estrada commented on BEAM-9832: - Hm, sorry about the trouble. Starting to take a look...

> KeyError: 'No such coder: ' in fn_runner_test
[jira] [Updated] (BEAM-9812) WriteToBigQuery issue causes pipelines with multiple load jobs to work erroneously
[ https://issues.apache.org/jira/browse/BEAM-9812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9812: Fix Version/s: 2.21.0 > WriteToBigQuery issue causes pipelines with multiple load jobs to work > erroneously > -- > > Key: BEAM-9812 > URL: https://issues.apache.org/jira/browse/BEAM-9812 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9812) WriteToBigQuery issue causes pipelines with multiple load jobs to work erroneously
[ https://issues.apache.org/jira/browse/BEAM-9812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9812: Status: Open (was: Triage Needed) > WriteToBigQuery issue causes pipelines with multiple load jobs to work > erroneously > -- > > Key: BEAM-9812 > URL: https://issues.apache.org/jira/browse/BEAM-9812 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9812) WriteToBigQuery issue causes pipelines with multiple load jobs to work erroneously
Pablo Estrada created BEAM-9812: --- Summary: WriteToBigQuery issue causes pipelines with multiple load jobs to work erroneously Key: BEAM-9812 URL: https://issues.apache.org/jira/browse/BEAM-9812 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090157#comment-17090157 ] Pablo Estrada commented on BEAM-9745: - Nevermind; on a723148858b421f7df08c74f11432b4aeb2f561c the tests still didn't pass, but this time they hung for a long time.

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-9745
>                 URL: https://issues.apache.org/jira/browse/BEAM-9745
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp, java-fn-execution, sdk-java-harness, test-failures
>            Reporter: Daniel Oliveira
>            Assignee: Pablo Estrada
>            Priority: Blocker
>              Labels: currently-failing
>             Fix For: 2.21.0
>
> _Use this form to file an issue for test failure:_
> * [Jenkins Job|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4657/]
> * [Gradle Build Scan|https://scans.gradle.com/s/c3izncsa4u24k/tests/by-project]
>
> Initial investigation:
> The bug appears to be popping up on BigQuery tests mostly, but also a BigTable and a Datastore test.
> Here's an example stacktrace of the two errors, showing _only_ the error messages themselves. Source: [https://scans.gradle.com/s/c3izncsa4u24k/tests/efn4wciuamvqq-ccxt3jvofvqbe]
> {noformat}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -191: java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for instruction -191: java.lang.IllegalArgumentException: unable to deserialize Custom DoFn With Execution Info
> ...
> Caused by: java.lang.ClassNotFoundException: org.apache.beam.sdk.io.gcp.bigquery.BatchLoads$3
>
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error received from SDK harness for instruction -206: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> Caused by: java.lang.RuntimeException: Error received from SDK harness for instruction -206: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.IllegalArgumentException: unable to deserialize Custom Coder Bytes
> ...
> Caused by: java.lang.ClassNotFoundException: org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder
> ...
> {noformat}
> Update: Looks like this has been failing as far back as [Apr 4|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4566/] after a long period where the test was consistently timing out since [Mar 31|https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/4546/]. So it's hard to narrow down what commit may have caused this. Plus, the test was failing due to a completely different BigQuery failure before anyway, so it seems like this test will need to be completely fixed from scratch, instead of tracking down a specific breaking change.
>
> _After you've filled out the above details, please [assign the issue to an individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist]. Assignee should [treat test failures as high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test], helping to fix the issue or find a more appropriate owner. See [Apache Beam Post-Commit Policies|https://beam.apache.org/contribute/postcommits-policies]._
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090105#comment-17090105 ] Pablo Estrada commented on BEAM-9745: - FWIW, this issue only seems to affect the Java SDK harness when running in portable mode, which is not a use case that we currently support, so I'm leaning towards "not a release blocker". I'm bisecting to find the commit that causes this issue. For my records:
 * Running on a723148858b421f7df08c74f11432b4aeb2f561c, the issue does not reproduce.
 * Running on 7869455ff38ce4791c5531022ffb75e7f007e06e, the issue reproduces.

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
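The bisection described in the comment above, starting from one known-good and one known-bad commit, is a binary search over the commit range (what `git bisect` automates). A minimal sketch of that search; `reproduces` is a hypothetical predicate that in practice would check out the commit and rerun the failing suite:

```python
def first_bad_commit(commits, reproduces):
    """Binary-search `commits` (ordered oldest first) for the first commit on
    which `reproduces` returns True. Assumes commits[0] is known good and
    commits[-1] is known bad, matching the two commits recorded above."""
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if reproduces(commits[mid]):
            hi = mid  # failure reproduces: first bad commit is at mid or earlier
        else:
            lo = mid  # still good: first bad commit is after mid
    return commits[hi]
```

Each `reproduces` call here stands in for a full test-suite run, so the search needs only about log2(N) runs for N candidate commits.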
[jira] [Created] (BEAM-9787) Send clear error to users trying to use BigQuerySource on FnApi pipelines on Python SDK
Pablo Estrada created BEAM-9787: --- Summary: Send clear error to users trying to use BigQuerySource on FnApi pipelines on Python SDK Key: BEAM-9787 URL: https://issues.apache.org/jira/browse/BEAM-9787 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (BEAM-9745) [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[ https://issues.apache.org/jira/browse/BEAM-9745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086076#comment-17086076 ] Pablo Estrada edited comment on BEAM-9745 at 4/17/20, 9:24 PM: --- This suite seems to have been broken since April 3, if not more. See: [https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/] was (Author: pabloem): This suite seems to have been broken since April 3, if not more. See: [https://builds.apache.org/job/beam_PostCommit_Java_PortabilityApi/] !https://screenshot.googleplex.com/94iMFxiE3X5.png!

> [beam_PostCommit_Java_PortabilityApi] Various GCP IO tests failing, unable to deserialize Custom DoFns and Custom Coders.
[jira] [Updated] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9769: Fix Version/s: 2.21.0 > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
[ https://issues.apache.org/jira/browse/BEAM-9769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9769: Status: Open (was: Triage Needed) > Ensure JSON imports are the default behavior for BigQuerySink and > WriteToBigQuery in Python > --- > > Key: BEAM-9769 > URL: https://issues.apache.org/jira/browse/BEAM-9769 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9769) Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python
Pablo Estrada created BEAM-9769: --- Summary: Ensure JSON imports are the default behavior for BigQuerySink and WriteToBigQuery in Python Key: BEAM-9769 URL: https://issues.apache.org/jira/browse/BEAM-9769 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9763) Make _ReadFromBigQuery public
[ https://issues.apache.org/jira/browse/BEAM-9763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9763: Description: This means removing the underscore from it, but keeping it tagged as experimental. > Make _ReadFromBigQuery public > - > > Key: BEAM-9763 > URL: https://issues.apache.org/jira/browse/BEAM-9763 > Project: Beam > Issue Type: Sub-task > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > > This means removing the underscore from it, but keeping it tagged as > experimental. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9763) Make _ReadFromBigQuery public
Pablo Estrada created BEAM-9763: --- Summary: Make _ReadFromBigQuery public Key: BEAM-9763 URL: https://issues.apache.org/jira/browse/BEAM-9763 Project: Beam Issue Type: Sub-task Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9762) Production-ready BigQuery IOs for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-9762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9762: Status: Open (was: Triage Needed) > Production-ready BigQuery IOs for Python SDK > > > Key: BEAM-9762 > URL: https://issues.apache.org/jira/browse/BEAM-9762 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > > This issue is to track the elements of work necessary to make the new > BigQuery IOs (WriteToBigQuery/FileLoads, WriteToBigQuery/StreamingInserts, > _ReadFromBigQuery) fully productionized, and the main, encouraged way of > using BQ with Beam Python SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9763) Make _ReadFromBigQuery public
[ https://issues.apache.org/jira/browse/BEAM-9763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9763: Status: Open (was: Triage Needed) > Make _ReadFromBigQuery public > - > > Key: BEAM-9763 > URL: https://issues.apache.org/jira/browse/BEAM-9763 > Project: Beam > Issue Type: Sub-task > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9762) Production-ready BigQuery IOs for Python SDK
Pablo Estrada created BEAM-9762: --- Summary: Production-ready BigQuery IOs for Python SDK Key: BEAM-9762 URL: https://issues.apache.org/jira/browse/BEAM-9762 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada This issue is to track the elements of work necessary to make the new BigQuery IOs (WriteToBigQuery/FileLoads, WriteToBigQuery/StreamingInserts, _ReadFromBigQuery) fully productionized, and the main, encouraged way of using BQ with Beam Python SDK. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9741) Split timers to multiple workers in ParallelBundleManager
Pablo Estrada created BEAM-9741: --- Summary: Split timers to multiple workers in ParallelBundleManager Key: BEAM-9741 URL: https://issues.apache.org/jira/browse/BEAM-9741 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9494) Remove workaround for BQ transform for Dataflow
[ https://issues.apache.org/jira/browse/BEAM-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17078776#comment-17078776 ] Pablo Estrada commented on BEAM-9494: - Not resolved, but we can remove the fix version. > Remove workaround for BQ transform for Dataflow > --- > > Key: BEAM-9494 > URL: https://issues.apache.org/jira/browse/BEAM-9494 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Dataflow incorrectly uses the Flatten input PCollection coder when it > performs an optimization instead of the output PCollection coder which can > lead to issues if these coders differ. > > The workaround was introduced in [https://github.com/apache/beam/pull/11103] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9494) Remove workaround for BQ transform for Dataflow
[ https://issues.apache.org/jira/browse/BEAM-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9494: Fix Version/s: (was: 2.21.0) > Remove workaround for BQ transform for Dataflow > --- > > Key: BEAM-9494 > URL: https://issues.apache.org/jira/browse/BEAM-9494 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Luke Cwik >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 2h 40m > Remaining Estimate: 0h > > Dataflow incorrectly uses the Flatten input PCollection coder when it > performs an optimization instead of the output PCollection coder which can > lead to issues if these coders differ. > > The workaround was introduced in [https://github.com/apache/beam/pull/11103] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9691) Ensure Dataflow BQ Native sink are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9691. - Fix Version/s: 2.21.0 Resolution: Fixed > Ensure Dataflow BQ Native sink are not used on FnApi > > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9715) annotations_test fails in some environments
[ https://issues.apache.org/jira/browse/BEAM-9715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9715: Priority: Minor (was: Major) > annotations_test fails in some environments > -- > > Key: BEAM-9715 > URL: https://issues.apache.org/jira/browse/BEAM-9715 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9715) annotations_test fails in some environments
[ https://issues.apache.org/jira/browse/BEAM-9715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9715: Status: Open (was: Triage Needed) > annotations_test fails in some environments > -- > > Key: BEAM-9715 > URL: https://issues.apache.org/jira/browse/BEAM-9715 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9715) annotations_test fails in some environments
Pablo Estrada created BEAM-9715: --- Summary: annotations_test fails in some environments Key: BEAM-9715 URL: https://issues.apache.org/jira/browse/BEAM-9715 Project: Beam Issue Type: Bug Components: sdk-py-core Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9691) Ensure Dataflow BQ Native sink are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9691: Status: Open (was: Triage Needed) > Ensure Dataflow BQ Native sink are not used on FnApi > > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9691) Ensure Dataflow BQ Native sink are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074895#comment-17074895 ] Pablo Estrada commented on BEAM-9691: - Ah, in fact this is for native sink, not source. I have a PR to have avro export for the custom source, but it changes the transform somewhat, so it wouldn't be backwards compatible: [https://github.com/apache/beam/pull/11086] We're still discussing what should be the approach for that. > Ensure Dataflow BQ Native sink are not used on FnApi > > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9691) Ensure Dataflow BQ Native sink are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9691: Summary: Ensure Dataflow BQ Native sink are not used on FnApi (was: Ensure Dataflow BQ Native sources are not used on FnApi) > Ensure Dataflow BQ Native sink are not used on FnApi > > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9691) Ensure Dataflow BQ Native sources are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074892#comment-17074892 ] Pablo Estrada commented on BEAM-9691: - I think the title of the issue is too wide. Sorry about that. This is only meant for now, and for BQSource. - but for BQ IO we're using a Beam Custom source, as the native source is not supported by UW. I do not know if it's supported by the java runner harness. > Ensure Dataflow BQ Native sources are not used on FnApi > --- > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9691) Ensure Dataflow BQ Native sources are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9691: Summary: Ensure Dataflow BQ Native sources are not used on FnApi (was: Ensure Dataflow Native sources are not used on FnApi) > Ensure Dataflow BQ Native sources are not used on FnApi > --- > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9691) Ensure Dataflow Native sources are not used on FnApi
[ https://issues.apache.org/jira/browse/BEAM-9691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9691: Summary: Ensure Dataflow Native sources are not used on FnApi (was: Ensure Dataflow Native sources are not used on FnApi ever) > Ensure Dataflow Native sources are not used on FnApi > > > Key: BEAM-9691 > URL: https://issues.apache.org/jira/browse/BEAM-9691 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9691) Ensure Dataflow Native sources are not used on FnApi ever
Pablo Estrada created BEAM-9691: --- Summary: Ensure Dataflow Native sources are not used on FnApi ever Key: BEAM-9691 URL: https://issues.apache.org/jira/browse/BEAM-9691 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9640) Track PCollection watermark across bundle executions
Pablo Estrada created BEAM-9640: --- Summary: Track PCollection watermark across bundle executions Key: BEAM-9640 URL: https://issues.apache.org/jira/browse/BEAM-9640 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Pablo Estrada Assignee: Pablo Estrada This can be done without relying on the watermark manager for execution. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9639) Abstract bundle execution logic from stage execution logic
[ https://issues.apache.org/jira/browse/BEAM-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9639: Status: Open (was: Triage Needed) > Abstract bundle execution logic from stage execution logic > -- > > Key: BEAM-9639 > URL: https://issues.apache.org/jira/browse/BEAM-9639 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > > The FnApiRunner currently works on a per-stage manner, and does not abstract > single-bundle execution much. This work item is to clearly define the code to > execute a single bundle. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9639) Abstract bundle execution logic from stage execution logic
Pablo Estrada created BEAM-9639: --- Summary: Abstract bundle execution logic from stage execution logic Key: BEAM-9639 URL: https://issues.apache.org/jira/browse/BEAM-9639 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Pablo Estrada Assignee: Pablo Estrada The FnApiRunner currently works on a per-stage manner, and does not abstract single-bundle execution much. This work item is to clearly define the code to execute a single bundle. -- This message was sent by Atlassian Jira (v8.3.4#803005)
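The BEAM-9639/BEAM-9608 sub-tasks above describe pulling single-bundle execution out of stage-level code and wrapping it in context managers. The following is an illustrative stdlib sketch of that shape — none of these names (`bundle_manager`, `run_stage`) are the real FnApiRunner API, just a minimal model of the separation of concerns:

```python
# Illustrative sketch (not the real FnApiRunner code) of separating
# single-bundle execution from stage execution with a context manager.
from contextlib import contextmanager

@contextmanager
def bundle_manager(stage_name, log):
    """Owns per-bundle setup/teardown so stage logic stays bundle-agnostic."""
    log.append(f"start bundle for {stage_name}")
    try:
        yield log.append          # stand-in for a process-element callback
    finally:
        log.append(f"finish bundle for {stage_name}")

def run_stage(stage_name, elements, log):
    # Stage-level code only decides *what* to run; the context manager
    # owns *how* a single bundle is executed and cleaned up.
    with bundle_manager(stage_name, log) as process:
        for element in elements:
            process(f"processed {element}")

log = []
run_stage("stage-1", ["a", "b"], log)
print(log[0])   # -> start bundle for stage-1
print(log[-1])  # -> finish bundle for stage-1
```

The `try/finally` in the context manager is what makes teardown reliable even if a bundle raises, which is the usual motivation for moving this logic out of the per-stage loop.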
[jira] [Resolved] (BEAM-9608) Add context managers for FnApiRunner to manage execution of each bundle
[ https://issues.apache.org/jira/browse/BEAM-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9608. - Fix Version/s: 2.21.0 Resolution: Fixed > Add context managers for FnApiRunner to manage execution of each bundle > --- > > Key: BEAM-9608 > URL: https://issues.apache.org/jira/browse/BEAM-9608 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Fix For: 2.21.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9625) StateServicer should be owned by FnApiRunnerContextManager
[ https://issues.apache.org/jira/browse/BEAM-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9625: Status: Open (was: Triage Needed) > StateServicer should be owned by FnApiRunnerContextManager > -- > > Key: BEAM-9625 > URL: https://issues.apache.org/jira/browse/BEAM-9625 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9625) StateServicer should be owned by FnApiRunnerContextManager
Pablo Estrada created BEAM-9625: --- Summary: StateServicer should be owned by FnApiRunnerContextManager Key: BEAM-9625 URL: https://issues.apache.org/jira/browse/BEAM-9625 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9608) Add context managers for FnApiRunner to manage execution of each bundle
[ https://issues.apache.org/jira/browse/BEAM-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9608: Status: Open (was: Triage Needed) > Add context managers for FnApiRunner to manage execution of each bundle > --- > > Key: BEAM-9608 > URL: https://issues.apache.org/jira/browse/BEAM-9608 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9537. - Fix Version/s: 2.21.0 Resolution: Fixed > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Fix For: 2.21.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9608) Add context managers for FnApiRunner to manage execution of each bundle
Pablo Estrada created BEAM-9608: --- Summary: Add context managers for FnApiRunner to manage execution of each bundle Key: BEAM-9608 URL: https://issues.apache.org/jira/browse/BEAM-9608 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9601) Interactive test_streaming_wordcount failing
Pablo Estrada created BEAM-9601: --- Summary: Interactive test_streaming_wordcount failing Key: BEAM-9601 URL: https://issues.apache.org/jira/browse/BEAM-9601 Project: Beam Issue Type: Bug Components: runner-py-interactive, test-failures Reporter: Pablo Estrada Assignee: Sam Rohde -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9598) _CustomBigQuerySource checks valueprovider when it's not needed
Pablo Estrada created BEAM-9598: --- Summary: _CustomBigQuerySource checks valueprovider when it's not needed Key: BEAM-9598 URL: https://issues.apache.org/jira/browse/BEAM-9598 Project: Beam Issue Type: Bug Components: io-py-gcp, test-failures Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9572) Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates)
[ https://issues.apache.org/jira/browse/BEAM-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9572: Priority: Minor (was: Critical) > Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates) > -- > > Key: BEAM-9572 > URL: https://issues.apache.org/jira/browse/BEAM-9572 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Affects Versions: 2.19.0 >Reporter: José Miguel Rebelo >Assignee: Pablo Estrada >Priority: Minor > Attachments: error.PNG > > > the *WriteToBigQuery* transform should be able to receive the *table* > parameter as a *ValueProvider.* > > When I try to *create and stage* the pipeline with the table parameter as a > ValueProvider, I get the error attached > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9572) Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates)
[ https://issues.apache.org/jira/browse/BEAM-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066016#comment-17066016 ] Pablo Estrada commented on BEAM-9572: - Did you pass the pipeline option `experiments=use_beam_bq_sink`? If so, the pipeline should work. You shouldn't be calling `.get` on the value provider, just passing it to the transform. By passing `WriteToBigQuery(table=extraction_options.tableId)`, it should be enough. > Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates) > -- > > Key: BEAM-9572 > URL: https://issues.apache.org/jira/browse/BEAM-9572 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Affects Versions: 2.19.0 >Reporter: José Miguel Rebelo >Assignee: Pablo Estrada >Priority: Critical > Attachments: error.PNG > > > the *WriteToBigQuery* transform should be able to receive the *table* > parameter as a *ValueProvider.* > > When I try to *create and stage* the pipeline with the table parameter as a > ValueProvider, I get the error attached > -- This message was sent by Atlassian Jira (v8.3.4#803005)
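The advice in the comment above — pass the ValueProvider object itself rather than calling `.get()` at construction time — can be modeled with a small stdlib sketch. The `RuntimeValueProvider` and `write_to_bigquery` below are illustrative stand-ins, not Beam's actual classes:

```python
# Minimal sketch (not Beam's actual classes) of why a runtime-bound value
# provider must be handed to the transform unevaluated: its value only
# exists once the template runs with concrete options.

class RuntimeValueProvider:
    """Placeholder whose value is bound at job run time, not template build time."""
    def __init__(self, name):
        self.name = name
        self._value = None
        self._bound = False

    def bind(self, value):
        # Called by the runner when the job actually starts.
        self._value = value
        self._bound = True

    def get(self):
        if not self._bound:
            raise RuntimeError(
                f"{self.name} is not accessible at pipeline construction time")
        return self._value

table = RuntimeValueProvider("tableId")

# Calling .get() while building the template fails -- this mirrors the
# error users hit when they resolve the provider themselves:
try:
    table.get()
except RuntimeError as e:
    construction_error = str(e)

# Handing the provider object itself to the sink works: the (hypothetical)
# sink defers .get() until the runner has bound the real value.
def write_to_bigquery(table_provider):
    return f"writing to {table_provider.get()}"

table.bind("project:dataset.events")
print(write_to_bigquery(table))  # -> writing to project:dataset.events
```

This is the same reason `WriteToBigQuery(table=extraction_options.tableId)` works in a Dataflow template while `extraction_options.tableId.get()` at construction time does not.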
[jira] [Resolved] (BEAM-9572) Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates)
[ https://issues.apache.org/jira/browse/BEAM-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada resolved BEAM-9572. - Fix Version/s: Not applicable Resolution: Fixed > Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates) > -- > > Key: BEAM-9572 > URL: https://issues.apache.org/jira/browse/BEAM-9572 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Affects Versions: 2.19.0 >Reporter: José Miguel Rebelo >Assignee: Pablo Estrada >Priority: Minor > Fix For: Not applicable > > Attachments: error.PNG > > > the *WriteToBigQuery* transform should be able to receive the *table* > parameter as a *ValueProvider.* > > When I try to *create and stage* the pipeline with the table parameter as a > ValueProvider, I get the error attached > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9572) Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates)
[ https://issues.apache.org/jira/browse/BEAM-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Estrada updated BEAM-9572: Component/s: (was: beam-model) (was: beam-community) io-py-gcp > Allow RuntimeValueProviders for WriteToBigQuery transform (dataflow templates) > -- > > Key: BEAM-9572 > URL: https://issues.apache.org/jira/browse/BEAM-9572 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Affects Versions: 2.19.0 >Reporter: José Miguel Rebelo >Assignee: Pablo Estrada >Priority: Critical > Attachments: error.PNG > > > the *WriteToBigQuery* transform should be able to receive the *table* > parameter as a *ValueProvider.* > > When I try to *create and stage* the pipeline with the table parameter as a > ValueProvider, I get the error attached > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-2572) Implement an S3 filesystem for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061974#comment-17061974 ] Pablo Estrada commented on BEAM-2572: - Thanks for catching that Badrul! Would you file a JIRA to update them or just update them? On Mon, Mar 16, 2020 at 2:32 PM Badrul Chowdhury (Jira) > Implement an S3 filesystem for Python SDK > - > > Key: BEAM-2572 > URL: https://issues.apache.org/jira/browse/BEAM-2572 > Project: Beam > Issue Type: Task > Components: sdk-py-core >Reporter: Dmitry Demeshchuk >Priority: Minor > Labels: GSoC2019, gsoc, gsoc2019, mentor, outreachy19dec > Fix For: 2.19.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > There are two paths worth exploring, to my understanding: > 1. Sticking to the HDFS-based approach (like it's done in Java). > 2. Using boto/boto3 for accessing S3 through its common API endpoints. > I personally prefer the second approach, for a few reasons: > 1. In real life, HDFS and S3 have different consistency guarantees, therefore > their behaviors may contradict each other in some edge cases (say, we write > something to S3, but it's not immediately accessible for reading from another > end). > 2. There are other AWS-based sources and sinks we may want to create in the > future: DynamoDB, Kinesis, SQS, etc. > 3. boto3 already provides somewhat good logic for basic things like > reattempting. > Whatever path we choose, there's another problem related to this: we > currently cannot pass any global settings (say, pipeline options, or just an > arbitrary kwarg) to a filesystem. Because of that, we'd have to setup the > runner nodes to have AWS keys set up in the environment, which is not trivial > to achieve and doesn't look too clean either (I'd rather see one single place > for configuring the runner options). 
> Also, it's worth mentioning that I already have a janky S3 filesystem > implementation that only supports DirectRunner at the moment (because of the > previous paragraph). I'm perfectly fine finishing it myself, with some > guidance from the maintainers. > Where should I move on from here, and whose input should I be looking for? > Thanks! -- This message was sent by Atlassian Jira (v8.3.4#803005)
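The configuration problem raised above — no way to pass pipeline options or arbitrary kwargs into a filesystem, forcing AWS keys into the environment — can be sketched with a scheme-keyed registry whose factories accept shared options. All names here (`FileSystemRegistry`, `FakeS3FileSystem`) are hypothetical, not Beam's `FileSystems` API:

```python
# Hypothetical sketch of the missing piece discussed above: letting a
# filesystem receive configuration (e.g. AWS credentials) from pipeline
# options instead of ambient environment variables.

class FileSystemRegistry:
    """Maps URL schemes to filesystem factories that accept shared options."""
    def __init__(self):
        self._factories = {}

    def register(self, scheme, factory):
        self._factories[scheme] = factory

    def get(self, path, options):
        # Dispatch on the scheme prefix, threading options through.
        scheme = path.split("://", 1)[0]
        return self._factories[scheme](options)

class FakeS3FileSystem:
    """Stand-in for a boto3-backed filesystem; records the keys it was given."""
    def __init__(self, options):
        self.access_key = options.get("aws_access_key_id")
        self.secret_key = options.get("aws_secret_access_key")

registry = FileSystemRegistry()
registry.register("s3", FakeS3FileSystem)

fs = registry.get("s3://bucket/key",
                  {"aws_access_key_id": "AKIA-EXAMPLE",
                   "aws_secret_access_key": "secret"})
print(fs.access_key)  # -> AKIA-EXAMPLE
```

The design point is that credentials travel with the pipeline's own configuration rather than being injected into every runner node's environment, which is the cleanliness concern the reporter raises.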
[jira] [Created] (BEAM-9537) Refactor FnApiRunner into its own package
Pablo Estrada created BEAM-9537: --- Summary: Refactor FnApiRunner into its own package Key: BEAM-9537 URL: https://issues.apache.org/jira/browse/BEAM-9537 Project: Beam Issue Type: Sub-task Components: sdk-py-core Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9533) Replace *-gcp/*-aws tox suites with *-cloud suites to run unit tests for both
Pablo Estrada created BEAM-9533: --- Summary: Replace *-gcp/*-aws tox suites with *-cloud suites to run unit tests for both Key: BEAM-9533 URL: https://issues.apache.org/jira/browse/BEAM-9533 Project: Beam Issue Type: Bug Components: testing Reporter: Pablo Estrada Assignee: Pablo Estrada Currently there are `py37-gcp` and `py37-aws` test suites. Let's consolidate them into `py37-cloud`, along with the other suites (`py35-gcp`, `py27-gcp`, etc.). -- This message was sent by Atlassian Jira (v8.3.4#803005)
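The consolidation described above could look roughly like the following tox fragment. This is an illustrative sketch only — the environment name follows the issue, but the actual extras and commands come from Beam's real `tox.ini`:

```ini
; Hypothetical sketch: one py37-cloud env replacing py37-gcp and py37-aws
; by installing both dependency groups so one run covers both IO suites.
[testenv:py37-cloud]
extras = gcp,aws
```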
[jira] [Comment Edited] (BEAM-9484) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits
[ https://issues.apache.org/jira/browse/BEAM-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061257#comment-17061257 ] Pablo Estrada edited comment on BEAM-9484 at 3/17/20, 11:34 PM: [~chamikara] lmk if you need a hand looking into this, or if you got it. Feel free to assign to me too. was (Author: pabloem): [~chamikara] lmk if you need a hand looking into this, or if you got it. > apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is > flaky in DirectRunner Postcommits > > > Key: BEAM-9484 > URL: https://issues.apache.org/jira/browse/BEAM-9484 > Project: Beam > Issue Type: Bug > Components: io-py-gcp, test-failures >Reporter: Valentyn Tymofieiev >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > > From https://builds.apache.org/job/beam_PostCommit_Python37_PR/99/: > {noformat} > == > 04:40:28 FAIL: test_big_query_write > (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests) > 04:40:28 > -- > 04:40:28 Traceback (most recent call last): > 04:40:28File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py", > line 167, in test_big_query_write > 04:40:28 write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY)) > 04:40:28File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py", > line 522, in __exit__ > 04:40:28 self.run().wait_until_finish() > 04:40:28File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py", > line 495, in run > 04:40:28 self._options).run(False) > 04:40:28File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py", > line 508, in run > 04:40:28 return self.runner.run_pipeline(self, self._options) > 04:40:28File > 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > line 53, in run_pipeline > 04:40:28 hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 04:40:28 AssertionError: > 04:40:28 Expected: (Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, > 'привет')]) > 04:40:28 but: Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, > 'привет')] Actual data is [] > 04:40:28 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9484) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits
[ https://issues.apache.org/jira/browse/BEAM-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061257#comment-17061257 ]

Pablo Estrada commented on BEAM-9484:
-------------------------------------

[~chamikara] lmk if you need a hand looking into this, or if you got it.

> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-9484
>                 URL: https://issues.apache.org/jira/browse/BEAM-9484
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp, test-failures
>            Reporter: Valentyn Tymofieiev
>            Assignee: Chamikara Madhusanka Jayalath
>            Priority: Major
>
> From https://builds.apache.org/job/beam_PostCommit_Python37_PR/99/:
> {noformat}
> ======================================================================
> 04:40:28 FAIL: test_big_query_write (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> 04:40:28 ----------------------------------------------------------------------
> 04:40:28 Traceback (most recent call last):
> 04:40:28   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py", line 167, in test_big_query_write
> 04:40:28     write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> 04:40:28   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py", line 522, in __exit__
> 04:40:28     self.run().wait_until_finish()
> 04:40:28   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py", line 495, in run
> 04:40:28     self._options).run(False)
> 04:40:28   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py", line 508, in run
> 04:40:28     return self.runner.run_pipeline(self, self._options)
> 04:40:28   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", line 53, in run_pipeline
> 04:40:28     hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 04:40:28 AssertionError:
> 04:40:28 Expected: (Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 'привет')])
> 04:40:28      but: Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 'привет')] Actual data is []
> 04:40:28
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
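The failure mode above (expected rows present, "Actual data is []") is consistent with the verification query running before the freshly written rows become visible in BigQuery. One common mitigation for such eventual-consistency flakes is to poll the source until the expected rows appear or a deadline passes. The sketch below is a minimal, hypothetical helper in that spirit — `wait_for_rows` and `read_rows` are stand-in names, not Beam or BigQuery APIs:

```python
import time


def wait_for_rows(read_rows, expected, timeout=300, poll_interval=15):
    """Poll an eventually-consistent source until `expected` rows appear.

    `read_rows` is a zero-argument callable returning the rows currently
    visible (e.g. the result of a fresh query). Raises AssertionError if
    the rows never match within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        actual = sorted(read_rows())
        if actual == sorted(expected):
            return actual
        if time.monotonic() >= deadline:
            raise AssertionError(
                'Expected data is %r, actual data is %r'
                % (sorted(expected), actual))
        time.sleep(poll_interval)
```

Whether a retrying matcher is the right fix for this particular test is for the assignee to judge; the sketch only illustrates the general pattern of bounding the retry with a deadline rather than asserting on the first read.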
[jira] [Commented] (BEAM-9532) test_last_updated is failing in s3io_test
[ https://issues.apache.org/jira/browse/BEAM-9532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061253#comment-17061253 ]

Pablo Estrada commented on BEAM-9532:
-------------------------------------

cc: [~mattmorgis] fyi

> test_last_updated is failing in s3io_test
> -----------------------------------------
>
>                 Key: BEAM-9532
>                 URL: https://issues.apache.org/jira/browse/BEAM-9532
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-aws
>            Reporter: Pablo Estrada
>            Priority: Major
>
> The timestamps are not set correctly. For some reason they are one hour apart:
>
> ======================================================================
> FAIL: test_last_updated (apache_beam.io.aws.s3io_test.TestS3IO)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/pabloem/codes/meam/sdks/python/apache_beam/io/aws/s3io_test.py", line 125, in test_last_updated
>     self.assertAlmostEqual(result, time.time(), delta=tolerance)
> AssertionError: 1584481946.282874 != 1584485546.2829826 within 300 delta
> ----------------------------------------------------------------------
>
> Note that 1584485546.2829826 - 1584481946.282874 is 3600 seconds (one hour).
[jira] [Created] (BEAM-9532) test_last_updated is failing in s3io_test
Pablo Estrada created BEAM-9532:
-----------------------------------

             Summary: test_last_updated is failing in s3io_test
                 Key: BEAM-9532
                 URL: https://issues.apache.org/jira/browse/BEAM-9532
             Project: Beam
          Issue Type: Bug
          Components: io-py-aws
            Reporter: Pablo Estrada


The timestamps are not set correctly. For some reason they are one hour apart:

======================================================================
FAIL: test_last_updated (apache_beam.io.aws.s3io_test.TestS3IO)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pabloem/codes/meam/sdks/python/apache_beam/io/aws/s3io_test.py", line 125, in test_last_updated
    self.assertAlmostEqual(result, time.time(), delta=tolerance)
AssertionError: 1584481946.282874 != 1584485546.2829826 within 300 delta
----------------------------------------------------------------------

Note that 1584485546.2829826 - 1584481946.282874 is 3600 seconds (one hour).
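An exactly-3600-second skew is the classic signature of converting a UTC timestamp through `time.mktime()`, which interprets its `struct_time` argument as *local* time. Whether that is the actual bug in `s3io` is an assumption — the sketch below only reproduces the mechanism, with an illustrative date rather than the one from the failure:

```python
import calendar
import time
from datetime import datetime, timezone

# A UTC "LastModified" value such as S3 reports (the date is illustrative).
last_modified = datetime(2020, 3, 17, 21, 52, 26, tzinfo=timezone.utc)

# Buggy pattern: time.mktime() treats the tuple as *local* time, so on any
# host outside UTC the result is skewed by the host's standard UTC offset
# (e.g. exactly 3600 seconds in a UTC+1 zone, matching the failure above).
skewed = time.mktime(last_modified.utctimetuple())

# Correct pattern: calendar.timegm() treats the tuple as UTC.
correct = calendar.timegm(last_modified.utctimetuple())

# The two results differ by the host's standard UTC offset, time.timezone
# (which is 0 on a UTC host, where the bug would never surface).
print(skewed - correct == time.timezone)
```

This would also explain why such a test can pass on a UTC-configured CI host and fail on a developer machine in another time zone.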
[jira] [Updated] (BEAM-9494) Remove workaround for BQ transform for Dataflow
[ https://issues.apache.org/jira/browse/BEAM-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pablo Estrada updated BEAM-9494:
--------------------------------
    Fix Version/s: 2.21.0
                       (was: 2.20.0)

> Remove workaround for BQ transform for Dataflow
> -----------------------------------------------
>
>                 Key: BEAM-9494
>                 URL: https://issues.apache.org/jira/browse/BEAM-9494
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp
>            Reporter: Luke Cwik
>            Assignee: Pablo Estrada
>            Priority: Minor
>             Fix For: 2.21.0
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Dataflow incorrectly uses the Flatten input PCollection coder when it performs an optimization, instead of the output PCollection coder, which can lead to issues if these coders differ.
>
> The workaround was introduced in [https://github.com/apache/beam/pull/11103]
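The description turns on the input and output PCollection coders of a Flatten disagreeing. As a hedged, Beam-free illustration of why that matters, the sketch below uses `pickle` and `json` purely as stand-ins for two different coders — these are not Beam's coders, and the real Dataflow failure mode may differ in detail, but the core problem is the same: bytes produced by one coder cannot be decoded by another.

```python
import json
import pickle

value = {'id': 3, 'name': 'abc'}

# Upstream encodes with coder A (pickle here, standing in for the coder of
# the Flatten's *input* PCollection).
encoded = pickle.dumps(value)

# Downstream decodes with coder B (json here, standing in for the coder of
# the *output* PCollection). When the two coders differ, decoding fails --
# the class of problem the workaround guarded against.
try:
    json.loads(encoded)
    mismatch_detected = False
except ValueError:  # JSONDecodeError / UnicodeDecodeError
    mismatch_detected = True

# Decoding with the *matching* coder round-trips cleanly.
assert pickle.loads(encoded) == value
```

This is why a runner must pair each encoded element with the coder it was encoded under; picking the wrong side of a Flatten silently breaks that pairing.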