This is your daily summary of Beam's current high-priority issues that may need
attention.
See https://beam.apache.org/contribute/issue-priorities for the meaning and
expectations around issue priorities.

Unassigned P1 Issues:
https://github.com/apache/beam/issues/32873 [Task]: Document Jinja
templatization for yaml
https://github.com/apache/beam/issues/32832 [Failing Test]: PreCommit Yaml
Xlang Direct is broken
https://github.com/apache/beam/issues/32809 The PostCommit Python Xlang IO
Direct job is flaky
https://github.com/apache/beam/issues/32794 The PostCommit Python Examples
Flink job is flaky
https://github.com/apache/beam/issues/32509 [Bug]: Unable to Restart Google
Spanner Change Streams Consumer due to tableExists(table_name) bug
https://github.com/apache/beam/issues/31929 The PerformanceTests Kafka IO job
is flaky
https://github.com/apache/beam/issues/31800 [Bug]: PreparePubsubWriteDoFn does
not guarantee that a message will not exceed hard request limits
https://github.com/apache/beam/issues/31254 [Failing Test]: Onnx inference unit
tests are failing.
https://github.com/apache/beam/issues/31203 [Failing Test]: PostCommit Python
and PostCommit Python Arm perma red
https://github.com/apache/beam/issues/30799 The PostCommit Python Dependency
job is flaky
https://github.com/apache/beam/issues/30606 The PostCommit Java Nexmark
Dataflow job is flaky
https://github.com/apache/beam/issues/30527 The PostCommit Java IO Performance
Tests job is flaky
https://github.com/apache/beam/issues/30526 The PerformanceTests xlang KafkaIO
Python job is flaky
https://github.com/apache/beam/issues/30525 The PostCommit Python
ValidatesContainer Dataflow With RC job is flaky
https://github.com/apache/beam/issues/30521 The LoadTests Go Combine Flink
Batch job is flaky
https://github.com/apache/beam/issues/30520 The LoadTests Python Combine Flink
Streaming job is flaky
https://github.com/apache/beam/issues/30517 The PostCommit XVR Direct job is
flaky
https://github.com/apache/beam/issues/30507 The LoadTests Go GBK Flink Batch
job is flaky
https://github.com/apache/beam/issues/30506 The TypeScript Tests job is flaky
https://github.com/apache/beam/issues/30502 The LoadTests Go CoGBK Flink Batch
job is flaky
https://github.com/apache/beam/issues/29971 [Bug]: FixedWindows not working for
large Kafka topic
https://github.com/apache/beam/issues/29926 [Bug]: FileIO: lack of timeouts may
cause the pipeline to get stuck indefinitely
https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java SDK Harness
doesn't update user counters in OnTimer callback functions
https://github.com/apache/beam/issues/28760 [Bug]: EFO Kinesis IO reader
provided by apache beam does not pick the event time for watermarking
https://github.com/apache/beam/issues/28383 [Failing Test]:
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorkerTest.testMaxThreadMetric
https://github.com/apache/beam/issues/28326 Bug:
apache_beam.io.gcp.pubsublite.ReadFromPubSubLite not working
https://github.com/apache/beam/issues/27892 [Bug]: ignoreUnknownValues not
working when using CreateDisposition.CREATE_IF_NEEDED
https://github.com/apache/beam/issues/27616 [Bug]: Unable to use
applyRowMutations() in bigquery IO apache beam java
https://github.com/apache/beam/issues/27486 [Bug]: Read from datastore with
inequality filters
https://github.com/apache/beam/issues/27314 [Failing Test]:
bigquery.StorageApiSinkCreateIfNeededIT.testCreateManyTables[1]
https://github.com/apache/beam/issues/27238 [Bug]: Window trigger has lag when
using Kafka and GroupByKey on Dataflow Runner
https://github.com/apache/beam/issues/26911 [Bug]: UNNEST ARRAY with a nested
ROW (described below)
https://github.com/apache/beam/issues/26343 [Bug]:
apache_beam.io.gcp.bigquery_read_it_test.ReadAllBQTests.test_read_queries is
flaky
https://github.com/apache/beam/issues/26329 [Bug]: BigQuerySourceBase does not
propagate a Coder to AvroSource
https://github.com/apache/beam/issues/26041 [Bug]: Unable to create
exactly-once Flink pipeline with stream source and file sink
https://github.com/apache/beam/issues/25946 [Task]: Support more Beam portable
schema types as Python types
https://github.com/apache/beam/issues/24776 [Bug]: Race condition in Python SDK
Harness ProcessBundleProgress
https://github.com/apache/beam/issues/24313 [Flaky]:
apache_beam/runners/portability/portable_runner_test.py::PortableRunnerTestWithSubprocesses::test_pardo_state_with_custom_key_coder
https://github.com/apache/beam/issues/23709 [Flake]: Spark batch flakes in
ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElement and
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder
will drop message id and orderingKey
https://github.com/apache/beam/issues/22913 [Bug]:
beam_PostCommit_Java_ValidatesRunner_Flink is flakes in
org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for
dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/21706 Flaky timeout in github Python unit
test action
StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer
https://github.com/apache/beam/issues/21643 FnRunnerTest with non-trivial
(order 1000 elements) numpy input flakes in non-cython environment
https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table
destinations returns wrong tableId
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit
data at GC time
https://github.com/apache/beam/issues/21121
apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it
flakey
https://github.com/apache/beam/issues/21104 Flaky:
apache_beam.runners.portability.fn_api_runner.fn_runner_test.FnApiRunnerTestWithGrpcAndMultiWorkers
https://github.com/apache/beam/issues/20976
apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics
is flaky
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit
empty pane when it should
https://github.com/apache/beam/issues/19814 Flink streaming flakes in
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful and
ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful

P1 Issues with no update in the last week:
https://github.com/apache/beam/issues/32627 The Go tests job is flaky
https://github.com/apache/beam/issues/31040 [Bug]: ReadAllFiles does not fully
read gzipped files from GCS
https://github.com/apache/beam/issues/30519 The PostCommit XVR GoUsingJava
Dataflow job is flaky
https://github.com/apache/beam/issues/29515 [Bug]: WriteToFiles in python leave
few records in temp directory when writing to large number (100+) of files
https://github.com/apache/beam/issues/25975 [Bug]: KinesisIO processing-time
watermarking can cause data loss