This is your daily summary of Beam's current high priority issues that may need
attention.
See https://beam.apache.org/contribute/issue-priorities for the meaning and
expectations around issue priorities.
Unassigned P1 Issues:
https://github.com/apache/beam/issues/23527 [Feature Request]: PubSubLiteIO
custom timestamp policy
https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder
will drop message id and orderingKey
https://github.com/apache/beam/issues/23489 [Bug]: add DebeziumIO to the
connectors page
https://github.com/apache/beam/issues/23449 [Bug]: Flaky test:
org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOTest.testReadFromCustomReceiverWithOffsetFailsAndReread
https://github.com/apache/beam/issues/23306 [Bug]: BigQueryBatchFileLoads in
python loses data when using WRITE_TRUNCATE
https://github.com/apache/beam/issues/23286 [Bug]:
beam_PerformanceTests_InfluxDbIO_IT Flaky > 50 % Fail
https://github.com/apache/beam/issues/22913 [Bug]:
beam_PostCommit_Java_ValidatesRunner_Flink is flakes in
org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState
https://github.com/apache/beam/issues/22891 [Bug]:
beam_PostCommit_XVR_PythonUsingJavaDataflow is flaky
https://github.com/apache/beam/issues/22590 [Bug]: Java Unit Tests fail to find
gradle
https://github.com/apache/beam/issues/22303 [Task]: Add tests to Kafka SDF and
fix known and discovered issues
https://github.com/apache/beam/issues/22299 [Bug]: JDBCIO Write freeze at
getConnection() in WriteFn
https://github.com/apache/beam/issues/22115 [Bug]:
apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses
is flaky
https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output
to Failed Inserts PCollection
https://github.com/apache/beam/issues/21696 Flink Tests failure :
java.lang.NoClassDefFoundError: Could not initialize class
org.apache.beam.runners.core.construction.SerializablePipelineOptions
https://github.com/apache/beam/issues/21645
beam_PostCommit_XVR_GoUsingJava_Dataflow fails on some test transforms
https://github.com/apache/beam/issues/21643 FnRunnerTest with non-trivial
(order 1000 elements) numpy input flakes in non-cython environment
https://github.com/apache/beam/issues/21623 CrossLanguageJdbcIOTest broken with
"Cannot load JDBC driver class 'com.mysql.cj.jdbc.Driver'"
https://github.com/apache/beam/issues/21561
ExternalPythonTransformTest.trivialPythonTransform flaky
https://github.com/apache/beam/issues/21533
SpannerChangeStreamErrorTest.testUnavailableExceptionRetries flaky
https://github.com/apache/beam/issues/21471 Flakes: Failed to load cache entry
https://github.com/apache/beam/issues/21470 Test flake: test_split_half_sdf
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky:
Connection refused
https://github.com/apache/beam/issues/21468
beam_PostCommit_Python_Examples_Dataflow failing
https://github.com/apache/beam/issues/21464 GroupIntoBatchesTest is failing
https://github.com/apache/beam/issues/21463 NPE in Flink Portable
ValidatesRunner streaming suite
https://github.com/apache/beam/issues/21462 Flake in
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://github.com/apache/beam/issues/21424 Java VR (Dataflow, V2, Streaming)
failing: ParDoTest$TimestampTests/OnWindowExpirationTests
https://github.com/apache/beam/issues/21364 Flink load tests fail:
NoClassDefFoundError: MessageBodyReader
https://github.com/apache/beam/issues/21333 Flink testParDoRequiresStableInput
flaky
https://github.com/apache/beam/issues/21271 pubsublite.ReadWriteIT flaky in
beam_PostCommit_Java_DataflowV2
https://github.com/apache/beam/issues/21270
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
flaky on Dataflow Runner V2
https://github.com/apache/beam/issues/21266
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
is flaky in Java ValidatesRunner Flink suite.
https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not
follow spec
https://github.com/apache/beam/issues/21261
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
is flaky
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit
data at GC time
https://github.com/apache/beam/issues/21257 Either Create or DirectRunner fails
to produce all elements to the following transform
https://github.com/apache/beam/issues/21242
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
is flaky in Java Spark ValidatesRunner suite
https://github.com/apache/beam/issues/21123 Multiple jobs running on Flink
session cluster reuse the persistent Python environment.
https://github.com/apache/beam/issues/21121
apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it
flakey
https://github.com/apache/beam/issues/21118
PortableRunnerTestWithExternalEnv.test_pardo_timers flaky
https://github.com/apache/beam/issues/21114 Already Exists: Dataset
apache-beam-testing:python_bq_file_loads_NNN
https://github.com/apache/beam/issues/21113
testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky
https://github.com/apache/beam/issues/21111 Java creates an incorrect pipeline
proto when core-construction-java jar is not in the CLASSPATH
https://github.com/apache/beam/issues/21104 Flaky:
apache_beam.runners.portability.fn_api_runner.fn_runner_test.FnApiRunnerTestWithGrpcAndMultiWorkers
https://github.com/apache/beam/issues/20981 Python precommit flaky: Failed to
read inputs in the data plane
https://github.com/apache/beam/issues/20977 SamzaStoreStateInternalsTest is
flaky
https://github.com/apache/beam/issues/20976
apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics
is flaky
https://github.com/apache/beam/issues/20975
org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming:
false] is flaky
https://github.com/apache/beam/issues/20974 Python GHA PreCommits flake with
grpc.FutureTimeoutError on SDK harness startup
https://github.com/apache/beam/issues/20930 Python Postcommits timing out on
postCommitIT
https://github.com/apache/beam/issues/20815
testTeardownCalledAfterExceptionInProcessElement flakes on direct runner.
https://github.com/apache/beam/issues/20689 Kafka commitOffsetsInFinalize OOM
on Flink
https://github.com/apache/beam/issues/20655 Flink PortableValidatesRunner test
failure: GroupByKeyTest$BasicTests.testLargeKeys10MB
https://github.com/apache/beam/issues/20331
org.apache.beam.sdk.io.mongodb.MongoDbIOTest.testReadWithAggregate is flaky
https://github.com/apache/beam/issues/20269 Flink postcommits failing
testFlattenWithDifferentInputAndOutputCoders2
https://github.com/apache/beam/issues/20189 Spark failing
testFlattenWithDifferentInputAndOutputCoders2
https://github.com/apache/beam/issues/20109 SortValues should fail if
SecondaryKey coder is not deterministic
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit
empty pane when it should
https://github.com/apache/beam/issues/19946 Python PreCommit Failures: Could
not copy file '/some/path/file.egg'
https://github.com/apache/beam/issues/19816
MetricsTest$AttemptedMetricTests.testAllAttemptedMetrics is flaky on
DirectRunner
https://github.com/apache/beam/issues/19814 Flakes in
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful for
Direct, Spark, Flink
P1 Issues with no update in the last week:
https://github.com/apache/beam/issues/23022 [Bug]: PubsubIO does not consider
attributes as part of the limit
https://github.com/apache/beam/issues/22969 Discrepancy in behavior of
`DoFn.process()` when `yield` is combined with `return` statement, or vice versa
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for
dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/22288 [Bug]: BigQueryIO throws
IllegalStateException if dynamicDestionations.getSchema returns null
https://github.com/apache/beam/issues/22192 [Bug]: NullPointerException when
copying files using FileSystems.copy and setting
StandardMoveOptions.SKIP_IF_DESTINATION_EXISTS
https://github.com/apache/beam/issues/22011 [Bug]:
org.apache.beam.sdk.io.aws2.kinesis.KinesisIOWriteTest.testWriteFailure flaky
https://github.com/apache/beam/issues/22010 [Bug]:
org.apache.beam.runners.flink.FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
flaky
https://github.com/apache/beam/issues/22009 [Bug]:
org.apache.beam.examples.subprocess.ExampleEchoPipelineTest.testExampleEchoPipeline
flaky
https://github.com/apache/beam/issues/21893 [Bug]: BigQuery Storage Write API
implementation does not support table partitioning
https://github.com/apache/beam/issues/21711 Python Streaming job failing to
drain with BigQueryIO write errors
https://github.com/apache/beam/issues/21707 GroupByKeyTest BasicTests
testLargeKeys100MB flake (on ULR)
https://github.com/apache/beam/issues/21706 Flaky timeout in github Python unit
test action
StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer
https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table
destinations returns wrong tableId
https://github.com/apache/beam/issues/21474 Flaky tests: Gradle build daemon
disappeared unexpectedly
https://github.com/apache/beam/issues/20814 JmsIO is not acknowledging messages
correctly
https://github.com/apache/beam/issues/20812 Cross-language consistency
(RequiresStableInputs) is quietly broken (at least on portable flink runner)