Data Engineering Track at ApacheCon (October 3-6, New Orleans) - CfP ends 23rd of May!
Hello Beam developers!

ApacheCon North America is back in person this year in October. https://apachecon.com/acna2022/

Together with Ismaël Mejía, we are organizing for the first time a Data Engineering Track as part of ApacheCon. You might be wondering why we need a different track if we already have the Big Data track. Simple: this new track covers the ‘other’ open-source projects we use to clean data, orchestrate workloads, do observability, visualization, governance, data lineage and many other tasks that are part of data engineering and that are usually not covered by the data processing / database tracks. If you are curious, you can find more details here: https://s.apache.org/apacheconna-2022-dataeng-track

So why are you getting this message? Well, it could be that (1) you are already a contributor to a project in the data engineering space and you might be interested in sending your proposal, or (2) you are interested in integrations of these tools with your existing data tools.

If you are interested, you can submit a proposal using the CfP link below. Don’t forget to choose the Data Engineering Track. https://apachecon.com/acna2022/cfp.html

The Call for Presentations (CfP) closes in less than two weeks, on May 23rd, 2022. We are looking forward to receiving your submissions and hopefully seeing you in New Orleans in October.

Thanks,
Ismaël and Jarek
Flaky test issue report (56)
This is your daily summary of Beam's current flaky tests (https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake)

These are P1 issues because they have a major negative impact on the community and make it hard to determine the quality of the software.

https://issues.apache.org/jira/browse/BEAM-14410: FnRunnerTest with non-trivial (order 1000 elements) numpy input flakes in non-cython environment (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14407: Jenkins worker sometimes crashes while running Python Flink pipeline (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14367: Flaky timeout in github Python unit test action StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer (created 2022-04-26)
https://issues.apache.org/jira/browse/BEAM-14349: GroupByKeyTest BasicTests testLargeKeys100MB flake (on ULR) (created 2022-04-21)
https://issues.apache.org/jira/browse/BEAM-14276: beam_PostCommit_Java_DataflowV2 failures parent bug (created 2022-04-07)
https://issues.apache.org/jira/browse/BEAM-14269: PulsarIOTest.testReadFromSimpleTopic is very flaky (created 2022-04-06)
https://issues.apache.org/jira/browse/BEAM-14263: beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing consistently (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14252: beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14216: Multiple XVR Suites having similar flakes simultaneously (created 2022-03-31)
https://issues.apache.org/jira/browse/BEAM-14174: Flink Tests failure : java.lang.NoClassDefFoundError: Could not initialize class org.apache.beam.runners.core.construction.SerializablePipelineOptions (created 2022-03-24)
https://issues.apache.org/jira/browse/BEAM-14172: beam_PreCommit_PythonDocs failing (jinja2) (created 2022-03-24)
https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13859: Test flake: test_split_half_sdf (created 2022-02-09)
https://issues.apache.org/jira/browse/BEAM-13850: beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13810: Flaky tests: Gradle build daemon disappeared unexpectedly (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13809: beam_PostCommit_XVR_Flink flaky: Connection refused (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13797: Flakes: Failed to load cache entry (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13708: flake: FlinkRunnerTest.testEnsureStdoutStdErrIsRestored (created 2022-01-20)
https://issues.apache.org/jira/browse/BEAM-13575: Flink testParDoRequiresStableInput flaky (created 2021-12-28)
https://issues.apache.org/jira/browse/BEAM-13500: NPE in Flink Portable ValidatesRunner streaming suite (created 2021-12-21)
https://issues.apache.org/jira/browse/BEAM-13453: Flake in org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use (created 2021-12-13)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13367: [beam_PostCommit_Python36] [ apache_beam.io.gcp.experimental.spannerio_read_it_test] Failure summary (created 2021-12-01)
https://issues.apache.org/jira/browse/BEAM-13312: org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle is flaky in Java Spark ValidatesRunner suite (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13311: org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful is flaky in Java ValidatesRunner Flink suite. (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13237: org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView flaky on Dataflow Runner V2 (created 2021-11-12)
https://issues.apache.org/jira/browse/BEAM-13025: pubsublite.ReadWriteIT flaky in beam_PostCommit_Java_DataflowV2 (created 2021-10-08)
https://issues.apache.org/jira/browse/BEAM-12928: beam_PostCommit_Python36 - CrossLanguageSpannerIOTest - flakey failing (created 2021-09-21)
https://issues.apache.org/jira/browse/BEAM-12859: org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12809:
P1 issues report (79)
This is your daily summary of Beam's current P1 issues, not including flaky tests (https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20priority%20%3D%20P1%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake).

See https://beam.apache.org/contribute/jira-priorities/#p1-critical for the meaning and expectations around P1 issues.

https://issues.apache.org/jira/browse/BEAM-14447: BigQueryWriteIntegrationTests.test_big_query_write_insert_errors_reporting failing in Python PostCommit (created 2022-05-09)
https://issues.apache.org/jira/browse/BEAM-14434: beam_LoadTests_Python_GBK_reiterate_Dataflow_Streaming failure (created 2022-05-06)
https://issues.apache.org/jira/browse/BEAM-14429: SyntheticUnboundedSource(with SDF) produce duplicate records when split with DEFAULT_DESIRED_NUM_SPLITS (created 2022-05-06)
https://issues.apache.org/jira/browse/BEAM-14421: --dataflowServiceOptions=use_runner_v2 is broken (created 2022-05-05)
https://issues.apache.org/jira/browse/BEAM-14416: ParDo LoadTest performance regression on java streaming dataflow runner v2 (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14412: Block release on impersonation FR (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14411: TypeCodersTest is never executed (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14390: Java license check is broken (created 2022-05-02)
https://issues.apache.org/jira/browse/BEAM-14364: 404s in BigQueryIO don't get output to Failed Inserts PCollection (created 2022-04-25)
https://issues.apache.org/jira/browse/BEAM-14356: Java PostCommits: BigQueryIO.Read needs a GCS temp location (created 2022-04-22)
https://issues.apache.org/jira/browse/BEAM-14298: Can't resolve org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde (created 2022-04-12)
https://issues.apache.org/jira/browse/BEAM-14291: DataflowPipelineResult does not raise exception for unsuccessful states. (created 2022-04-11)
https://issues.apache.org/jira/browse/BEAM-14276: beam_PostCommit_Java_DataflowV2 failures parent bug (created 2022-04-07)
https://issues.apache.org/jira/browse/BEAM-14275: SpannerWriteIT failing in beam PostCommit Java V1 (created 2022-04-07)
https://issues.apache.org/jira/browse/BEAM-14265: Flink should hold the watermark at the output timestamp for processing time timers (created 2022-04-06)
https://issues.apache.org/jira/browse/BEAM-14263: beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing consistently (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14253: pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1 and V2 (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14239: Changing the output timestamp of a timer does not clear the previously set timer (created 2022-04-04)
https://issues.apache.org/jira/browse/BEAM-14174: Flink Tests failure : java.lang.NoClassDefFoundError: Could not initialize class org.apache.beam.runners.core.construction.SerializablePipelineOptions (created 2022-03-24)
https://issues.apache.org/jira/browse/BEAM-14135: BigQuery Storage API insert with writeResult retry and write to error table (created 2022-03-20)
https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13950: PVR_Spark2_Streaming perma-red (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13920: Beam x-lang Dataflow tests failing due to _InactiveRpcError (created 2022-02-10)
https://issues.apache.org/jira/browse/BEAM-13852: KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13850: beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13805: Simplify version override for Dev versions of the Go SDK. (created 2022-02-02)
https://issues.apache.org/jira/browse/BEAM-13747: Add integration testing for BQ Storage API write modes (created 2022-01-26)
https://issues.apache.org/jira/browse/BEAM-13715: Kafka commit offset drop data on failure for runners that have non-checkpointing shuffle (created 2022-01-21)
https://issues.apache.org/jira/browse/BEAM-13487: WriteToBigQuery Dynamic table destinations returns wrong tableId (created 2021-12-17)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13164: Race between member variable being accessed due to leaking uninitialized state via OutboundObserverFactory (created 2021-11-01)