Data Engineering Track at ApacheCon (October 3-6, New Orleans) - CfP ends May 23!

2022-05-10 Thread Jarek Potiuk
Hello Beam developers!

ApacheCon North America is back in person this year in October.
https://apachecon.com/acna2022/

Together with Ismaël Mejía, we are organizing a Data Engineering Track for
the first time as part of ApacheCon.

You might be wondering why we need a separate track when there is already a
Big Data track. Simply put, this new track covers the ‘other’ open-source
projects we use to clean data, orchestrate workloads, and handle
observability, visualization, governance, data lineage, and the many other
tasks that are part of data engineering but are usually not covered by the
data processing / database tracks.

If you are curious you can find more details here:
https://s.apache.org/apacheconna-2022-dataeng-track

So why are you getting this message? Well, it could be that (1) you are
already a contributor to a project in the data engineering space and might
be interested in submitting a proposal, or (2) you are interested in
integrating these tools with your existing data tools.

If you are interested you can submit a proposal using the CfP link below.
Don’t forget to choose the Data Engineering Track.
https://apachecon.com/acna2022/cfp.html

The Call for Presentations (CfP) closes in less than two weeks, on May 23rd,
2022.

We are looking forward to receiving your submissions and hopefully seeing
you in New Orleans in October.

Thanks,

Ismaël and Jarek


Flaky test issue report (56)

2022-05-10 Thread Beam Jira Bot
This is your daily summary of Beam's current flaky tests 
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake)
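
For readers who want to see what that link filters on, the percent-encoded
query can be decoded with Python's standard library. This is only an
illustrative sketch: the helper name `extract_jql` is hypothetical and not
part of any Beam or Jira tooling, and the URL is the one quoted above.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the report above.
REPORT_URL = (
    "https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20"
    "statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake"
)


def extract_jql(url: str) -> str:
    """Return the decoded `jql` parameter of a Jira issue-search URL.

    Hypothetical helper for illustration only.
    """
    query = urlsplit(url).query      # everything after the '?'
    return parse_qs(query)["jql"][0]  # parse_qs percent-decodes the value


jql = extract_jql(REPORT_URL)
print(jql)
# prints: project = BEAM AND statusCategory != Done AND labels = flake
```

In other words, the report lists every BEAM issue that is not yet done and
carries the `flake` label.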

These are P1 issues because they have a major negative impact on the community 
and make it hard to determine the quality of the software.

https://issues.apache.org/jira/browse/BEAM-14410: FnRunnerTest with 
non-trivial (order 1000 elements) numpy input flakes in non-cython environment 
(created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14407: Jenkins worker sometimes 
crashes while running Python Flink pipeline (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14367: Flaky timeout in github 
Python unit test action 
StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer (created 
2022-04-26)
https://issues.apache.org/jira/browse/BEAM-14349: GroupByKeyTest BasicTests 
testLargeKeys100MB flake (on ULR) (created 2022-04-21)
https://issues.apache.org/jira/browse/BEAM-14276: 
beam_PostCommit_Java_DataflowV2 failures parent bug (created 2022-04-07)
https://issues.apache.org/jira/browse/BEAM-14269: 
PulsarIOTest.testReadFromSimpleTopic is very flaky (created 2022-04-06)
https://issues.apache.org/jira/browse/BEAM-14263: 
beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing 
consistently (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14252: 
beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors 
(created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14216: Multiple XVR Suites 
having similar flakes simultaneously (created 2022-03-31)
https://issues.apache.org/jira/browse/BEAM-14174: Flink Tests failure :  
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.beam.runners.core.construction.SerializablePipelineOptions  (created 
2022-03-24)
https://issues.apache.org/jira/browse/BEAM-14172: beam_PreCommit_PythonDocs 
failing (jinja2) (created 2022-03-24)
https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests 
failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13859: Test flake: 
test_split_half_sdf (created 2022-02-09)
https://issues.apache.org/jira/browse/BEAM-13850: 
beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming 
Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13810: Flaky tests: Gradle build 
daemon disappeared unexpectedly (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13809: beam_PostCommit_XVR_Flink 
flaky: Connection refused (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13797: Flakes: Failed to load 
cache entry (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13708: flake: 
FlinkRunnerTest.testEnsureStdoutStdErrIsRestored (created 2022-01-20)
https://issues.apache.org/jira/browse/BEAM-13575: Flink 
testParDoRequiresStableInput flaky (created 2021-12-28)
https://issues.apache.org/jira/browse/BEAM-13500: NPE in Flink Portable 
ValidatesRunner streaming suite (created 2021-12-21)
https://issues.apache.org/jira/browse/BEAM-13453: Flake in 
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use 
(created 2021-12-13)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is 
failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13367: 
[beam_PostCommit_Python36] [ 
apache_beam.io.gcp.experimental.spannerio_read_it_test] Failure summary 
(created 2021-12-01)
https://issues.apache.org/jira/browse/BEAM-13312: 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
 is flaky in Java Spark ValidatesRunner suite  (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13311: 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
 is flaky in Java ValidatesRunner Flink suite. (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13237: 
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
 flaky on Dataflow Runner V2 (created 2021-11-12)
https://issues.apache.org/jira/browse/BEAM-13025: pubsublite.ReadWriteIT 
flaky in beam_PostCommit_Java_DataflowV2   (created 2021-10-08)
https://issues.apache.org/jira/browse/BEAM-12928: beam_PostCommit_Python36 
- CrossLanguageSpannerIOTest - flakey failing (created 2021-09-21)
https://issues.apache.org/jira/browse/BEAM-12859: 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12809: 

P1 issues report (79)

2022-05-10 Thread Beam Jira Bot
This is your daily summary of Beam's current P1 issues, not including flaky 
tests 
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20priority%20%3D%20P1%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake)).

See https://beam.apache.org/contribute/jira-priorities/#p1-critical for the 
meaning and expectations around P1 issues.

https://issues.apache.org/jira/browse/BEAM-14447: 
BigQueryWriteIntegrationTests.test_big_query_write_insert_errors_reporting 
failing in Python PostCommit (created 2022-05-09)
https://issues.apache.org/jira/browse/BEAM-14434: 
beam_LoadTests_Python_GBK_reiterate_Dataflow_Streaming failure (created 
2022-05-06)
https://issues.apache.org/jira/browse/BEAM-14429: 
SyntheticUnboundedSource(with SDF) produce duplicate records when split with 
DEFAULT_DESIRED_NUM_SPLITS (created 2022-05-06)
https://issues.apache.org/jira/browse/BEAM-14421: 
--dataflowServiceOptions=use_runner_v2 is broken (created 2022-05-05)
https://issues.apache.org/jira/browse/BEAM-14416: ParDo LoadTest 
performance regression on java streaming dataflow runner v2 (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14412: Block release on 
impersonation FR (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14411: TypeCodersTest is never 
executed (created 2022-05-04)
https://issues.apache.org/jira/browse/BEAM-14390: Java license check is 
broken (created 2022-05-02)
https://issues.apache.org/jira/browse/BEAM-14364: 404s in BigQueryIO don't 
get output to Failed Inserts PCollection (created 2022-04-25)
https://issues.apache.org/jira/browse/BEAM-14356: Java PostCommits: 
BigQueryIO.Read needs a GCS temp location (created 2022-04-22)
https://issues.apache.org/jira/browse/BEAM-14298: Can't resolve 
org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde (created 2022-04-12)
https://issues.apache.org/jira/browse/BEAM-14291: DataflowPipelineResult 
does not raise exception for unsuccessful states. (created 2022-04-11)
https://issues.apache.org/jira/browse/BEAM-14276: 
beam_PostCommit_Java_DataflowV2 failures parent bug (created 2022-04-07)
https://issues.apache.org/jira/browse/BEAM-14275: SpannerWriteIT failing in 
beam PostCommit Java V1 (created 2022-04-07)
https://issues.apache.org/jira/browse/BEAM-14265: Flink should hold the 
watermark at the output timestamp for processing time timers (created 
2022-04-06)
https://issues.apache.org/jira/browse/BEAM-14263: 
beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing 
consistently (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14253: pubsublite.ReadWriteIT 
failing in beam_PostCommit_Java_DataflowV1 and V2 (created 2022-04-05)
https://issues.apache.org/jira/browse/BEAM-14239: Changing the output 
timestamp of a timer does not clear the previously set timer (created 
2022-04-04)
https://issues.apache.org/jira/browse/BEAM-14174: Flink Tests failure :  
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.beam.runners.core.construction.SerializablePipelineOptions  (created 
2022-03-24)
https://issues.apache.org/jira/browse/BEAM-14135: BigQuery Storage API 
insert with writeResult retry and write to error table (created 2022-03-20)
https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests 
failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13950: PVR_Spark2_Streaming 
perma-red (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13920: Beam x-lang Dataflow 
tests failing due to _InactiveRpcError (created 2022-02-10)
https://issues.apache.org/jira/browse/BEAM-13852: 
KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions (created 
2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13850: 
beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming 
Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13805: Simplify version override 
for Dev versions of the Go SDK. (created 2022-02-02)
https://issues.apache.org/jira/browse/BEAM-13747: Add integration testing 
for BQ Storage API  write modes (created 2022-01-26)
https://issues.apache.org/jira/browse/BEAM-13715: Kafka commit offset drop 
data on failure for runners that have non-checkpointing shuffle (created 
2022-01-21)
https://issues.apache.org/jira/browse/BEAM-13487: WriteToBigQuery Dynamic 
table destinations returns wrong tableId (created 2021-12-17)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is 
failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13164: Race between member 
variable being accessed due to leaking uninitialized state via 
OutboundObserverFactory (created 2021-11-01)