Beam Dependency Check Report (2023-05-10)
Re: [Notice] Jenkins seed job comment trigger no longer working, and possible solutions
+1 to adding committers to the list manually. Thanks Yi for doing this.

On Thu, May 11, 2023 at 11:48 AM Danny McCormick via dev <dev@beam.apache.org> wrote:

> I'm +1 on just adding committers to a list manually. Having the ability to
> run seed jobs from a PR is nice, but adding a new committer is a rare
> enough event that automating is not worth the time IMO (as opposed to
> documenting this as something to do when you're a new committer). Plus this
> problem goes away entirely if we move to GitHub Actions :)
>
> One thing I'll note: there is an automation route that involves querying
> the teams from the Apache GitHub org. This would require us to upload a
> custom PAT, though, which incurs secret rotation and is more work than it's
> worth IMO.
>
> If we decide to do this, I have https://github.com/apache/beam/pull/26672
> prepared.
>
> Thanks,
> Danny
>
> On Thu, May 11, 2023 at 11:20 AM Yi Hu via dev <dev@beam.apache.org> wrote:
>
>> Dear Beam Developers,
>>
>> tl;dr For PRs involving Jenkins task changes authored by Beam committers,
>> "Run seed job" is no longer working due to an Apache Infra change.
>>
>> Due to a recent Apache Infra change on the LDAP server, Beam Jenkins CI/CD
>> no longer has access to the GitHub username list, and consequently several
>> Jenkins tasks that used to have triggers enabled by committers can no
>> longer be triggered by commenting a phrase on a PR (e.g. "Run seed job").
>>
>> The full list of affected jobs is:
>>
>>    - seed_00_job
>>    - seed_job_standalone
>>    - beam_Publish_Docker_Snapshots
>>    - beam_Dependency_Check
>>    - beam_Metrics_Report
>>
>> Apart from the seed jobs, these are release-related workflows and should
>> not affect development on the code base.
>>
>> I have created a PR to temporarily remove the step of fetching GitHub
>> usernames [2] to get the seed job back green. After that, I would like to
>> ask the community if it is fine to either
>>
>>    - Leave these jobs without a comment trigger (they can still be
>>    manually triggered via steps described in [2], besides the scheduled jobs)
>>    - Maintain a list of committer GitHub usernames manually in
>>    https://github.com/apache/beam/blob/master/.test-infra/jenkins/Committers.groovy
>>
>> Please feel free to share if you have a better idea for fixing this.
>>
>> See more context in
>> [1] https://github.com/apache/beam/issues/26602
>> [2] https://github.com/apache/beam/pull/26652
>>
>> Regards,
>> Yi
>>
>> --
>>
>> Yi Hu, (he/him/his)
>>
>> Software Engineer
Re: [Notice] Jenkins seed job comment trigger no longer working, and possible solutions
I'm +1 on just adding committers to a list manually. Having the ability to run seed jobs from a PR is nice, but adding a new committer is a rare enough event that automating is not worth the time IMO (as opposed to documenting this as something to do when you're a new committer). Plus this problem goes away entirely if we move to GitHub Actions :)

One thing I'll note: there is an automation route that involves querying the teams from the Apache GitHub org. This would require us to upload a custom PAT, though, which incurs secret rotation and is more work than it's worth IMO.

If we decide to do this, I have https://github.com/apache/beam/pull/26672 prepared.

Thanks,
Danny

On Thu, May 11, 2023 at 11:20 AM Yi Hu via dev wrote:

> Dear Beam Developers,
>
> tl;dr For PRs involving Jenkins task changes authored by Beam committers,
> "Run seed job" is no longer working due to an Apache Infra change.
>
> Due to a recent Apache Infra change on the LDAP server, Beam Jenkins CI/CD
> no longer has access to the GitHub username list, and consequently several
> Jenkins tasks that used to have triggers enabled by committers can no
> longer be triggered by commenting a phrase on a PR (e.g. "Run seed job").
>
> The full list of affected jobs is:
>
>    - seed_00_job
>    - seed_job_standalone
>    - beam_Publish_Docker_Snapshots
>    - beam_Dependency_Check
>    - beam_Metrics_Report
>
> Apart from the seed jobs, these are release-related workflows and should
> not affect development on the code base.
>
> I have created a PR to temporarily remove the step of fetching GitHub
> usernames [2] to get the seed job back green. After that, I would like to
> ask the community if it is fine to either
>
>    - Leave these jobs without a comment trigger (they can still be manually
>    triggered via steps described in [2], besides the scheduled jobs)
>    - Maintain a list of committer GitHub usernames manually in
>    https://github.com/apache/beam/blob/master/.test-infra/jenkins/Committers.groovy
>
> Please feel free to share if you have a better idea for fixing this.
>
> See more context in
> [1] https://github.com/apache/beam/issues/26602
> [2] https://github.com/apache/beam/pull/26652
>
> Regards,
> Yi
>
> --
>
> Yi Hu, (he/him/his)
>
> Software Engineer
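
For readers unfamiliar with the automation route mentioned in this message, here is a rough Python sketch of what querying team members from a GitHub org could look like. It is only an illustration and is not taken from the thread or from any Beam PR. The team slug `beam-committers`, the helper names, and the token value are assumptions made for the example. The endpoint shape follows the GitHub REST API's "list team members" call, and the token stands in for the custom PAT whose rotation is the maintenance cost noted above.

```python
import urllib.request
from typing import Dict, List

GITHUB_API = "https://api.github.com"


def build_team_members_request(org: str, team_slug: str, token: str,
                               page: int = 1) -> urllib.request.Request:
    """Build (but do not send) a request listing the members of an org team.

    `token` is a personal access token with permission to read org teams;
    storing and rotating it is the overhead discussed in the thread.
    """
    url = (f"{GITHUB_API}/orgs/{org}/teams/{team_slug}/members"
           f"?per_page=100&page={page}")
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


def parse_logins(members: List[Dict]) -> List[str]:
    """Extract a sorted list of GitHub usernames from the decoded JSON response."""
    return sorted(member["login"] for member in members)


# Hypothetical usage: the team slug and token below are placeholders.
request = build_team_members_request("apache", "beam-committers", "dummy-token")
```

Sending the request with `urllib.request.urlopen(request)` and decoding the JSON body would yield the member objects that `parse_logins` expects. A real job would also need to follow pagination.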
[Notice] Jenkins seed job comment trigger no longer working, and possible solutions
Dear Beam Developers,

tl;dr For PRs involving Jenkins task changes authored by Beam committers, "Run seed job" is no longer working due to an Apache Infra change.

Due to a recent Apache Infra change on the LDAP server, Beam Jenkins CI/CD no longer has access to the GitHub username list, and consequently several Jenkins tasks that used to have triggers enabled by committers can no longer be triggered by commenting a phrase on a PR (e.g. "Run seed job").

The full list of affected jobs is:

- seed_00_job
- seed_job_standalone
- beam_Publish_Docker_Snapshots
- beam_Dependency_Check
- beam_Metrics_Report

Apart from the seed jobs, these are release-related workflows and should not affect development on the code base.

I have created a PR to temporarily remove the step of fetching GitHub usernames [2] to get the seed job back green. After that, I would like to ask the community if it is fine to either

- Leave these jobs without a comment trigger (they can still be manually triggered via steps described in [2], besides the scheduled jobs)
- Maintain a list of committer GitHub usernames manually in https://github.com/apache/beam/blob/master/.test-infra/jenkins/Committers.groovy

Please feel free to share if you have a better idea for fixing this.

See more context in
[1] https://github.com/apache/beam/issues/26602
[2] https://github.com/apache/beam/pull/26652

Regards,
Yi

--
Yi Hu, (he/him/his)
Software Engineer
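
To make the second option concrete, the following Python sketch shows how a manually maintained username list could gate a comment trigger. It is a simplified illustration under stated assumptions and does not reproduce the actual contents of Committers.groovy or the Jenkins plugin logic. The usernames, the function name `should_trigger`, and the exact-match rule for the phrase are all hypothetical.

```python
# Hypothetical usernames; the real list would be maintained by hand in
# .test-infra/jenkins/Committers.groovy.
COMMITTER_GITHUB_USERNAMES = frozenset({
    "example-committer-a",
    "example-committer-b",
})

# Trigger phrase quoted in the message above.
TRIGGER_PHRASE = "Run seed job"


def should_trigger(comment_author: str, comment_body: str,
                   allowlist=COMMITTER_GITHUB_USERNAMES,
                   phrase: str = TRIGGER_PHRASE) -> bool:
    """Return True only if an allowlisted user posted exactly the trigger phrase.

    GitHub usernames are case-insensitive, so both sides are lowercased.
    Matching the phrase exactly (ignoring case and surrounding whitespace)
    is an assumption of this sketch.
    """
    allowed = {name.lower() for name in allowlist}
    if comment_author.lower() not in allowed:
        return False
    return comment_body.strip().lower() == phrase.lower()
```

With this shape, onboarding a new committer means adding one username to the list, which is the manual step the thread proposes documenting.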
Beam High Priority Issue Report (35)
This is your daily summary of Beam's current high priority issues that may need attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities.

Unassigned P0 Issues:

https://github.com/apache/beam/issues/26661 [Bug]: JDBCIO Read without Partition occur GC overhead limit exceeded
https://github.com/apache/beam/issues/26602 [Failing Test]: Seed job permared due to insufficient_access LDAP

Unassigned P1 Issues:

https://github.com/apache/beam/issues/26621 [Failing Test]: beam_PerformanceTests_SparkReceiver_IO failing
https://github.com/apache/beam/issues/26616 [Failing Test]: beam_PostCommit_Java_DataflowV2 SpannerReadIT multiple test failing
https://github.com/apache/beam/issues/26587 [Bug]: BigQuery Copy jobs do not set write disposition to WRITE_APPEND after first copy
https://github.com/apache/beam/issues/26550 [Failing Test]: beam_PostCommit_Java_PVR_Spark_Batch
https://github.com/apache/beam/issues/26547 [Failing Test]: beam_PostCommit_Java_DataflowV2
https://github.com/apache/beam/issues/26354 [Bug]: BigQueryIO direct read not reading all rows when set --setEnableBundling=true
https://github.com/apache/beam/issues/26343 [Bug]: apache_beam.io.gcp.bigquery_read_it_test.ReadAllBQTests.test_read_queries is flaky
https://github.com/apache/beam/issues/26329 [Bug]: BigQuerySourceBase does not propagate a Coder to AvroSource
https://github.com/apache/beam/issues/26041 [Bug]: Unable to create exactly-once Flink pipeline with stream source and file sink
https://github.com/apache/beam/issues/25975 [Bug]: Reducing parallelism in FlinkRunner leads to a data loss
https://github.com/apache/beam/issues/24776 [Bug]: Race condition in Python SDK Harness ProcessBundleProgress
https://github.com/apache/beam/issues/24389 [Failing Test]: HadoopFormatIOElasticTest.classMethod ExceptionInInitializerError ContainerFetchException
https://github.com/apache/beam/issues/24313 [Flaky]: apache_beam/runners/portability/portable_runner_test.py::PortableRunnerTestWithSubprocesses::test_pardo_state_with_custom_key_coder
https://github.com/apache/beam/issues/23944 beam_PreCommit_Python_Cron regularily failing - test_pardo_large_input flaky
https://github.com/apache/beam/issues/23709 [Flake]: Spark batch flakes in ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElement and ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
https://github.com/apache/beam/issues/22913 [Bug]: beam_PostCommit_Java_ValidatesRunner_Flink is flakes in org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/21714 PulsarIOTest.testReadFromSimpleTopic is very flaky
https://github.com/apache/beam/issues/21708 beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing consistently
https://github.com/apache/beam/issues/21706 Flaky timeout in github Python unit test action StatefulDoFnOnDirectRunnerTest.test_dynamic_timer_clear_then_set_timer
https://github.com/apache/beam/issues/21643 FnRunnerTest with non-trivial (order 1000 elements) numpy input flakes in non-cython environment
https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table destinations returns wrong tableId
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: Connection refused
https://github.com/apache/beam/issues/21424 Java VR (Dataflow, V2, Streaming) failing: ParDoTest$TimestampTests/OnWindowExpirationTests
https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not follow spec
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit data at GC time
https://github.com/apache/beam/issues/21121 apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it flakey
https://github.com/apache/beam/issues/21104 Flaky: apache_beam.runners.portability.fn_api_runner.fn_runner_test.FnApiRunnerTestWithGrpcAndMultiWorkers
https://github.com/apache/beam/issues/20976 apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics is flaky
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit empty pane when it should
https://github.com/apache/beam/issues/19814 Flink streaming flakes in ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful and ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
https://github.com/apache/beam/issues/19465 Explore possibilities to lower in-use IP address quota footprint.

P1 Issues with no update in the last week:

https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder will drop message id and orderingKey