Re: Gradle Task Configuration Avoidance
Nice! I believe at some point in the past we made a pass to try to convert our stuff to this model. I wonder if we can prevent it proactively somehow, like disabling the legacy way of creating tasks or something.

Kenn

On Mon, Dec 5, 2022 at 6:25 AM Kerry Donny-Clark via dev <dev@beam.apache.org> wrote:

> Thanks Damon! I really appreciate how clear your emails are here. Instead
> of my usual feeling of "I don't quite understand, and don't have time to
> get context" I can read all the context in the mail.
> This error message had confused me, so I really appreciate the cleanup and
> explanation.
Re: Achievement unlocked: fully triaged
I definitely think reducing the label zoo could help. We have a lot of labels that are decompositions of what used to be Jira components.

Kenn

On Mon, Dec 5, 2022 at 12:17 PM Danny McCormick via dev wrote:

> > Previously, we had automation that would automatically mark
> > self-assigned self-reported issues as triaged. That is probably a third of
> > issues or more.
>
> I believe that automation exists now[1], but it wasn't retroactively
> applied to old issues.
>
> > One issue is that a lot of triage work is getting the labels right (a
> > lot of things end up in beam-model or beam-community)
>
> Do you think it would help to cut down on our label options?
> beam-community might be popular because it's the default option, so
> reducing options might not help that much unfortunately.
>
> [1] example - https://github.com/apache/beam/issues/24521
Re: Achievement unlocked: fully triaged
> Previously, we had automation that would automatically mark self-assigned
> self-reported issues as triaged. That is probably a third of issues or more.

I believe that automation exists now[1], but it wasn't retroactively applied to old issues.

> One issue is that a lot of triage work is getting the labels right (a lot of
> things end up in beam-model or beam-community)

Do you think it would help to cut down on our label options? beam-community might be popular because it's the default option, so reducing options might not help that much unfortunately.

[1] example - https://github.com/apache/beam/issues/24521

On Mon, Dec 5, 2022 at 2:57 PM Kenneth Knowles wrote:

> Previously, we had automation that would automatically mark self-assigned
> self-reported issues as triaged. That is probably a third of issues or
> more. I'm not sure what else. I appreciate Valentyn keeping an eye on the
> Python label. One issue is that a lot of triage work is getting the labels
> right (a lot of things end up in beam-model or beam-community)
>
> Kenn
Re: Achievement unlocked: fully triaged
I do a regular look at the go label myself. Partly because that's the best way to learn what to fix next.

On Mon, Dec 5, 2022, 11:57 AM Kenneth Knowles wrote:

> Previously, we had automation that would automatically mark self-assigned
> self-reported issues as triaged. That is probably a third of issues or
> more. I'm not sure what else. I appreciate Valentyn keeping an eye on the
> Python label. One issue is that a lot of triage work is getting the labels
> right (a lot of things end up in beam-model or beam-community)
>
> Kenn
Re: Achievement unlocked: fully triaged
Previously, we had automation that would automatically mark self-assigned self-reported issues as triaged. That is probably a third of issues or more. I'm not sure what else. I appreciate Valentyn keeping an eye on the Python label. One issue is that a lot of triage work is getting the labels right (a lot of things end up in beam-model or beam-community)

Kenn

On Mon, Dec 5, 2022 at 6:23 AM Kerry Donny-Clark via dev <dev@beam.apache.org> wrote:

> This is a glorious achievement Kenn! To keep things clean going forward
> are there any improvements we can make in our issue creation flow?
Re: Gradle Task Configuration Avoidance
Thanks Damon! I really appreciate how clear your emails are here. Instead of my usual feeling of "I don't quite understand, and don't have time to get context" I can read all the context in the mail.
This error message had confused me, so I really appreciate the cleanup and explanation.

On Fri, Dec 2, 2022, 7:28 PM Damon Douglas via dev wrote:

> Hello Everyone,
>
> *If you are new to Beam and coming from non-Java language conventions, it
> is likely you are new to gradle. At the end of this email is a list of
> definitions and references to help understand this email.*
>
> *Short Version (For those who know gradle)*:
> A pull request [1] may fix the continual error message "Error: Backend
> initialization required, please run "terraform init"". The PR applies Task
> Configuration Avoidance [2] by changing a few tasks from
> task(String) to tasks.register(String).
>
> *Long Version (For those who are not as familiar with gradle)*:
>
> I write this not as an expert but as someone still learning. Gradle [3]
> is the software we use in the Beam repository to automate many needed tasks
> associated with building and testing code. It is typically used in Java
> projects but can be extended for other purposes. We store code related to
> our Beam Playground [4] that also uses gradle though it is not mainly a
> Java project. The unit of work for Gradle is what is called a task. To
> run a task you open a terminal and type "./gradlew nameOfMyTask". There
> are two main ways to create a custom task in our build.gradle files. One
> is writing task("doSomething") and the other is
> tasks.register("doSomethingElse"). According to [2], the recommendation is
> to use tasks.register("doSomething"). This avoids executing other work
> (configuration but don't worry about it for now) until one runs the
> doSomething task or another task we are running depends on it.
>
> So why were we seeing this "Error: Backend initialization required"
> message all the time? The reason is that tasks were configured as
> task("doSomething"). All I had to do was change this to
> tasks.register("doSomething") and it removed the message.
>
> *Definitions/References*
>
> 1. https://github.com/apache/beam/pull/24509
> 2. https://docs.gradle.org/current/userguide/task_configuration_avoidance.html
> 3. https://docs.gradle.org/current/userguide/what_is_gradle.html
> 4. https://play.beam.apache.org/
>
> *Suggested Learning Path To Understand This Email*
> 1. https://docs.gradle.org/current/samples/sample_building_java_libraries.html
> 2. https://docs.gradle.org/current/userguide/build_lifecycle.html
> 3. https://docs.gradle.org/current/userguide/tutorial_using_tasks.html
> 4. https://docs.gradle.org/current/userguide/task_configuration_avoidance.html
>
> Best,
>
> Damon
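
For readers who want to see the difference Damon describes, here is a minimal build.gradle sketch (Groovy DSL). The task names and printed messages are hypothetical and are not taken from the pull request; how the Playground build actually triggered the terraform message is not shown in this thread. The sketch only illustrates that the body of task(...) is evaluated on every build, while the body of tasks.register(...) is evaluated only when the task is needed.

```groovy
// build.gradle (Groovy DSL). Illustrative only: names and messages are hypothetical.

// Eager creation: this closure is evaluated during the configuration phase
// of every Gradle invocation, even when eagerExample is not requested.
task("eagerExample") {
    println "configuring eagerExample"
    doLast {
        // Runs only when the task itself executes.
        println "running eagerExample"
    }
}

// Lazy registration: this closure is evaluated only if lazyExample is
// requested on the command line or is required by another requested task.
tasks.register("lazyExample") {
    println "configuring lazyExample"
    doLast {
        println "running lazyExample"
    }
}
```

With this file, running an unrelated task such as "./gradlew help" would be expected to print "configuring eagerExample" but not "configuring lazyExample", assuming nothing else in the build forces all tasks to be realized. Any work placed directly in an eager task's configuration block therefore runs on every build, which matches the kind of always-present message described above.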
Re: Achievement unlocked: fully triaged
This is a glorious achievement Kenn! To keep things clean going forward are there any improvements we can make in our issue creation flow?

On Fri, Dec 2, 2022, 6:44 PM Kenneth Knowles wrote:

> Hi all,
>
> I've finally done it! I've emptied the label "awaiting triage". Help me
> keep it that way! This ensures that we actually at least *look* at each
> issue once, preferably soon after it is filed. The idea is that you make
> sure the priority and other labels are right, since users are not expected
> to know how we use labels.
>
> https://github.com/apache/beam/issues?q=is%3Aissue+is%3Aopen+label%3A%22awaiting+triage%22
>
> Kenn
Beam High Priority Issue Report (57)
This is your daily summary of Beam's current high priority issues that may need attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/24383 [Bug]: Daemon will be stopped at the end of the build after the daemon was no longer found in the daemon registry
https://github.com/apache/beam/issues/24367 [Bug]: workflow.tar.gz cannot be passed to flink runner
https://github.com/apache/beam/issues/24313 [Flaky]: apache_beam/runners/portability/portable_runner_test.py::PortableRunnerTestWithSubprocesses::test_pardo_state_with_custom_key_coder
https://github.com/apache/beam/issues/24267 [Failing Test]: Timeout waiting to lock gradle
https://github.com/apache/beam/issues/24263 [Bug]: Remote call on apache-beam-jenkins-3 failed. The channel is closing down or has closed down
https://github.com/apache/beam/issues/23944 beam_PreCommit_Python_Cron regularily failing - test_pardo_large_input flaky
https://github.com/apache/beam/issues/23709 [Flake]: Spark batch flakes in ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElement and ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
https://github.com/apache/beam/issues/22969 Discrepancy in behavior of `DoFn.process()` when `yield` is combined with `return` statement, or vice versa
https://github.com/apache/beam/issues/22961 [Bug]: WriteToBigQuery silently skips most of records without job fail
https://github.com/apache/beam/issues/22913 [Bug]: beam_PostCommit_Java_ValidatesRunner_Flink is flakes in org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState
https://github.com/apache/beam/issues/22321 PortableRunnerTestWithExternalEnv.test_pardo_large_input is regularly failing on jenkins
https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output to Failed Inserts PCollection
https://github.com/apache/beam/issues/21561 ExternalPythonTransformTest.trivialPythonTransform flaky
https://github.com/apache/beam/issues/21480 flake: FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: Connection refused
https://github.com/apache/beam/issues/21462 Flake in org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://github.com/apache/beam/issues/21261 org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer is flaky
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit data at GC time
https://github.com/apache/beam/issues/21121 apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it flakey
https://github.com/apache/beam/issues/21113 testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky
https://github.com/apache/beam/issues/20976 apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics is flaky
https://github.com/apache/beam/issues/20975 org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming: false] is flaky
https://github.com/apache/beam/issues/20974 Python GHA PreCommits flake with grpc.FutureTimeoutError on SDK harness startup
https://github.com/apache/beam/issues/20689 Kafka commitOffsetsInFinalize OOM on Flink
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit empty pane when it should
https://github.com/apache/beam/issues/19814 Flink streaming flakes in ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful and ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
https://github.com/apache/beam/issues/19734 WatchTest.testMultiplePollsWithManyResults flake: Outputs must be in timestamp order (sickbayed)
https://github.com/apache/beam/issues/19465 Explore possibilities to lower in-use IP address quota footprint.
https://github.com/apache/beam/issues/19241 Python Dataflow integration tests should export the pipeline Job ID and console output to Jenkins Test Result section

P1 Issues with no update in the last week:

https://github.com/apache/beam/issues/24100 [Bug]: `Filter.whereFieldName` appears in docs but not available
https://github.com/apache/beam/issues/23906 [Bug]: Dataflow jpms tests fail on the 2.43.0 release branch
https://github.com/apache/beam/issues/23875 [Bug]: beam.Row.__eq__ returns true for unequal rows
https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder will drop message id and orderingKey
https://github.com/apache/beam/issues/23489 [Bug]: add DebeziumIO to the connectors page
https://github.com/apache/beam/issues/23306 [Bug]: BigQueryBatchFileLoads in python loses data when using WRITE_TRUNCATE
https://github.com/apache/beam/issues/23286 [Bug]: beam_PerformanceTests_InfluxDbIO_IT Flaky > 50 % Fa