Sounds good to me. Perhaps P0s last updated > 36 hours ago (presumably true outages of CI/website/etc. are resolved in ~hours) and P1s last updated > 7 days ago?
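For concreteness, a rough sketch of that cutoff logic (the updated_at and assignees fields are standard on GitHub REST API issue objects; the threshold constants and helper name are just placeholders, not what the automation actually does):

    from datetime import datetime, timedelta, timezone

    P0_CUTOFF = timedelta(hours=36)
    P1_CUTOFF = timedelta(days=7)

    def needs_attention(issue, cutoff):
        """Flag an issue that is unassigned or has gone quiet past its cutoff."""
        # updated_at is ISO 8601 UTC, e.g. "2022-06-22T15:45:00Z"
        updated = datetime.strptime(
            issue["updated_at"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        stale = datetime.now(timezone.utc) - updated > cutoff
        return stale or not issue["assignees"]

Anything the helper flags goes in the report; a P0 or P1 with an owner and recent activity stays out of it.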
On Thu, Jun 23, 2022 at 1:27 PM Brian Hulette <bhule...@google.com> wrote:

> I think that Danny's alternate proposal (a daily email that shows only
> issues last updated >7 days ago, and those with no assignee) fits well with
> the two goals you describe, if we include "triage needed" issues in the
> latter category. Maybe we also explicitly separate these two concerns in
> the report?
>
> On Thu, Jun 23, 2022 at 1:14 PM Kenneth Knowles <k...@apache.org> wrote:
>
>> Forking thread because lots of people may just ignore this topic, per the
>> discussion :-)
>>
>> (sometimes gmail doesn't fork the thread properly, but here's hoping...)
>>
>> I'll add some other outcomes of these emails:
>>
>> - people file P0s that are not outages and P1s that are not data loss,
>> and I downgrade them
>> - I randomly open up a few flaky test bugs and see if I can fix them
>> really quick
>> - people file legit P0s and P1s and I subscribe and follow them
>>
>> Of these, only the last one seems important (not just that *I* follow
>> them, but that new P0s and P1s get immediate attention from many eyes).
>>
>> So maybe one take on the goal is to:
>>
>> - have new P0s and P1s evaluated quickly: P0s are an outage or
>> outage-like occurrence that needs immediate remedy, and P1s need to be
>> evaluated for release blocking, etc.
>> - make sure P0s and P1s get attention appropriate to their priority
>>
>> It can also be helpful to just state the failure modes which would happen
>> by default if we don't have a good process or automation:
>>
>> - Real P0 gets filed and not noticed or fixed in a timely manner,
>> blocking users and/or community in real time
>> - Real P1 gets filed and not noticed, so a release goes out with a known
>> data loss bug or other total loss of functionality
>> - Non-real P0s and P1s accumulate, throwing off our data and making it
>> hard to find the real problems
>> - Flakes are never fixed
>>
>> WDYT?
>>
>> If we have P0s and P1s in the "awaiting triage" state, those are the ones
>> we need to notice. Then for a P0 or P1 outside of that state, we just need
>> some way of making sure it doesn't stagnate. Or if it does stagnate, that
>> empirically demonstrates it isn't really a P1 (just like our P2 to P3
>> downgrade automation). If everything is P1, nothing is, as they say.
>>
>> Kenn
>>
>> On Thu, Jun 23, 2022 at 10:01 AM Danny McCormick <
>> dannymccorm...@google.com> wrote:
>>
>>> > Maybe it would be helpful to sort these by last update time (and
>>> potentially include that information in the email). Then we can at least
>>> prioritize them instead of looking at a big wall of issues.
>>>
>>> I agree that this is a good idea (and pretty trivial to do). I'll update
>>> the automation to do that once we get consensus on an approach.
>>>
>>> > I think the motivation for daily emails is that per the priorities
>>> guide [1] P1 issues should be getting "continuous status updates". If these
>>> issues aren't actually that important, I think the noise is good as it
>>> should motivate us to prioritize them correctly. In practice that hasn't
>>> been happening though...
>>>
>>> I guess the questions here are:
>>>
>>> 1) What is the goal of this email?
>>> 2) Is it effective at accomplishing that goal?
>>>
>>> I think you're saying that the goal (or a goal) is to highlight issues
>>> that aren't getting the attention they need; if that's our goal, then I
>>> don't think this is a particularly effective mechanism for it because (a)
>>> it's very unclear which issues fall into that category and (b) there are too
>>> many to manually go through on a daily basis. From the email alone, it's
>>> not clear to me that any of the issues above "shouldn't" be P1s (though I'd
>>> guess you're right that some/many of them don't belong, since most were
>>> created before the Jira -> GH migration based on the titles). I'd also
>>> argue that a daily email just desensitizes us to them, since there almost
>>> always will be *some* valid P1s that don't need extra attention.
>>>
>>> I do still think this could have value as a weekly email, with the goal
>>> being "it's probably a good idea for someone to take a look at each of
>>> these". Another option would be to only include issues with no action in
>>> the last 7 days and/or no assignees and keep it daily.
>>>
>>> A couple side notes:
>>> - No matter what we do, if we keep the current automation in any form we
>>> should fix the URL from
>>> https://api.github.com/repos/apache/beam/issues/# to
>>> https://github.com/apache/beam/issues/# - the current links are very
>>> annoying (there's a sketch of this fix at the end of this thread).
>>> - After I send this, I will do a pass of the current P1s, since it does
>>> indeed seem like too many are P1s and many should actually be P2s (or
>>> lower).
>>>
>>> Thanks,
>>> Danny
>>>
>>> On Thu, Jun 23, 2022 at 12:21 PM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> I think the motivation for daily emails is that per the priorities
>>>> guide [1] P1 issues should be getting "continuous status updates". If these
>>>> issues aren't actually that important, I think the noise is good as it
>>>> should motivate us to prioritize them correctly. In practice that hasn't
>>>> been happening though...
>>>>
>>>> Maybe it would be helpful to sort these by last update time (and
>>>> potentially include that information in the email). Then we can at least
>>>> prioritize them instead of looking at a big wall of issues.
>>>>
>>>> Brian
>>>>
>>>> [1] https://beam.apache.org/contribute/issue-priorities/
>>>>
>>>> On Thu, Jun 23, 2022 at 6:07 AM Danny McCormick <
>>>> dannymccorm...@google.com> wrote:
>>>>
>>>>> I think a weekly summary seems like a good idea for the P1 issues and
>>>>> flaky tests, though daily still seems appropriate for P0 issues. I put up
>>>>> https://github.com/apache/beam/pull/22017 to just send the P1/flaky
>>>>> test reports on Wednesdays; if anyone objects please let me know - I'll
>>>>> wait on merging til tomorrow to leave time for feedback (and it's always
>>>>> reversible 🙂).
>>>>>
>>>>> Thanks,
>>>>> Danny
>>>>>
>>>>> On Wed, Jun 22, 2022 at 7:05 PM Manu Zhang <owenzhang1...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> What is this daily summary intended for? Not all of these issues look
>>>>>> like P1s. And would a weekly summary be less noisy?
>>>>>>
>>>>>> <beamacti...@gmail.com> wrote on Wed, Jun 22, 2022 at 23:45:
>>>>>>
>>>>>>> This is your daily summary of Beam's current P1 issues, not
>>>>>>> including flaky tests.
>>>>>>>
>>>>>>> See
>>>>>>> https://beam.apache.org/contribute/issue-priorities/#p1-critical
>>>>>>> for the meaning and expectations around P1 issues.
>>>>>>>
>>>>>>> https://api.github.com/repos/apache/beam/issues/21978: [Playground] Implement Share Any Code feature on the frontend
>>>>>>> https://api.github.com/repos/apache/beam/issues/21946: [Bug]: No way to read or write to file when running Beam in Flink
>>>>>>> https://api.github.com/repos/apache/beam/issues/21935: [Bug]: Reject illformed GBK Coders
>>>>>>> https://api.github.com/repos/apache/beam/issues/21897: [Feature Request]: Flink runner savepoint backward compatibility
>>>>>>> https://api.github.com/repos/apache/beam/issues/21893: [Bug]: BigQuery Storage Write API implementation does not support table partitioning
>>>>>>> https://api.github.com/repos/apache/beam/issues/21794: Dataflow runner creates a new timer whenever the output timestamp is change
>>>>>>> https://api.github.com/repos/apache/beam/issues/21763: [Playground Task]: Migrate from Google Analytics to Matomo Cloud
>>>>>>> https://api.github.com/repos/apache/beam/issues/21715: Data missing when using CassandraIO.Read
>>>>>>> https://api.github.com/repos/apache/beam/issues/21713: 404s in BigQueryIO don't get output to Failed Inserts PCollection
>>>>>>> https://api.github.com/repos/apache/beam/issues/21711: Python Streaming job failing to drain with BigQueryIO write errors
>>>>>>> https://api.github.com/repos/apache/beam/issues/21703: pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1 and V2
>>>>>>> https://api.github.com/repos/apache/beam/issues/21702: SpannerWriteIT failing in beam PostCommit Java V1
>>>>>>> https://api.github.com/repos/apache/beam/issues/21700: --dataflowServiceOptions=use_runner_v2 is broken
>>>>>>> https://api.github.com/repos/apache/beam/issues/21695: DataflowPipelineResult does not raise exception for unsuccessful states.
>>>>>>> https://api.github.com/repos/apache/beam/issues/21694: BigQuery Storage API insert with writeResult retry and write to error table
>>>>>>> https://api.github.com/repos/apache/beam/issues/21479: Install Python wheel and dependencies to local venv in SDK harness
>>>>>>> https://api.github.com/repos/apache/beam/issues/21478: KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions
>>>>>>> https://api.github.com/repos/apache/beam/issues/21477: Add integration testing for BQ Storage API write modes
>>>>>>> https://api.github.com/repos/apache/beam/issues/21476: WriteToBigQuery Dynamic table destinations returns wrong tableId
>>>>>>> https://api.github.com/repos/apache/beam/issues/21475: Beam x-lang Dataflow tests failing due to _InactiveRpcError
>>>>>>> https://api.github.com/repos/apache/beam/issues/21473: PVR_Spark2_Streaming perma-red
>>>>>>> https://api.github.com/repos/apache/beam/issues/21466: Simplify version override for Dev versions of the Go SDK.
>>>>>>> https://api.github.com/repos/apache/beam/issues/21465: Kafka commit offset drop data on failure for runners that have non-checkpointing shuffle
>>>>>>> https://api.github.com/repos/apache/beam/issues/21269: Delete orphaned files
>>>>>>> https://api.github.com/repos/apache/beam/issues/21268: Race between member variable being accessed due to leaking uninitialized state via OutboundObserverFactory
>>>>>>> https://api.github.com/repos/apache/beam/issues/21267: WriteToBigQuery submits a duplicate BQ load job if a 503 error code is returned from googleapi
>>>>>>> https://api.github.com/repos/apache/beam/issues/21265: apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
>>>>>>> https://api.github.com/repos/apache/beam/issues/21263: (Broken Pipe induced) Bricked Dataflow Pipeline
>>>>>>> https://api.github.com/repos/apache/beam/issues/21262: Python AfterAny, AfterAll do not follow spec
>>>>>>> https://api.github.com/repos/apache/beam/issues/21260: Python DirectRunner does not emit data at GC time
>>>>>>> https://api.github.com/repos/apache/beam/issues/21259: Consumer group with random prefix
>>>>>>> https://api.github.com/repos/apache/beam/issues/21258: Dataflow error in CombinePerKey operation
>>>>>>> https://api.github.com/repos/apache/beam/issues/21257: Either Create or DirectRunner fails to produce all elements to the following transform
>>>>>>> https://api.github.com/repos/apache/beam/issues/21123: Multiple jobs running on Flink session cluster reuse the persistent Python environment.
>>>>>>> https://api.github.com/repos/apache/beam/issues/21119: Migrate to the next version of Python `requests` when released
>>>>>>> https://api.github.com/repos/apache/beam/issues/21117: "Java IO IT Tests" - missing data in grafana
>>>>>>> https://api.github.com/repos/apache/beam/issues/21115: JdbcIO date conversion is sensitive to OS
>>>>>>> https://api.github.com/repos/apache/beam/issues/21112: Dataflow SocketException (SSLException) error while trying to send message from Cloud Pub/Sub to BigQuery
>>>>>>> https://api.github.com/repos/apache/beam/issues/21111: Java creates an incorrect pipeline proto when core-construction-java jar is not in the CLASSPATH
>>>>>>> https://api.github.com/repos/apache/beam/issues/21110: codecov/patch has poor behavior
>>>>>>> https://api.github.com/repos/apache/beam/issues/21109: SDF BoundedSource seems to execute significantly slower than 'normal' BoundedSource
>>>>>>> https://api.github.com/repos/apache/beam/issues/21108: java.io.InvalidClassException With Flink Kafka
>>>>>>> https://api.github.com/repos/apache/beam/issues/20979: Portable runners should be able to issue checkpoints to Splittable DoFn
>>>>>>> https://api.github.com/repos/apache/beam/issues/20978: PubsubIO.readAvroGenericRecord creates SchemaCoder that fails to decode some Avro logical types
>>>>>>> https://api.github.com/repos/apache/beam/issues/20973: Python Beam SDK Harness hangs when installing pip packages
>>>>>>> https://api.github.com/repos/apache/beam/issues/20818: XmlIO.Read does not handle XML encoding per spec
>>>>>>> https://api.github.com/repos/apache/beam/issues/20814: JmsIO is not acknowledging messages correctly
>>>>>>> https://api.github.com/repos/apache/beam/issues/20813: No trigger early repeatedly for session windows
>>>>>>> https://api.github.com/repos/apache/beam/issues/20812: Cross-language consistency (RequiresStableInputs) is quietly broken (at least on portable flink runner)
>>>>>>> https://api.github.com/repos/apache/beam/issues/20692: Timer with dataflow runner can be set multiple times (dataflow runner)
>>>>>>> https://api.github.com/repos/apache/beam/issues/20691: Beam metrics should be displayed in Flink UI "Metrics" tab
>>>>>>> https://api.github.com/repos/apache/beam/issues/20689: Kafka commitOffsetsInFinalize OOM on Flink
>>>>>>> https://api.github.com/repos/apache/beam/issues/20532: Support for coder argument in WriteToBigQuery
>>>>>>> https://api.github.com/repos/apache/beam/issues/20531: FileBasedSink: allow setting temp directory provider per dynamic destination
>>>>>>> https://api.github.com/repos/apache/beam/issues/20530: Make non-portable Splittable DoFn the only option when executing Java "Read" transforms
>>>>>>> https://api.github.com/repos/apache/beam/issues/20529: SpannerIO tests don't actually assert anything.
>>>>>>> https://api.github.com/repos/apache/beam/issues/20528: python CombineGlobally().with_fanout() cause duplicate combine results for sliding windows
>>>>>>> https://api.github.com/repos/apache/beam/issues/20333: beam_PerformanceTests_Kafka_IO failing due to " provided port is already allocated"
>>>>>>> https://api.github.com/repos/apache/beam/issues/20332: FileIO writeDynamic with AvroIO.sink not writing all data
>>>>>>> https://api.github.com/repos/apache/beam/issues/20330: Remove insecure ssl options from MongoDBIO
>>>>>>> https://api.github.com/repos/apache/beam/issues/20109: SortValues should fail if SecondaryKey coder is not deterministic
>>>>>>> https://api.github.com/repos/apache/beam/issues/20108: Python direct runner doesn't emit empty pane when it should
>>>>>>> https://api.github.com/repos/apache/beam/issues/20009: Environment-sensitive provisioning for Dataflow
>>>>>>> https://api.github.com/repos/apache/beam/issues/19971: [SQL] Some Hive tests throw NullPointerException, but get marked as passing (Direct Runner)
>>>>>>> https://api.github.com/repos/apache/beam/issues/19817: datetime and decimal should be logical types
>>>>>>> https://api.github.com/repos/apache/beam/issues/19815: Add support for remaining data types in python RowCoder
>>>>>>> https://api.github.com/repos/apache/beam/issues/19813: PubsubIO returns empty message bodies for all messages read
>>>>>>> https://api.github.com/repos/apache/beam/issues/19556: User reports protobuf ClassChangeError running against 2.6.0 or above
>>>>>>> https://api.github.com/repos/apache/beam/issues/19369: KafkaIO doesn't commit offsets while being used as bounded source
>>>>>>> https://api.github.com/repos/apache/beam/issues/17950: [Bug]: Java Precommit permared
>>>>>>>
>>>>>>
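P.S. On Danny's side note about the links: the issue objects the automation gets back from the GitHub API carry both "url" (the api.github.com endpoint we currently print) and "html_url" (the browser link), and the list endpoint accepts sort/direction parameters, so both the link fix and Brian's sort-by-last-update suggestion look like small changes. A rough sketch, not the actual report code (it assumes we query by a "P1" label and that `requests` is available):

    import requests

    resp = requests.get(
        "https://api.github.com/repos/apache/beam/issues",
        params={
            "labels": "P1",        # assumes the P1 label is how we query
            "state": "open",
            "sort": "updated",     # sort by last update time...
            "direction": "asc",    # ...stalest first
            "per_page": 100,       # the real script would paginate
        },
    )
    for issue in resp.json():
        # html_url is the browser link, unlike the API "url" we print today
        print(f"{issue['html_url']}: {issue['title']} "
              f"(last updated {issue['updated_at']})")

Printing updated_at alongside each link would also cover Brian's request to include that information in the email.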