> > 2. The links in this report start with api.github.* and don't take us directly to the issues.
>
> Yeah Danny pointed that out as well. I'm assuming he knows how to fix it?

> This is already fixed - Pablo actually beat me to it!
> https://github.com/apache/beam/pull/22033

It also adds a colon right after the URL, and some mail clients treat the colon as part of the URL, which leads to a broken link. Should we just remove the colon there, or add a space in between?

— Alexey
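For concreteness, the two options amount to a one-character change in how each report line is formatted - a minimal sketch in Python (the values and helper here are hypothetical, not the actual report code):

    # Hypothetical formatting sketch, not the actual report generator.
    url = "https://github.com/apache/beam/issues/21978"
    title = "[Playground] Implement Share Any Code feature on the frontend"

    line_a = f"{url} {title}"    # option 1: drop the colon entirely
    line_b = f"{url} : {title}"  # option 2: keep it, with a space before it,
                                 # so mail clients don't fold it into the URL
    print(line_a)
    print(line_b)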
> Thanks,
> Danny
>
> On Thu, Jun 23, 2022 at 8:30 PM Brian Hulette <bhule...@google.com> wrote:
> +1 for that proposal!
>
> > 1. P2 and P3 issues should be noticed and resolved as well. Shall we have a longer time window for the rest of the untriaged or stagnant issues and include them?
>
> I worry these lists would get _very_ long and wouldn't be actionable. But maybe it's worth reporting something like "There are 376 P2's with no update in the last 6 months" with a link to a query?
>
> > 2. The links in this report start with api.github.* and don't take us directly to the issues.
>
> Yeah Danny pointed that out as well. I'm assuming he knows how to fix it?
>
> On Thu, Jun 23, 2022 at 2:37 PM Pablo Estrada <pabl...@google.com> wrote:
> Thanks. I like the proposal, and I've found the emails useful.
> Best
> -P.
>
> On Thu, Jun 23, 2022 at 2:33 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
> Sounds good! It's like our internal reports of JIRA tickets that exceed SLA time with no response from engineers. We either resolve them or downgrade the priority to extend the time window.
>
> Besides,
> 1. P2 and P3 issues should be noticed and resolved as well. Shall we have a longer time window for the rest of the untriaged or stagnant issues and include them?
> 2. The links in this report start with api.github.* and don't take us directly to the issues.
>
> On Fri, Jun 24, 2022 at 04:48, Danny McCormick <dannymccorm...@google.com> wrote:
> That generally sounds right to me - I'd also vote that we consolidate to 1 email and stop distinguishing between flaky P1s and normal P1s.
>
> So the single daily report would be:
>
> - Unassigned P0s
> - P0s with no update in the last 36 hours
> - Unassigned P1s
> - P1s with no update in the last 7 days
>
> I think that will generate a pretty good list of issues that require some kind of action.
>
> On Thu, Jun 23, 2022 at 4:43 PM Kenneth Knowles <k...@apache.org> wrote:
> Sounds good to me. Perhaps P0s > 36 hours ago (presumably they are more like ~hours for true outages of CI/website/etc) and P1s > 7 days?
>
> On Thu, Jun 23, 2022 at 1:27 PM Brian Hulette <bhule...@google.com> wrote:
> I think Danny's alternate proposal (a daily email that shows only issues last updated >7 days ago, and those with no assignee) fits well with the two goals you describe, if we include "triage needed" issues in the latter category. Maybe we should also explicitly separate these two concerns in the report?
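The consolidated report proposed above maps directly onto GitHub issue-search queries. A minimal sketch of how such links could be built, assuming Python automation (the queries and cutoffs are illustrative, not the actual workflow code):

    # Illustrative GitHub search queries for the proposed consolidated
    # report, plus Brian's stale-P2 count reported as a linked query.
    from datetime import datetime, timedelta, timezone
    from urllib.parse import quote_plus

    def cutoff(delta: timedelta) -> str:
        # Date-level granularity: fine for 7 days / 6 months, coarse for 36 hours.
        return (datetime.now(timezone.utc) - delta).strftime("%Y-%m-%d")

    queries = {
        "Unassigned P0s":
            "repo:apache/beam is:issue is:open label:P0 no:assignee",
        "P0s with no update in the last 36 hours":
            f"repo:apache/beam is:issue is:open label:P0 updated:<{cutoff(timedelta(hours=36))}",
        "Unassigned P1s":
            "repo:apache/beam is:issue is:open label:P1 no:assignee",
        "P1s with no update in the last 7 days":
            f"repo:apache/beam is:issue is:open label:P1 updated:<{cutoff(timedelta(days=7))}",
        # Brian's stale-P2 idea: report a count with a link to the query.
        "P2s with no update in the last 6 months":
            f"repo:apache/beam is:issue is:open label:P2 updated:<{cutoff(timedelta(days=182))}",
    }

    for name, q in queries.items():
        print(f"{name}: https://github.com/search?type=issues&q={quote_plus(q)}")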
> On Thu, Jun 23, 2022 at 1:14 PM Kenneth Knowles <k...@apache.org> wrote:
> Forking thread because lots of people may just ignore this topic, per the discussion :-)
>
> (sometimes gmail doesn't fork thread properly, but here's hoping...)
>
> I'll add some other outcomes of these emails:
>
> - people file P0s that are not outages and P1s that are not data loss, and I downgrade them
> - I randomly open up a few flaky test bugs and see if I can fix them really quickly
> - people file legit P0s and P1s, and I subscribe and follow them
>
> Of these, only the last one seems important (not just that *I* follow them, but that new P0s and P1s get immediate attention from many eyes).
>
> So maybe one take on the goal is to:
>
> - have new P0s and P1s evaluated quickly: P0s are an outage or outage-like occurrence that needs immediate remedy, and P1s need to be evaluated for release blocking, etc.
> - make sure P0s and P1s get attention appropriate to their priority
>
> It can also be helpful to just state the failure modes that would happen by default if we don't have a good process or automation:
>
> - A real P0 gets filed and isn't noticed or fixed in a timely manner, blocking users and/or the community in real time
> - A real P1 gets filed and isn't noticed, so a release goes out with a known data loss bug or other total loss of functionality
> - Non-real P0s and P1s accumulate, throwing off our data and making it hard to find the real problems
> - Flakes are never fixed
>
> WDYT?
>
> If we have P0s and P1s in the "awaiting triage" state, those are the ones we need to notice. Then for a P0 or P1 outside of that state, we just need some way of making sure it doesn't stagnate. Or if it does stagnate, that empirically demonstrates it isn't really a P1 (just like our P2-to-P3 downgrade automation). If everything is P1, nothing is, as they say.
>
> Kenn
>
> On Thu, Jun 23, 2022 at 10:01 AM Danny McCormick <dannymccorm...@google.com> wrote:
>
> > Maybe it would be helpful to sort these by last update time (and potentially include that information in the email). Then we can at least prioritize them instead of looking at a big wall of issues.
>
> I agree that this is a good idea (and pretty trivial to do). I'll update the automation to do that once we get consensus on an approach.
>
> > I think the motivation for daily emails is that per the priorities guide [1] P1 issues should be getting "continuous status updates". If these issues aren't actually that important, I think the noise is good as it should motivate us to prioritize them correctly. In practice that hasn't been happening though...
>
> I guess the questions here are:
>
> 1) What is the goal of this email?
> 2) Is it effective at accomplishing that goal?
>
> I think you're saying that the goal (or a goal) is to highlight issues that aren't getting the attention they need. If that's our goal, then I don't think this is a particularly effective mechanism for it, because (a) it's very unclear which issues fall into that category, and (b) there are too many to manually go through on a daily basis. From the email alone, it's not clear to me that any of the issues above "shouldn't" be P1s (though, judging by the titles, I'd guess you're right that some/many of them don't belong, since most were created before the Jira -> GH migration). I'd also argue that a daily email just desensitizes us to them, since there will almost always be some valid P1s that don't need extra attention.
>
> I do still think this could have value as a weekly email, with the goal being "it's probably a good idea for someone to take a look at each of these".
>
> Another option would be to only include issues with no action in the last 7 days and/or no assignees, and keep it daily.
>
> A couple of side notes:
> - No matter what we do, if we keep the current automation in any form we should fix the url from https://api.github.com/repos/apache/beam/issues/# to https://github.com/apache/beam/issues/# - the current links are very annoying.
> - After I send this, I will do a pass over the current P1s, since it does indeed seem like too many are P1s and many should actually be P2s (or lower).
>
> Thanks,
> Danny
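The "sort by last update time" idea is supported natively by GitHub's REST list-issues endpoint (sort=updated). A minimal sketch, assuming a Python report generator (illustrative, not the actual automation):

    # Open P1s, stalest first, with the last-update time in each line.
    import requests

    resp = requests.get(
        "https://api.github.com/repos/apache/beam/issues",
        params={"labels": "P1", "state": "open",
                "sort": "updated", "direction": "asc", "per_page": 100},
    )
    resp.raise_for_status()
    for issue in resp.json():
        if "pull_request" in issue:
            continue  # the issues endpoint also returns PRs; skip them
        # html_url is the browser-facing link; updated_at shows staleness.
        print(f"{issue['updated_at']}  {issue['html_url']}: {issue['title']}")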
> On Thu, Jun 23, 2022 at 12:21 PM Brian Hulette <bhule...@google.com> wrote:
> I think the motivation for daily emails is that, per the priorities guide [1], P1 issues should be getting "continuous status updates". If these issues aren't actually that important, I think the noise is good, as it should motivate us to prioritize them correctly. In practice that hasn't been happening though...
>
> Maybe it would be helpful to sort these by last update time (and potentially include that information in the email). Then we can at least prioritize them instead of looking at a big wall of issues.
>
> Brian
>
> [1] https://beam.apache.org/contribute/issue-priorities/
>
> On Thu, Jun 23, 2022 at 6:07 AM Danny McCormick <dannymccorm...@google.com> wrote:
> I think a weekly summary seems like a good idea for the P1 issues and flaky tests, though daily still seems appropriate for P0 issues. I put up https://github.com/apache/beam/pull/22017 to send the P1/flaky-test reports only on Wednesdays. If anyone objects, please let me know - I'll wait until tomorrow to merge, to leave time for feedback (and it's always reversible 🙂).
>
> Thanks,
> Danny
>
> On Wed, Jun 22, 2022 at 7:05 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
> Hi all,
>
> What is this daily summary intended for? Not all issues look like P1. And would a weekly summary be less noisy?
>
> On Wed, Jun 22, 2022 at 23:45, <beamacti...@gmail.com> wrote:
> This is your daily summary of Beam's current P1 issues, not including flaky tests.
>
> See https://beam.apache.org/contribute/issue-priorities/#p1-critical for the meaning and expectations around P1 issues.
>
> https://api.github.com/repos/apache/beam/issues/21978: [Playground] Implement Share Any Code feature on the frontend
> https://api.github.com/repos/apache/beam/issues/21946: [Bug]: No way to read or write to file when running Beam in Flink
> https://api.github.com/repos/apache/beam/issues/21935: [Bug]: Reject illformed GBK Coders
> https://api.github.com/repos/apache/beam/issues/21897: [Feature Request]: Flink runner savepoint backward compatibility
> https://api.github.com/repos/apache/beam/issues/21893: [Bug]: BigQuery Storage Write API implementation does not support table partitioning
> https://api.github.com/repos/apache/beam/issues/21794: Dataflow runner creates a new timer whenever the output timestamp is change
> https://api.github.com/repos/apache/beam/issues/21763: [Playground Task]: Migrate from Google Analytics to Matomo Cloud
> https://api.github.com/repos/apache/beam/issues/21715: Data missing when using CassandraIO.Read
> https://api.github.com/repos/apache/beam/issues/21713: 404s in BigQueryIO don't get output to Failed Inserts PCollection
> https://api.github.com/repos/apache/beam/issues/21711: Python Streaming job failing to drain with BigQueryIO write errors
> https://api.github.com/repos/apache/beam/issues/21703: pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1 and V2
> https://api.github.com/repos/apache/beam/issues/21702: SpannerWriteIT failing in beam PostCommit Java V1
> https://api.github.com/repos/apache/beam/issues/21700: --dataflowServiceOptions=use_runner_v2 is broken
> https://api.github.com/repos/apache/beam/issues/21695: DataflowPipelineResult does not raise exception for unsuccessful states.
> https://api.github.com/repos/apache/beam/issues/21694: BigQuery Storage API insert with writeResult retry and write to error table
> https://api.github.com/repos/apache/beam/issues/21479: Install Python wheel and dependencies to local venv in SDK harness
> https://api.github.com/repos/apache/beam/issues/21478: KafkaIO.read.withDynamicRead() doesn't pick up new TopicPartitions
> https://api.github.com/repos/apache/beam/issues/21477: Add integration testing for BQ Storage API write modes
> https://api.github.com/repos/apache/beam/issues/21476: WriteToBigQuery Dynamic table destinations returns wrong tableId
> https://api.github.com/repos/apache/beam/issues/21475: Beam x-lang Dataflow tests failing due to _InactiveRpcError
> https://api.github.com/repos/apache/beam/issues/21473: PVR_Spark2_Streaming perma-red
> https://api.github.com/repos/apache/beam/issues/21466: Simplify version override for Dev versions of the Go SDK.
> https://api.github.com/repos/apache/beam/issues/21465: Kafka commit offset drop data on failure for runners that have non-checkpointing shuffle
> https://api.github.com/repos/apache/beam/issues/21269: Delete orphaned files
> https://api.github.com/repos/apache/beam/issues/21268: Race between member variable being accessed due to leaking uninitialized state via OutboundObserverFactory
> https://api.github.com/repos/apache/beam/issues/21267: WriteToBigQuery submits a duplicate BQ load job if a 503 error code is returned from googleapi
> https://api.github.com/repos/apache/beam/issues/21265: apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
> https://api.github.com/repos/apache/beam/issues/21263: (Broken Pipe induced) Bricked Dataflow Pipeline
> https://api.github.com/repos/apache/beam/issues/21262: Python AfterAny, AfterAll do not follow spec
> https://api.github.com/repos/apache/beam/issues/21260: Python DirectRunner does not emit data at GC time
> https://api.github.com/repos/apache/beam/issues/21259: Consumer group with random prefix
> https://api.github.com/repos/apache/beam/issues/21258: Dataflow error in CombinePerKey operation
> https://api.github.com/repos/apache/beam/issues/21257: Either Create or DirectRunner fails to produce all elements to the following transform
> https://api.github.com/repos/apache/beam/issues/21123: Multiple jobs running on Flink session cluster reuse the persistent Python environment.
> https://api.github.com/repos/apache/beam/issues/21119: Migrate to the next version of Python `requests` when released
> https://api.github.com/repos/apache/beam/issues/21117: "Java IO IT Tests" - missing data in grafana
> https://api.github.com/repos/apache/beam/issues/21115: JdbcIO date conversion is sensitive to OS
> https://api.github.com/repos/apache/beam/issues/21112: Dataflow SocketException (SSLException) error while trying to send message from Cloud Pub/Sub to BigQuery
> https://api.github.com/repos/apache/beam/issues/21111: Java creates an incorrect pipeline proto when core-construction-java jar is not in the CLASSPATH
> https://api.github.com/repos/apache/beam/issues/21110: codecov/patch has poor behavior
> https://api.github.com/repos/apache/beam/issues/21109: SDF BoundedSource seems to execute significantly slower than 'normal' BoundedSource
> https://api.github.com/repos/apache/beam/issues/21108: java.io.InvalidClassException With Flink Kafka
> https://api.github.com/repos/apache/beam/issues/20979: Portable runners should be able to issue checkpoints to Splittable DoFn
> https://api.github.com/repos/apache/beam/issues/20978: PubsubIO.readAvroGenericRecord creates SchemaCoder that fails to decode some Avro logical types
> https://api.github.com/repos/apache/beam/issues/20973: Python Beam SDK Harness hangs when installing pip packages
> https://api.github.com/repos/apache/beam/issues/20818: XmlIO.Read does not handle XML encoding per spec
> https://api.github.com/repos/apache/beam/issues/20814: JmsIO is not acknowledging messages correctly
> https://api.github.com/repos/apache/beam/issues/20813: No trigger early repeatedly for session windows
> https://api.github.com/repos/apache/beam/issues/20812: Cross-language consistency (RequiresStableInputs) is quietly broken (at least on portable flink runner)
> https://api.github.com/repos/apache/beam/issues/20692: Timer with dataflow runner can be set multiple times (dataflow runner)
> https://api.github.com/repos/apache/beam/issues/20691: Beam metrics should be displayed in Flink UI "Metrics" tab
> https://api.github.com/repos/apache/beam/issues/20689: Kafka commitOffsetsInFinalize OOM on Flink
> https://api.github.com/repos/apache/beam/issues/20532: Support for coder argument in WriteToBigQuery
> https://api.github.com/repos/apache/beam/issues/20531: FileBasedSink: allow setting temp directory provider per dynamic destination
> https://api.github.com/repos/apache/beam/issues/20530: Make non-portable Splittable DoFn the only option when executing Java "Read" transforms
> https://api.github.com/repos/apache/beam/issues/20529: SpannerIO tests don't actually assert anything.
> https://api.github.com/repos/apache/beam/issues/20528: python CombineGlobally().with_fanout() cause duplicate combine results for sliding windows
> https://api.github.com/repos/apache/beam/issues/20333: beam_PerformanceTests_Kafka_IO failing due to " provided port is already allocated"
> https://api.github.com/repos/apache/beam/issues/20332: FileIO writeDynamic with AvroIO.sink not writing all data
> https://api.github.com/repos/apache/beam/issues/20330: Remove insecure ssl options from MongoDBIO
> https://api.github.com/repos/apache/beam/issues/20109: SortValues should fail if SecondaryKey coder is not deterministic
> https://api.github.com/repos/apache/beam/issues/20108: Python direct runner doesn't emit empty pane when it should
> https://api.github.com/repos/apache/beam/issues/20009: Environment-sensitive provisioning for Dataflow
> https://api.github.com/repos/apache/beam/issues/19971: [SQL] Some Hive tests throw NullPointerException, but get marked as passing (Direct Runner)
> https://api.github.com/repos/apache/beam/issues/19817: datetime and decimal should be logical types
> https://api.github.com/repos/apache/beam/issues/19815: Add support for remaining data types in python RowCoder
> https://api.github.com/repos/apache/beam/issues/19813: PubsubIO returns empty message bodies for all messages read
> https://api.github.com/repos/apache/beam/issues/19556: User reports protobuf ClassChangeError running against 2.6.0 or above
> https://api.github.com/repos/apache/beam/issues/19369: KafkaIO doesn't commit offsets while being used as bounded source
> https://api.github.com/repos/apache/beam/issues/17950: [Bug]: Java Precommit permared
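A footnote on the links above: the api.github.* URLs are the issues' API `url` field, while the browser-facing link is the `html_url` field on the same objects. Where only the API URL is at hand, it can also be rewritten - a minimal sketch (hypothetical helper, not necessarily how PR 22033 fixes it):

    import re

    def to_html_url(api_url: str) -> str:
        # https://api.github.com/repos/apache/beam/issues/21978
        #   -> https://github.com/apache/beam/issues/21978
        return re.sub(
            r"^https://api\.github\.com/repos/(.+?)/issues/(\d+)$",
            r"https://github.com/\1/issues/\2",
            api_url,
        )

    assert (to_html_url("https://api.github.com/repos/apache/beam/issues/21978")
            == "https://github.com/apache/beam/issues/21978")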