Hi all,

FWIW, a quick and dirty solution is here: http://xsnippet.org/360188/ :)
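To give a rough idea of the approach, here is an illustrative sketch of an untriaged-bot report. It is not the contents of the snippet linked above. The function names, the message layout, and the `limit` parameter are all made up for illustration, and the Launchpad lookup assumes the third-party launchpadlib package with anonymous access.

```python
"""Sketch of an untriaged-bot report: list bugs still in the New state."""


def format_untriaged(project, bugs, limit=5):
    """Build the channel lines for a list of (bug_id, title) pairs.

    The message layout is hypothetical; only the bug URL pattern follows
    the Launchpad links used elsewhere in this thread.
    """
    if not bugs:
        return ["%s: no untriaged bugs" % project]
    lines = ["%s: %d untriaged bug(s)" % (project, len(bugs))]
    for bug_id, title in bugs[:limit]:
        lines.append(
            "  https://bugs.launchpad.net/%s/+bug/%d  %s" % (project, bug_id, title)
        )
    hidden = len(bugs) - limit
    if hidden > 0:
        lines.append("  ... and %d more" % hidden)
    return lines


def fetch_new_bugs(project):
    """Return (bug_id, title) pairs for New bugs (assumption: needs network)."""
    # Imported lazily so the formatting helper works without launchpadlib.
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_anonymously(
        "untriaged-sketch", "production", version="devel"
    )
    tasks = lp.projects[project].searchTasks(status=["New"])
    return [(task.bug.id, task.bug.title) for task in tasks]
```

A bot would call `format_untriaged("nova", fetch_new_bugs("nova"))` on a timer and post the resulting lines to the channel. Keeping the formatting separate from the lookup makes it easy to test without network access.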
Thanks,
Roman

On Fri, Sep 19, 2014 at 2:03 PM, Ben Nemec <openst...@nemebean.com> wrote:
> On 09/19/2014 08:13 AM, Sean Dague wrote:
>> I've spent the better part of the last 2 weeks in the Nova bug tracker
>> to try to turn it into something that doesn't cause people to run away
>> screaming. I don't remember exactly where we started at open bug count 2
>> weeks ago (it was north of 1400, with > 200 bugs in New, but it might
>> have been north of 1600), but as of this email we're at < 1000 open bugs
>> (I'm counting Fix Committed as closed, even though LP does not), and ~0
>> new bugs (depending on the time of the day).
>>
>> == Philosophy in Triaging ==
>>
>> I'm going to lay out the philosophy of triaging I've had, because this
>> may also set the tone going forward.
>>
>> A bug tracker is a tool to help us make a better release. It does not
>> exist for its own good, it exists to help. Which means when evaluating
>> what stays in and what leaves we need to evaluate if any particular
>> artifact will help us make a better release. But also more importantly
>> realize that there is a cost for carrying every artifact in the tracker.
>> Resolving duplicates gets non-linearly harder as the number of artifacts
>> goes up. Triaging gets non-linearly harder as the number of artifacts
>> goes up.
>>
>> With this I was being somewhat pragmatic about closing bugs. An old bug
>> that is just a stacktrace is typically not useful. An old bug that is a
>> vague sentence that we should refactor a particular module (with no
>> specifics on the details) is not useful. A bug reported against a very
>> old version of OpenStack where the code has changed a lot in the
>> relevant area, and there aren't responses from the author, is not
>> useful. Not useful bugs just add debt, and we should get rid of them.
>> That makes the chance of pulling a random bug off the tracker something
>> that you could actually look at fixing, instead of mostly just stalling out.
>>
>> So I closed a lot of stuff as Invalid / Opinion that fell into those camps.
>>
>> == Keeping New Bugs at close to 0 ==
>>
>> After driving the bugs in the New state down to zero last week, I found
>> it's actually pretty easy to keep it at 0.
>>
>> We get 10 - 20 new bugs a day in Nova (during a weekday). Of those ~20%
>> aren't actually a bug, and can be closed immediately. ~30% look like a
>> bug, but don't have anywhere near enough information in them, and
>> flipping them to Incomplete with questions quickly means we have a real
>> chance of getting the right info. ~10% are fixable in < 30 minutes worth
>> of work. And the rest are real bugs, that seem to have enough to dive
>> into, and can be triaged into Confirmed, given a priority, and tagged
>> appropriately for the area.
>>
>> But, more importantly, this means we can filter bug quality on the way
>> in. And we can also encourage bug reporters that are giving us good
>> stuff, or even easy stuff, as we respond quickly.
>>
>> Recommendation #1: we adopt a 0 new bugs policy to keep this from
>> getting away from us in the future.
>
> We have this policy in TripleO, and to help keep it fresh in people's
> minds Roman Podolyaka (IIRC) wrote an untriaged-bot for the IRC channel
> that periodically posts a list of any New bugs. I've found it very
> helpful, so it's probably worth getting that into infra somewhere so
> other people can use it too.
>
>>
>> == Our worst bug reporters are often core reviewers ==
>>
>> I'm going to pick on Dan Prince here, mostly because I have a recent
>> concrete example, however in triaging the bug queue much of the core
>> team is to blame (including myself).
>>
>> https://bugs.launchpad.net/nova/+bug/1368773 is a terrible bug. Also, it
>> was set Incomplete and got no response. I'm almost 100% sure it's a dupe
>> of the multiprocess bug we've been tracking down but it's so terse that
>> you can't get to the bottom of it.
>>
>> There were a ton of 2012 nova bugs that were basically "post it notes".
>> Oh, "we should refactor this function". Full stop. While those are fine
>> for personal tracking, their value goes to zero probably 3 months after
>> they are filed, especially if the reporter stops working on the issue at
>> hand. Nova has plenty of "wouldn't it be great if we... " ideas. I'm not
>> convinced using bugs for those is useful unless we go and close them out
>> aggressively if they stall.
>>
>> Also, if Nova core can't file a good bug, it's hard to set the example
>> for others in our community.
>>
>> Recommendation #2: hey, Nova core, let's be better about filing the kinds
>> of bugs we want to see! mkay!
>>
>> Recommendation #3: Let's create a tag for "personal work items" or
>> something for this class of TODOs people are leaving themselves, which
>> makes them a ton easier to cull later when they stall and no one else has
>> enough context to pick them up.
>>
>> == Tags ==
>>
>> The aggressive tagging that Tracy brought into the project has been
>> awesome. It definitely helps slice out into better functional areas.
>> Here is the top of our current official tag list (and bug count):
>>
>>   95 compute
>>   83 libvirt
>>   74 api
>>   68 vmware
>>   67 network
>>   41 db
>>   40 testing
>>   40 volumes
>>   36 ec2
>>   35 icehouse-backport-potential
>>   32 low-hanging-fruit
>>   31 xenserver
>>   25 ironic
>>   23 hyper-v
>>   16 cells
>>   14 scheduler
>>   12 baremetal
>>    9 ceph
>>    9 security
>>    8 oslo
>>   ...
>>
>> So, good stuff. However I think we probably want to take a further step
>> and attempt to get champions for tags. So that tag owners would ensure
>> their bug list looks sane, and actually spend some time fixing them.
>> It's pretty clear, for instance, that the ec2 bugs are just piling up,
>> with very few fixes coming in. Cells seems like it's in the same camp (a
>> bunch of recent bugs have been cells related, it looks like a lot more
>> deployments are trying it).
>>
>> Probably the most important thing for tag owners would be cleaning up the
>> bugs in the tag. Realizing that 2 bugs were actually the same bug.
>> Cleaning up descriptions / titles / etc so that people can move forward
>> on them.
>>
>> Recommendation #4: create tag champions
>>
>> == Soft Spots ==
>>
>> After looking at probably close to 1000 bugs in 2 weeks I have a
>> particular impression of soft spots that we have.
>>
>> Quotas are kind of a mess. It's not clear that we're even eventually
>> consistent. There are a lot of bugs about creating servers, deleting
>> servers, and leaking quota in the process. I know Jay and Sylvan are
>> diving hard on the resource tracker right now, I think this should be a
>> Kilo focus area because it creates terrible confusion and bugs for people.
>>
>> EC2 has definitely regressed, especially after block device mapping
>> changes, to the point that it's not clear it's functional outside of the
>> most basic server create commands. The EC2 code is largely unchanged
>> since 2012, and only lightly tested, we need to decide if this is
>> important or not, and either fix it or delete it. There have been many
>> past hands going up that said they would help, and then they never do
>> (you know who you are).
>>
>> The VM State machine model is .... Well it's at least suboptimal, but
>> it's also clear that it's massively leaky, and the way we handle it
>> internally means we end up in inconsistent wedges all the time. I expect
>> the complexity here causes a ton of bugs. We need some refactoring to
>> make things a ton more clear about what's supposed to be happening, and
>> how to roll back when they go wrong. I think the Tasks work was headed
>> down that path, but that seems stalled now.
>>
>> Cross interaction with Neutron and Cinder remains racy. We are pretty
>> optimistic on when resources will be available. Even the event interface
>> with Neutron hasn't fully addressed this.
>> I think a really great Design Summit session would be Nova + Neutron +
>> Cinder to figure out a shared architecture to address this. I'd expect
>> this to be at least a double session.
>>
>> Recommendation #5 - 8: we should get on those things :)
>>
>> == Triaging Inconsistencies ==
>>
>> I found some inconsistencies in how people were triaging bugs, and the
>> state inconsistencies probably make the bugs seem more confusing:
>> https://wiki.openstack.org/wiki/BugTriage provides some guidance.
>>
>> Importantly:
>>
>> Incomplete is an Open state. For bugzilla folks this is NEEDSINFO. I saw
>> a bunch of 'closing' comments but a move to Incomplete.
>>
>> Triaged should be used if the solution to fix the bug is in the bug
>> itself. Triaged is Confirmed + a solution in enough detail to fix it.
>>
>> Incomplete bugs should not have assignees or milestones, otherwise they
>> won't time out.
>>
>> == General Cleanup Rules ==
>>
>> Here are some general cleanup rules that I was using:
>>
>> If an Incomplete bug has no response after 30 days it's fair game to
>> close (Invalid, Opinion, Won't Fix).
>>
>> If a bug is In Progress with no patch posted after 30 days, it is not In
>> Progress. Remove assignee, move back to last state (probably Confirmed).
>> Move to Opinion if it's really a "post it note".
>>
>> If a bug is In Progress but the patches were abandoned, it's no longer
>> In Progress. Remove assignee, move back to last state (probably
>> Confirmed). Move to Opinion if it's really a "post it note".
>>
>> == Rescuing Stalled Fixes ==
>>
>> Over the course of this I found a bunch of the In Progress bugs were
>> real issues, with real fixes, that had stalled out for one of a number
>> of reasons. Often it had a -1 'needs unit tests' on it, and it's sort of
>> clear the author didn't really know how to do that for this patch.
>> Other times the author's first language was not English, and the patch
>> commit message was confusing enough that no one understood what it was
>> fixing. (One of these bugs I restored, rewrote the commit message, and
>> then it sailed through the process.)
>>
>> Recommendation #9: if you are going to -1 for unit tests, please go the
>> extra step of saying 'I think you should write a test that does X, Y, Z'.
>>
>> Recommendation #10: We need to find a better balance in rewriting commit
>> messages. Maybe we should just make it socially acceptable to rewrite
>> the commit message as part of review.
>>
>> ....
>>
>> I'm sure there are other thoughts, but my brain is running out of steam.
>> These were the things that popped to the top of my head. It's definitely
>> been really interesting to spend this much time with the tracker to
>> build a bigger picture of this feedback channel we have from our users.
>> Hopefully other folks found some of this handy.
>>
>> -Sean
>>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
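
The "General Cleanup Rules" quoted above are mechanical enough to sketch in code. The following is only an illustration under stated assumptions: the `Bug` record, its field names, and `suggest_cleanup` are hypothetical and do not correspond to any Launchpad API, and whether a bug is really a "post it note" stays a human judgment passed in as a flag.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The 30-day threshold comes from the rules quoted in this thread.
STALE_AFTER = timedelta(days=30)


@dataclass
class Bug:
    """Hypothetical bug record; the fields are assumptions for this sketch."""
    status: str                   # e.g. "Incomplete", "In Progress"
    last_activity: datetime       # last response, or when it went In Progress
    has_patch: bool = False       # a patch has been posted for review
    patch_abandoned: bool = False
    post_it_note: bool = False    # triager's judgment, not computed


def suggest_cleanup(bug, now):
    """Return a suggested action string, or None if the bug should be left alone."""
    stale = now - bug.last_activity >= STALE_AFTER

    # Incomplete with no response for 30 days: fair game to close.
    if bug.status == "Incomplete" and stale:
        return "close (Invalid, Opinion, or Won't Fix)"

    # In Progress without a patch for 30 days, or with abandoned patches,
    # is not really In Progress.
    if bug.status == "In Progress":
        stalled = (stale and not bug.has_patch) or bug.patch_abandoned
        if stalled:
            if bug.post_it_note:
                return "remove assignee, move to Opinion"
            return "remove assignee, move back to last state (probably Confirmed)"

    return None
```

For example, `suggest_cleanup(Bug("Incomplete", datetime(2014, 8, 1)), datetime(2014, 9, 19))` suggests closing the bug, while the same bug with activity three days ago is left alone. The function only suggests an action, so a person still makes the final call.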