[chromium-dev] Re: Flakiness. Please help.

Eric Seidel Wed, 05 Aug 2009 22:11:43 -0700

Do we have a list of flaky tests?  I feel like we used to have one...

On Wed, Aug 5, 2009 at 9:44 PM, Peter Kasting <pkast...@google.com> wrote:


> THIS MAIL APPLIES TO YOU
>
> Flakiness is growing.  Smash it before it gets bigger, and keep it smashed.
>
> ***
>
> The MOST IMPORTANT section in this gigantic mail:
>
> PLEASE spend some of every workday (or each week at least, if you can't
> spare time each day) looking at test failures, flakiness,
> valgrind/purify/coverity bugs, crashes, and/or memory bugs.  Make it a goal
> to get an average of one line in the test-expectations file removed each
> day.  If you're a Googler, put it on your OKRs (now, not sometime tomorrow).
>
> * DON'T wait for someone to assign bugs to you or ask for your help
> * DON'T wait for a team fixit week (those haven't worked)
> * DON'T wait for someone else to solve the problems
> * DON'T wait until after your current project is finished
> * DON'T wait until you have worked on WebKit
>
> HELP, even if it's just a little, even if it's not your core competence.
> We currently have hundreds upon hundreds of failing or flaky tests.  We can
> dramatically reduce this quickly but ONLY IF YOU HELP.  This is an
> investment not only in the quality of Chrome but in the team's ability to
> move fast, so help here doesn't just improve the quality of Chrome, but also
> the derivative of the quality :)
>
> (If you do not know how to do anything above and need handholding, e-mail
> me and I will help you.  It's OK to be ignorant.)
>
> ***
>
> Next, how you should help keep the tree green at all times:
>
> * If you ever look at the buildbot and see red, and there's no explanation
> in the build status, ask in #chromium what's going on.  Ping the sheriffs
> specifically (they're listed in the upper-right corner).  If you do not get
> an answer about ownership within a few minutes, close the tree (if you have
> the rights to) or ask someone to close it.  THE TREE SHOULD NOT BE OPEN WITH
> RED THAT NO ONE OWNS.  Help the sheriffs out with this -- they can't watch
> every second.  Closed trees suck; unowned bustage sucks more.  Be
> hard-nosed.
>
> * Yes, even purify, valgrind, and reliability bot redness.  If you can't
> figure out what to do with these, try pinging erikkay for purify issues and
> huanr for reliability issues.  (Not sure who a good general valgrind contact
> is.)
>
> * If you ever look at the buildbot and see orange ("unexpected pass"),
> especially in the WebKit LayoutTest bots, ping the WebKit sheriff (the
> calendar is linked from the top of
> http://dev.chromium.org/developers/how-tos/webkit-merge-1 ; I don't know
> whether it's world-readable).  If he wasn't aware of it, agree between you
> on who will deal with it.  Orange alone is not reason to close the tree, but
> it should NOT be ignored.
>
> * DON'T IGNORE TESTS BECAUSE THEY WENT GREEN ON THE NEXT CYCLE.  If they're
> really fixed by someone's commit, that should be easy to determine.
> Otherwise, they're flaky, and we NEED to mark them as such, not just leave
> them.
>
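A rough sketch of the "went green on the next cycle" check, in Python. Everything here is hypothetical: the function names, the outcome strings, and the idea that you have a per-bot list of recent results in hand (the buildbot does not hand you this directly). A single PASS/FAIL transition may be a real fix or a real break and needs a commit to explain it; more than one transition is treated as flaky.

```python
def classify_runs(results, expected="PASS"):
    """Classify one test's recent outcomes on one bot (oldest first).

    Hypothetical helper; outcome strings such as "PASS"/"FAIL"/"CRASH"
    are an assumption, not the buildbot's actual output format.
    """
    if not results:
        return "unknown"
    distinct = set(results)
    if distinct == {expected}:
        return "expected"
    if len(distinct) == 1:
        # Same unexpected outcome every time: consistently failing.
        return "consistent"
    # Count how often the outcome changed between consecutive runs.
    transitions = sum(1 for a, b in zip(results, results[1:]) if a != b)
    if transitions == 1:
        # One change only: look for the commit that fixed or broke it.
        return "changed-once"
    return "flaky"


def classify_across_bots(history, expected="PASS"):
    """Apply classify_runs to every bot, so each OS is checked separately."""
    return {bot: classify_runs(runs, expected) for bot, runs in history.items()}


if __name__ == "__main__":
    history = {
        "win": ["PASS", "FAIL", "PASS", "PASS"],
        "linux": ["FAIL", "FAIL", "FAIL"],
        "mac": ["PASS", "PASS"],
    }
    for bot, verdict in classify_across_bots(history).items():
        print(bot, verdict)
```

If a "changed-once" result cannot be tied to a specific commit, it probably belongs with the flaky ones.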
> ***
>
> Finally, how to help if the LayoutTest bots are red or orange:
>
> (1) Try to determine if the test(s) are consistently passing/failing
> unexpectedly, or if they're flaky.  Make sure you look at all the different
> bots to see which OSes are affected.
> (2) Update src/webkit/tools/layout_tests/test-expectations.txt.  Look for
> the test(s) in question.  Often, flaky tests will already be in there as
> failing or flaky for one OS, and need to have more added; or they will be
> marked flaky ("FAIL PASS") and need "CRASH" added.  If they're not there,
> add a line.
> (3) Ensure the test(s) have a bug on file.  Note the bug on the
> expectation.
> (4) If any tests are crashing (flaky or not), they're high-priority and
> someone needs to triage them.  Today, dglazkov was WebKit sheriff and was
> having me mark these bugs as P1, Mstone-3, owner:dglazkov.  I'm not sure
> whether the Right Thing is to assign them to the WebKit sheriff or still to
> him (feel free to comment, dglazkov!).  Why are these P1?  Because until we
> prove they can't affect Chrome itself, they potentially can, and Chrome
> crashes are always P1.  They affect stability and security both.
> (5) If you have commit rights, go ahead and TBR test-expectations changes
> you're confident of.  I even suggest using --force if the tree is closed.
> Updating expectations is like fixing bustage: it helps the tree go green
> faster and thus is almost always desirable.  If you don't have commit
> rights, send your review to the WebKit sheriff.
>
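Step (2) above is mostly mechanical, so here is a minimal sketch of it in Python. The line layout it assumes ("MODIFIERS : test/path = OUTCOMES") and the helper names are illustrative assumptions only; check the real test-expectations.txt for the actual syntax before relying on any of this.

```python
def add_outcome(line, outcome):
    """Add an outcome (e.g. "CRASH") to an expectation line if it is missing.

    Assumes a hypothetical "MODIFIERS : path = OUTCOMES" layout.
    """
    head, sep, tail = line.rpartition(" = ")
    if not sep:
        raise ValueError("no ' = ' separator in expectation line: %r" % line)
    outcomes = tail.split()
    if outcome not in outcomes:
        outcomes.append(outcome)
    return head + sep + " ".join(outcomes)


def add_platform(line, platform):
    """Add an OS modifier (e.g. "LINUX") to an expectation line if missing."""
    modifiers, sep, rest = line.partition(" : ")
    if not sep:
        raise ValueError("no ' : ' separator in expectation line: %r" % line)
    mods = modifiers.split()
    if platform not in mods:
        mods.append(platform)
    return " ".join(mods) + sep + rest


if __name__ == "__main__":
    # BUG123 and the test path are placeholders, not real entries.
    line = "BUG123 WIN : fast/example.html = FAIL PASS"
    line = add_outcome(line, "CRASH")
    line = add_platform(line, "LINUX")
    print(line)
```

Both helpers are idempotent, so running them on a line that already has the outcome or platform leaves it unchanged.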
> ***
>
> Your reward for reading this far:
> * At the end of the quarter, I will nominate for a peer bonus every Googler
> who puts something meaningful about flakiness/test failures/the other stuff
> above on their OKRs, accomplishes it, and sends me a note pointing that out.
> * At the end of the quarter, I will nominate for commit access every
> non-Googler who sends me a pointer to ten patches relating to the above
> items that they have posted for review, and who doesn't otherwise have some
> reason why they can't be nominated.
>
> If other people want to sweeten the pot somehow, feel free.
>
> PK
>

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
