Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-23 Thread Kartikaya Gupta
On Thu, Mar 9, 2017 at 10:31 AM, Kartikaya Gupta wrote: > On Wed, Mar 8, 2017 at 6:02 PM, L. David Baron wrote: >> As of 5 days ago, "Treeherder Bug Filer" was not using BUG_COMPONENT >> information. I say this based on: >>

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-09 Thread Karl Tomlinson
> It'd make me feel slightly less sad that we're disabling tests > that do their job 90% of the time... The way I interpret a test failing 10% of the time is that either it has already done its job to indicate a problem in the product, or the test is not doing its job. Either way, if it is not

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-09 Thread James Graham
On 09/03/17 19:53, Milan Sreckovic wrote: > Not a reply to this message, just continuing the thread. I'd like to see us run all the intermittently disabled tests once a ... week, say, or at some non-zero frequency, and automatically re-enable the tests that magically get better. I have a feeling

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-09 Thread Milan Sreckovic
Not a reply to this message, just continuing the thread. I'd like to see us run all the intermittently disabled tests once a ... week, say, or at some non-zero frequency, and automatically re-enable the tests that magically get better. I have a feeling that some intermittent failures get

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-09 Thread jmaher
A lot of great discussion here, thanks everyone for taking some time out of your day to weigh in on this subject. There are slight differences between a bug being filed and actively working on the bug once it crosses our threshold of 30 failures/week- I want to discuss when we have looked at

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-09 Thread Kartikaya Gupta
On Wed, Mar 8, 2017 at 6:02 PM, L. David Baron wrote: > As of 5 days ago, "Treeherder Bug Filer" was not using BUG_COMPONENT > information. I say this based on: > https://bugzilla.mozilla.org/show_bug.cgi?id=1344304 > being filed in Core :: Layout despite: > > $ ./mach

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread Karl Tomlinson
I would like to see failure rates expressed as a ratio of failures to test runs, but I recognise that this data may not be readily available and getting it may not be that important if we have a rough idea. These are a means for setting priorities, and so a rank works well. If we have 100 tests,

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread L. David Baron
On Wednesday 2017-03-08 17:15 -0500, Kartikaya Gupta wrote: > On Wed, Mar 8, 2017 at 4:01 PM, wrote: > > In the past I have not always been made aware when my tests were disabled, > > which has led to me feeling jaded. > > We have a process (in theory) that

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread Kartikaya Gupta
On Wed, Mar 8, 2017 at 4:01 PM, wrote: > In the past I have not always been made aware when my tests were disabled, > which has led to me feeling jaded. We have a process (in theory) that ensures the relevant people get notified of tests. The process involves

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread chris . ryan . pearce
On Wednesday, March 8, 2017 at 11:18:03 PM UTC+13, jma...@mozilla.com wrote: > On Tuesday, March 7, 2017 at 11:45:38 PM UTC-5, Chris Pearce wrote: > > I recommend that instead of classifying intermittents as tests which fail > > > 30 times per week, to instead classify tests that fail more than

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread Marco Bonardo
On Wed, Mar 8, 2017 at 2:12 PM, Kartikaya Gupta wrote: > What makes me sad is all the > developers in this thread trying to push back against disabling of > clearly problematic tests, asking for things like tracking bugs and > needinfos, when the reality is that the

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread Kartikaya Gupta
I've been reading this thread with much sadness, but refraining from commenting because I have nothing good to say. But I feel like I should probably comment regardless. What makes me sad is all the developers in this thread trying to push back against disabling of clearly problematic tests,

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-08 Thread jmaher
On Tuesday, March 7, 2017 at 11:45:38 PM UTC-5, Chris Pearce wrote: > I recommend that instead of classifying intermittents as tests which fail > > 30 times per week, to instead classify tests that fail more than some > threshold percent as intermittent. Otherwise on a week with lots of

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Chris Pearce
I recommend that instead of classifying intermittents as tests which fail > 30 times per week, to instead classify tests that fail more than some threshold percent as intermittent. Otherwise on a week with lots of checkins, a test which isn't actually a problem could clear the threshold and
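The difference between the two classification rules proposed above can be sketched roughly as follows. This is only an illustration: the weekly counts are made up, and `RATE_THRESHOLD` is an assumed value, since the message does not name a specific percentage.

```python
# Hypothetical illustration of count-based vs. rate-based classification.
COUNT_THRESHOLD = 30    # failures per week, the figure quoted in the thread
RATE_THRESHOLD = 0.01   # assumed for illustration; the thread names no percentage


def over_count_threshold(failures: int) -> bool:
    """Count-based rule: more than COUNT_THRESHOLD failures in the week."""
    return failures > COUNT_THRESHOLD


def over_rate_threshold(failures: int, runs: int) -> bool:
    """Rate-based rule: failures / runs exceeds RATE_THRESHOLD."""
    if runs <= 0:
        return False  # no runs recorded, so no rate can be computed
    return failures / runs > RATE_THRESHOLD


# Made-up numbers: the same test at the same failure rate (0.5%),
# once in a quiet week and once in a week with many checkins.
weeks = {
    "quiet": {"failures": 20, "runs": 4000},
    "busy": {"failures": 40, "runs": 8000},
}

for label, week in weeks.items():
    by_count = over_count_threshold(week["failures"])
    by_rate = over_rate_threshold(week["failures"], week["runs"])
    print(f"{label}: by_count={by_count} by_rate={by_rate}")
```

With these invented numbers the count-based rule flags the test only in the busy week, while the rate-based rule gives the same answer in both weeks.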

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread jmaher
On Tuesday, March 7, 2017 at 2:57:21 PM UTC-5, Steve Fink wrote: > On 03/07/2017 11:34 AM, Joel Maher wrote: > > Good suggestion here- I have seen so many cases where a simple > > fix/disabled/unknown/needswork just do not describe it. Let me work on a > > few new tags given that we have 248 bugs

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Kris Maglione
On Tue, Mar 07, 2017 at 11:15:54AM -0800, Kris Maglione wrote: > It would be nice if, rather than disabling the test, we could just annotate so that it would still run, and show up in Orange Factor, but wouldn't turn the job orange. Which might be as simple as moving those jobs into a

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Joel Maher
Good suggestion here- I have seen so many cases where a simple fix/disabled/unknown/needswork just do not describe it. Let me work on a few new tags given that we have 248 bugs to date. I am thinking maybe [stockwell turnedoff] - where the job is turned off- we could also ensure one of the last

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Marco Bonardo
On Tue, Mar 7, 2017 at 8:11 PM, wrote: > Thanks for checking up on this- there are 6 specific bugs that have this > signature in the disabled set- in this case they are all linux32-debug > devtools tests- we disabled devtools on linux32-debug because the runtime > was

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread jmaher
On Tuesday, March 7, 2017 at 1:59:14 PM UTC-5, Steve Fink wrote: > Is there a mechanism in place to detect when disabled intermittent tests > have been fixed? > > eg, every so often you could rerun disabled tests individually a bunch > of times. Or if you can distinguish which tests are

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Kris Maglione
On Tue, Mar 07, 2017 at 06:26:28AM -0800, jma...@mozilla.com wrote: > In March, we want to find a way to disable the tests that are causing the most pain or are most likely not to be fixed, without unduly jeopardizing the chance that these bugs will be fixed. We propose: 1) all high frequency

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread jmaher
On Tuesday, March 7, 2017 at 1:53:48 PM UTC-5, Marco Bonardo wrote: > On Tue, Mar 7, 2017 at 6:42 PM, Joel Maher wrote: > > > Thanks for pointing that out. In some cases we have fixed tests that are > > just timing out, in a few cases we disable because the test typically

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Boris Zbarsky
On 3/7/17 1:33 PM, Honza Bambas wrote: > I presume that when a test is disabled a bug is filed As far as I can tell, that's not the case... If that were the case, that would be a good start, yes. -Boris

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Steve Fink
Is there a mechanism in place to detect when disabled intermittent tests have been fixed? eg, every so often you could rerun disabled tests individually a bunch of times. Or if you can distinguish which tests are failing, run them all a bunch of times and pick apart the wreckage to see which
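The idea of rerunning a disabled test a number of times to see whether it has become stable could look roughly like the sketch below. Everything here is hypothetical: `run_test` stands in for whatever would actually execute a single test, and the default attempt count and failure budget are arbitrary, not values from the thread.

```python
from typing import Callable


def passes_consistently(
    run_test: Callable[[], bool],
    attempts: int = 50,
    max_failures: int = 0,
) -> bool:
    """Run a disabled test repeatedly; report whether it looks stable.

    run_test is a hypothetical callable returning True on pass, False on
    failure. The test counts as stable if it fails at most max_failures
    times across the given number of attempts.
    """
    failures = 0
    for _ in range(attempts):
        if not run_test():
            failures += 1
            if failures > max_failures:
                return False  # stop early once the failure budget is spent
    return True


# Usage with stand-in tests (no real harness involved):
always_passes = lambda: True
always_fails = lambda: False
print(passes_consistently(always_passes, attempts=10))
print(passes_consistently(always_fails, attempts=10))
```

A test that passes this kind of check would only be a candidate for re-enabling; a clean batch of runs does not prove the intermittent failure is gone.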

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Marco Bonardo
On Tue, Mar 7, 2017 at 6:42 PM, Joel Maher wrote: > Thanks for pointing that out. In some cases we have fixed tests that are > just timing out, in a few cases we disable because the test typically runs > much faster (i.e. <15 seconds) and is hanging/timing out. In other

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Honza Bambas
I presume that when a test is disabled a bug is filed and triaged within the responsible team as any regular bug. Only that way we don't forget and push on fixing it and returning back to the wheel. Are there also some data or stats how often tests having a strong orange factor catch actual

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Joel Maher
Thanks for pointing that out. In some cases we have fixed tests that are just timing out, in a few cases we disable because the test typically runs much faster (i.e. <15 seconds) and is hanging/timing out. In other cases extending the timeout doesn't help (i.e. a hang/timeout). Please feel free

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Marco Bonardo
On Tue, Mar 7, 2017 at 6:34 PM, Marco Bonardo wrote: > In case of mochitest browser tests failing on "This test exceeded the > timeout threshold", the temporary solution after 1 or 2 weeks should be to > add requestLongerTimeout, rather than disabling them. They should still be > split up into smaller

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Marco Bonardo
On Tue, Mar 7, 2017 at 3:26 PM, wrote: > In recent months we have been triaging high frequency (>=30 times/week) > failures in automated tests. We find that we are fixing 35% of the bugs > and disabling 23% of them. > In case of mochitest browser tests failing on "This test

Re: Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread Boris Zbarsky
On 3/7/17 9:26 AM, jma...@mozilla.com wrote: > We find that we are fixing 35% of the bugs and disabling 23% of them. Is there a credible plan for reenabling the ones we disable? -Boris

Project Stockwell (reducing intermittents) - March 2017 update

2017-03-07 Thread jmaher
In recent months we have been triaging high frequency (>=30 times/week) failures in automated tests. We find that we are fixing 35% of the bugs and disabling 23% of them. The great news is we are fixing many of the issues. The sad news is we are disabling tests, but usually only after giving