Re: [webkit-dev] Skipping Flakey Tests
On Dec 22, 2009, at 10:31 AM, Darin Adler wrote:
> On Dec 21, 2009, at 6:14 PM, Dirk Pranke wrote:
>> Given all that, Darin, what were you suggesting when you said "Let's fix that"?
>
> Let's add a feature so something in the tests tree can indicate a Chromium Windows result should be the base result, rather than the platform/win one. We can debate exactly how it should work, but let's come up with a design and do it.

One possibility is to have three Windows-related result directories: win/, win-cg/, and win-chromium/. We'd use the latter two for test results that are different between the CoreGraphics/CFNetwork/Apple port and the Chromium/Google port. Any results that should be common to all Windows-based platforms would stay in win/.

Note: As far as I can tell, this problem is all about the granularity of applying the expected results, not about whether we check in known-failing results instead of out-of-band FAIL expectations. The latter may or may not be a good idea(*), but it seems independent of the original problem. Some of Dirk and Peter's messages seemed to conflate these two points.

Regards,
Maciej

* - I tend to think we should track expected failure results, for the reasons Darin cited and also because tests that "fail" already can still catch further regressions.

___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
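Purely to illustrate the three-directory idea above (the directory names and the helper function are hypothetical, not existing harness code), a fallback lookup over such a results hierarchy might look like:

```python
import os

# Hypothetical search paths per the win/, win-cg/, win-chromium/ proposal:
# a port consults its own directory first, then the shared win/ results.
SEARCH_PATHS = {
    "win-chromium": ["platform/win-chromium", "platform/win"],
    "win-cg": ["platform/win-cg", "platform/win"],
}

def expected_file(port, test, root="LayoutTests", exists=os.path.exists):
    """Return the first -expected.txt found on the port's search path,
    falling back to the generic result stored beside the test itself."""
    base = test.replace(".html", "-expected.txt")
    for directory in SEARCH_PATHS.get(port, []):
        candidate = "/".join([root, directory, base])
        if exists(candidate):
            return candidate
    return "/".join([root, base])  # shared, platform-independent result
```

With this shape, a result that is common to all Windows ports lives once in win/, and only the genuinely divergent results get an override in a port-specific directory.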
Re: [webkit-dev] Skipping Flakey Tests
On Tue, Dec 22, 2009 at 4:58 PM, Darin Adler wrote:
> On Dec 22, 2009, at 4:27 PM, Dirk Pranke wrote:
>> In the completely generic case, I hope we are not checking in incorrect results.
>
> We do intentionally check in incorrect results, fairly often. For example, we’ve checked in whole test suites and then generated expected results without studying the tests to see which ones are successful and which are failures.

Interesting. I wasn't aware of that, and I guess I hadn't noticed it yet.

>> An alternative would be to move to the more general syntax (and hopefully, just move to the tool) that Chromium uses.
>
> I’m surprised that Chromium developed a separate tool. If instead the Chromium team had enhanced the WebKit project’s shared run-webkit-tests we’d be better off. How did we end up with two separate tools?!

That I couldn't tell you, as the decision predates my joining the team; I'm sure someone else can chime in. I don't think anyone would dispute that one tool would be better than two, and Eric Seidel and I have been working on a plan to merge the two feature sets so that we do end up with only one tool. The major pluses of the Chromium tool are that it has a more expressive syntax for tracking failures across multiple platforms, and that it can run tests in parallel across multiple cores, so it tends to be 3x faster than the Perl version (at least on my 4-CPU MacPro). I do know the WebKit version supports a bunch of switches and features that the Chromium tool doesn't, but they're mostly switches I've never needed to use, I think, so I couldn't tell you off the top of my head what they are.

>> Second, there's the question of whether or not you want to track what the "expected incorrect" results are, separate from what the "expected correct" results are. That way, you can detect when a test fails *differently* than it has in the past.
>
> I do think we want to track this. It’s part of why the original system worked the way it did when I created it back in 2005.

Good to know. Being a fan of this feature myself, I would happily add it.

> Also, in some cases it may be difficult to generate “correct” results if the engine doesn’t yet have correct behavior at the time the test is being created.

True enough.

-- Dirk
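The parallelism Dirk mentions can be sketched in a few lines. This is not the actual Chromium harness; `run_one` is a stand-in for launching a DumpRenderTree/test_shell process and diffing its output:

```python
from concurrent.futures import ThreadPoolExecutor

def run_one(test):
    # Stand-in for spawning a child test process and comparing its
    # output against the expected result; here every test "passes".
    return (test, "PASS")

def run_in_parallel(tests, workers=4):
    """Fan the tests out across worker threads. Each worker would
    drive its own child process, so the Python GIL is not the
    bottleneck; results come back unordered and are collected here."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_one, tests))
```

The speedup on a multi-core machine comes from keeping several DumpRenderTree instances busy at once, which is where the rough 3x figure on a 4-core box comes from.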
Re: [webkit-dev] Skipping Flakey Tests
On Dec 22, 2009, at 4:27 PM, Dirk Pranke wrote:
> In the completely generic case, I hope we are not checking in incorrect results.

We do intentionally check in incorrect results, fairly often. For example, we’ve checked in whole test suites and then generated expected results without studying the tests to see which ones are successful and which are failures.

> An alternative would be to move to the more general syntax (and hopefully, just move to the tool) that Chromium uses.

I’m surprised that Chromium developed a separate tool. If instead the Chromium team had enhanced the WebKit project’s shared run-webkit-tests we’d be better off. How did we end up with two separate tools?!

> Second, there's the question of whether or not you want to track what the "expected incorrect" results are, separate from what the "expected correct" results are. That way, you can detect when a test fails *differently* than it has in the past.

I do think we want to track this. It’s part of why the original system worked the way it did when I created it back in 2005. Also, in some cases it may be difficult to generate “correct” results if the engine doesn’t yet have correct behavior at the time the test is being created.

-- Darin
Re: [webkit-dev] Skipping Flakey Tests
On Tue, Dec 22, 2009 at 10:31 AM, Darin Adler wrote:
> On Dec 21, 2009, at 6:14 PM, Dirk Pranke wrote:
>> Given all that, Darin, what were you suggesting when you said "Let's fix that"?
>
> Let's add a feature so something in the tests tree can indicate a Chromium Windows result should be the base result, rather than the platform/win one. We can debate exactly how it should work, but let's come up with a design and do it.

For a given test, either the test produces generic results (and the results are checked in alongside the test), the test produces "mostly generic" results (meaning most platforms/ports can use the generic results, but some intentionally diverge), or the test produces completely platform-specific results.

In the completely generic case, I hope we are not checking in incorrect results. Are we concerned about this case? I think the "mostly generic" case is probably a variant of the "generic" case, and should have the same policy.

That leaves the "platform-specific" case. In this case, marking any particular platform as "right" doesn't make a lot of sense, because what's right for one platform may or may not be right for another. The problem comes up in ports like Chromium that use a search path for results. I would not suggest that we change anything here - if platform/win/foo-expected.txt is "wrong", we should probably just check in an override in platform/chromium-win/foo-expected.txt. If too many of these situations occur, we're probably just better off dropping platform/win from the search path (which is what I think we actually should do in our Win port, but I leave that as an exercise for me to determine).

So, I don't think we need to change anything to address the above issues. There are one or two other points of design. First, there's the question of whether or not "intentionally incorrect" results should ever be checked in. One reason to do this is that run-webkit-tests doesn't have a "FAIL" concept, just a "SKIPPED" concept. It would be easy to add one, and probably the best way to do so is to add a "Failures" file alongside the "Skipped" file, using the same syntax. An alternative would be to move to the more general syntax (and hopefully, just move to the tool) that Chromium uses.

Second, there's the question of whether or not you want to track what the "expected incorrect" results are, separate from what the "expected correct" results are. That way, you can detect when a test fails *differently* than it has in the past. It is an open question how useful this would be and how much maintenance it would require. If we were to do it, I would suggest adding something like "foo-failure.txt" files alongside the "foo-expected.txt" files.

To sum up:

(1) For platform-specific failures, we should either (a) check in new overriding baselines or (b) fix the baseline search path. No significant code changes are needed.
(2) For generic failures, we can either (a) add a "Failures" file, (b) implement Chromium's test-expectations syntax, (c) move to Chromium's tools (getting b along the way), or (d) check in incorrect output as the "expected results" and add platform-specific baselines for platforms that "get it right".
(3) If you want to capture "expected incorrect" *and* "expected correct", add a "-failure" set of expectations and modify the tools accordingly.

I would vote for (1a) or (1b) (basically the status quo), and (2c). I really don't like (2d), and (2b) seems like a waste of effort compared to (2c). If we're unclear whether (2c) is really valuable, I would volunteer to implement (2a) as a stopgap (although it won't happen until after the holidays). I would not bother to implement (3) at this point, but I won't stop someone else from doing it, either.

-- Dirk
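As a sketch of what option (2a) could look like - the file format and the result labels here are assumptions, simply mirroring the existing one-test-per-line Skipped syntax:

```python
def parse_failures(text):
    """Parse a hypothetical Failures file in the same format as
    Skipped: one test path per line, '#' starts a comment, and
    blank lines are ignored."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            entries.add(line)
    return entries

def classify(test, passed, failures):
    """Interpret one result against the Failures list: listed tests
    are *expected* to fail, so their failures are not regressions,
    and a pass flags a stale entry that can be removed."""
    if passed:
        return "UNEXPECTED PASS" if test in failures else "PASS"
    return "EXPECTED FAIL" if test in failures else "REGRESSION"
```

The payoff is the third branch: a bot stays green on known failures but still goes red the moment a test starts failing that is in neither Skipped nor Failures.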
Re: [webkit-dev] Skipping Flakey Tests
On Dec 21, 2009, at 6:14 PM, Dirk Pranke wrote:
> Given all that, Darin, what were you suggesting when you said "Let's fix that"?

Let's add a feature so something in the tests tree can indicate a Chromium Windows result should be the base result, rather than the platform/win one. We can debate exactly how it should work, but let's come up with a design and do it.

-- Darin
Re: [webkit-dev] Skipping Flakey Tests
On Mon, Dec 21, 2009 at 6:14 PM, Dirk Pranke wrote:
> The Chromium framework doesn't look at the Skipped files, but does look at the Safari Win baselines (we'll use those if we don't find a better match).
>
> Given all that, Darin, what were you suggesting when you said "Let's fix that"?

I assumed it meant "There should be a distinction between all-Win-platforms baselines and Safari/Win-only baselines", and then we could make the Chromium harness look at the former but not the latter, and put this particular case into the latter. I have no idea who's making that change, or whatever other change we decide on, though.

PK
Re: [webkit-dev] Skipping Flakey Tests
Somewhere in between the two of you I got lost. Which "the framework" are you referring to? If you're referring to the run-webkit-tests in WebKitTools/Scripts, you are correct that it has no way to distinguish Safari/Win from Chromium/Win. This doesn't really matter, since this framework isn't used by Chromium.

The Chromium framework doesn't look at the Skipped files, but does look at the Safari Win baselines (we'll use those if we don't find a better match). To Peter's point, this means that we have to check in a correct expected result to override the incorrect one in platform/win, which is annoying. I don't see any obvious way to get around this, except that I think it probably does us little to no good to use the Win expected results at all, and we should probably just skip them in general. This doesn't really solve the problem in general, though (nor does it help if Mac has the same problem).

Note that if/when we happen to upstream the Chromium version of run_webkit_tests, it does support marking files as 'FAIL', although it does not currently allow us to capture 'expected fail' results separately from 'expected pass' results. So, if we were all using Chromium's run_webkit_tests, we could mark the Safari/Win version as expected to fail (separately from SKIP), but I'm not sure if this is what the Safari/Win guys want.

Given all that, Darin, what were you suggesting when you said "Let's fix that"?

-- Dirk

On Mon, Dec 21, 2009 at 1:54 PM, Darin Adler wrote:
> On Dec 21, 2009, at 1:50 PM, Peter Kasting wrote:
>> the framework doesn't seem to have a way of distinguishing Safari/Win from Chromium/Win
>
> Let's fix that.
>
> -- Darin
Re: [webkit-dev] Skipping Flakey Tests
On Dec 21, 2009, at 1:50 PM, Peter Kasting wrote:
> the framework doesn't seem to have a way of distinguishing Safari/Win from Chromium/Win

Let's fix that.

-- Darin
Re: [webkit-dev] Skipping Flakey Tests
On Thu, Oct 1, 2009 at 10:41 AM, Drew Wilson wrote:
> In this case, there was a failure in one of the layout tests on the windows platform, so following the advice below, aroben correctly checked in an update to the test expectations instead of skipping the tests.
>
> Downstream, this busted the Chromium tests, because that failure was not happening in Chromium, and now our correct test output doesn't match the incorrect test output that's been codified in the test expectations. We can certainly manage this downstream by rebaselining the test and managing a custom Chromium test expectation, but that's a pain and is somewhat fragile, as it requires maintenance every time someone adds a new test case to the test.

This came up again this past Friday. http://trac.webkit.org/changeset/52324 added purposefully-failing results for WebKit Windows, which broke Chromium downstream because we don't fail the test.

Darin's original reply here included the line "And we should structure test results and exceptions so that it’s easy to get the expected failure on the right platforms and success on others." It seems like this isn't the case currently, since the framework doesn't seem to have a way of distinguishing Safari/Win from Chromium/Win, meaning the only way we can express the current state of affairs via result snapshots is to check in a bad baseline over the good one for all Windows ports and then have each port that passes check in a good baseline. This is a pretty poor experience :(, and more than "a slight inconvenience" as Darin dismissively termed it.

I liked Dirk's idea of being able to note that a test is failing, rather than skipping it or checking in a bogus baseline. I don't see a lot of value in the bad-baseline strategy beyond keeping the test running, and noting "this test fails on Safari/Win" accomplishes that same objective in a less-misleading and more-other-port-friendly fashion.
PK
Re: [webkit-dev] Skipping Flakey Tests
OK, I agree as well - skipping is not a good solution here. I don't think the status quo is perfect, but it's probably not imperfect enough to do anything about :) I guess there's just a process wrinkle we need to address on the Chromium side. It's easy to rebaseline a test in Chromium, but less easy to figure out when it's safe to un-rebaseline it.

-atw

On Thu, Oct 1, 2009 at 11:57 AM, Eric Seidel wrote:
> I agree with Darin. I don't think that this is a good example of where skipping would be useful.
>
> I think more you're identifying that there is a test hierarchy problem here. Chromium really wants to base its tests off of some base "win" implementation, and then "win-apple", "win-chromium", "win-cairo" results could derive from that, similar to how "mac" and "mac-leopard", "mac-tiger", "mac-snowleopard" work.
>
>> I think we should skip only tests that endanger the testing strategy because they are super-slow, crash, or adversely affect other tests in some way.
>
> Back to the original topic: I do however see flakey tests as "endangering our testing strategy" because they provide false negatives, and greatly reduce the value of the layout tests and things which run the layout tests, like the buildbots or the commit-bot.
>
> I also agree with Darin's earlier comment that WebKit needs something like Chromium's multiple-expected-results support so that we can continue to run flakey tests, even if they're flakey, instead of having to resort to skipping them. But for now, skipping is the best we have, and I still encourage us to use it when necessary instead of leaving layout tests flakey. :)
>
> -eric
Re: [webkit-dev] Skipping Flakey Tests
On Oct 1, 2009, at 11:58 AM, Eric Seidel wrote:
> I think more you're identifying that there is a test hierarchy problem here. Chromium really wants to base its tests off of some base "win" implementation, and then "win-apple", "win-chromium", "win-cairo" results could derive from that, similar to how "mac" and "mac-leopard", "mac-tiger", "mac-snowleopard" work.

Something like that would be excellent if this pattern turns up often. I don’t think we should make the change because of one test, but if it comes up a lot we definitely should.

> Back to the original topic: I do however see flakey tests as "endangering our testing strategy" because they provide false negatives, and greatly reduce the value of the layout tests and things which run the layout tests, like the buildbots or the commit-bot.
>
> I also agree with Darin's earlier comment that WebKit needs something like Chromium's multiple-expected-results support so that we can continue to run flakey tests, even if they're flakey, instead of having to resort to skipping them. But for now, skipping is the best we have, and I still encourage us to use it when necessary instead of leaving layout tests flakey. :)

I agree on all of this. Except that the two specific flakey tests we were discussing that got us started on this discussion were really serious bugs, and it was really good to fix them rather than skipping them. After this experience, I now do share Alexey’s fear that if we had skipped them we would not have fixed the regression. Best, if possible, would have been to notice when they turned from reliable tests into flakey tests and to roll out the change that made them flakey.

-- Darin
Re: [webkit-dev] Skipping Flakey Tests
I agree with Darin. I don't think that this is a good example of where skipping would be useful.

I think more you're identifying that there is a test hierarchy problem here. Chromium really wants to base its tests off of some base "win" implementation, and then "win-apple", "win-chromium", "win-cairo" results could derive from that, similar to how "mac" and "mac-leopard", "mac-tiger", "mac-snowleopard" work.

> I think we should skip only tests that endanger the testing strategy because they are super-slow, crash, or adversely affect other tests in some way.

Back to the original topic: I do however see flakey tests as "endangering our testing strategy" because they provide false negatives, and greatly reduce the value of the layout tests and things which run the layout tests, like the buildbots or the commit-bot.

I also agree with Darin's earlier comment that WebKit needs something like Chromium's multiple-expected-results support so that we can continue to run flakey tests, even if they're flakey, instead of having to resort to skipping them. But for now, skipping is the best we have, and I still encourage us to use it when necessary instead of leaving layout tests flakey. :)

-eric

On Thu, Oct 1, 2009 at 11:47 AM, Darin Adler wrote:
> On Oct 1, 2009, at 11:41 AM, Drew Wilson wrote:
>> I don't have an opinion about flakey tests, but flat-out-busted tests should get skipped. Any thoughts/objections?
>
> I object.
>
> If a test fails on some platforms and succeeds on others, we should have the success result checked in as the default case, and the failure as an exception. And we should structure test results and exceptions so that it’s easy to get the expected failure on the right platforms and success on others. Your story about a slight inconvenience because a test failed on the base Windows WebKit but succeeded on the Chromium WebKit does not seem like a reason to change this!
>
> Skipping the test does not seem like a good thing to do for the long term health of the project. It is good to exercise all the other code each test covers and also to notice when a test result gets even worse or gets better when a seemingly unrelated change is made.
>
> I think we should skip only tests that endanger the testing strategy because they are super-slow, crash, or adversely affect other tests in some way.
>
> -- Darin
Re: [webkit-dev] Skipping Flakey Tests
On Thu, Oct 1, 2009 at 11:47 AM, Darin Adler wrote:
> On Oct 1, 2009, at 11:41 AM, Drew Wilson wrote:
>> I don't have an opinion about flakey tests, but flat-out-busted tests should get skipped. Any thoughts/objections?
>
> I object.
>
> If a test fails on some platforms and succeeds on others, we should have the success result checked in as the default case, and the failure as an exception. And we should structure test results and exceptions so that it’s easy to get the expected failure on the right platforms and success on others. Your story about a slight inconvenience because a test failed on the base Windows WebKit but succeeded on the Chromium WebKit does not seem like a reason to change this!
>
> Skipping the test does not seem like a good thing to do for the long term health of the project. It is good to exercise all the other code each test covers and also to notice when a test result gets even worse or gets better when a seemingly unrelated change is made.
>
> I think we should skip only tests that endanger the testing strategy because they are super-slow, crash, or adversely affect other tests in some way.

I agree that skipping the test is the wrong thing to do. However, checking in an incorrect baseline over the correct baseline is also the wrong thing to do (because, as Drew points out, this can break other platforms that don't have the bug).

Chromium does have the concept of marking tests as expected to FAIL, but it does not have a way to capture what the expected failure is (i.e., there is no way to check in a "FAIL" baseline). We discussed this recently and punted on it because it was unclear how useful this would really be, and -- as we all probably agree -- it's better not to have failing tests in the first place.

Eric and Dimitry have suggested that we look into pulling the Chromium expectations framework upstream into WebKit and adding the features that WebKit's framework has that Chromium's doesn't. It sounds to me like this might be the right long-term solution, and I'd be happy to work on it. In the meantime, maybe it makes sense to add Fail files alongside the Skipped files? That would keep the bots green, but would at least keep the tests running.

-- Dirk
Re: [webkit-dev] Skipping Flakey Tests
On Oct 1, 2009, at 11:41 AM, Drew Wilson wrote:
> I don't have an opinion about flakey tests, but flat-out-busted tests should get skipped. Any thoughts/objections?

I object.

If a test fails on some platforms and succeeds on others, we should have the success result checked in as the default case, and the failure as an exception. And we should structure test results and exceptions so that it’s easy to get the expected failure on the right platforms and success on others. Your story about a slight inconvenience because a test failed on the base Windows WebKit but succeeded on the Chromium WebKit does not seem like a reason to change this!

Skipping the test does not seem like a good thing to do for the long term health of the project. It is good to exercise all the other code each test covers and also to notice when a test result gets even worse or gets better when a seemingly unrelated change is made.

I think we should skip only tests that endanger the testing strategy because they are super-slow, crash, or adversely affect other tests in some way.

-- Darin
Re: [webkit-dev] Skipping Flakey Tests
I wanted to re-open this discussion with some real-world feedback.

In this case, there was a failure in one of the layout tests on the Windows platform, so following the advice below, aroben correctly checked in an update to the test expectations instead of skipping the tests.

Downstream, this busted the Chromium tests, because that failure was not happening in Chromium, and now our correct test output doesn't match the incorrect test output that's been codified in the test expectations. We can certainly manage this downstream by rebaselining the test and maintaining a custom Chromium test expectation, but that's a pain and is somewhat fragile, as it requires maintenance every time someone adds a new test case to the test.

I'd really like to suggest that we skip broken tests rather than codify their breakages in the expectations file. Perhaps we'd make exceptions to this rule for tests that have a bunch of working test cases (in which case there's value in running the other test cases instead of skipping the entire test). But in general it's less work for everyone just to skip broken tests.

I don't have an opinion about flakey tests, but flat-out-busted tests should get skipped. Any thoughts/objections?

-atw

On Fri, Sep 25, 2009 at 1:59 PM, Darin Adler wrote:
> Green buildbots have a lot of value.
>
> I think it’s worthwhile finding a way to have them even when there are test failures.
>
> For predictable failures, the best approach is to land the expected failure as an expected result, and use a bug to track the fact that it’s wrong. To me this does seem a bit like “sweeping something under the rug”; a bug report is much easier to overlook than a red buildbot. We don’t have a great system for keeping track of the most important bugs.
>
> For tests that give intermittent and inconsistent results, the best we can currently do is to skip the test. I think it would make sense to instead allow multiple expected results. I gather that one of the tools used in the Chromium project has this concept, and I think there’s no real reason not to add the concept to run-webkit-tests as long as we are conscientious about not using it when it’s not needed. And use a bug to track the fact that the test gives insufficient results. This has the same downsides as landing the expected failure results.
>
> For tests that have an adverse effect on other tests, the best we can currently do is to skip the test.
>
> I think we are overusing the Skipped machinery at the moment for platform differences. I think in many cases it would be better to instead land an expected failure result. On the other hand, one really great thing about the Skipped file is that there’s a complete list in the file, allowing everyone to see the list. It makes a good to-do list, probably better than just a list of bugs. This made Darin Fisher’s recent “why are so many tests skipped, let's fix it” message possible.
>
> -- Darin
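The "multiple expected results" idea Darin describes amounts to accepting a match against any of several checked-in outputs instead of exactly one golden file. A minimal sketch - the alternative-baseline naming here is a made-up convention, not an existing one:

```python
def check_result(test, actual, alternatives):
    """Compare a test's actual output against every checked-in
    alternative (e.g. hypothetical foo-expected.txt and
    foo-expected-2.txt files). The test only counts as failing
    if it matches none of them."""
    for name, expected in alternatives.items():
        if actual == expected:
            return ("PASS", name)  # report which alternative matched
    return ("FAIL", None)
```

This keeps a flakey-but-bounded test running and green, while a genuinely new output (a real regression) still turns the bot red - which is exactly the caveat about only using it when it's needed.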
Re: [webkit-dev] Skipping Flakey Tests
On Sep 28, 2009, at 5:00 PM, Maciej Stachowiak wrote:
>> p.s. I now have two "skipping flakey tests" changes up for review:
>> https://bugs.webkit.org/show_bug.cgi?id=29322
>
> If Brady and Alexey are ok with disabling this test, then I'm fine with it. I like Alexey's suggestion to have a separate "flaky tests" list that the commit queue script can ignore, without preventing them from being run at all.

I'm investigating this one now; it looks like we have a refcounting issue with some new code in CString/CStringBuffer. I'd prefer to keep this test enabled (and it's a good thing that it's been causing lots of pain - that's what keeps the regression count low).

- WBR, Alexey Proskuryakov
Re: [webkit-dev] Skipping Flakey Tests
On Mon, Sep 28, 2009 at 5:01 PM, Maciej Stachowiak wrote:
> On Sep 28, 2009, at 4:47 PM, David Levin wrote:
>> I don't believe that the test was checked in a flaky state. It was solid for a long time and then something happened...
>
> What's "the test" in this context? The network / credentials one?

Yep.

>> I'll try to add more logging to this test this evening (after my turn at helping chromium stay up to date with WebKit is over).
>> I'll ping Drew about the other test.
>
> It sounds like that test was always buggy, if Drew is right about the cause.
>
> - Maciej
>
>> Dave
>>
>> On Mon, Sep 28, 2009 at 4:40 PM, Eric Seidel wrote:
>>> On Fri, Sep 25, 2009 at 2:42 PM, Maciej Stachowiak wrote:
>>>> I like Dave Levin's idea that the first action should be to instrument the tests so we can find out why they intermittently fail. Especially if the failure is reproducible on the bots but not on developer systems.
>>>
>>> I like this idea too. I don't like the reality that flakey tests are a burden on all developers caused by one.
>>>
>>>> Using the skip list should be a last resort, because that hides the failure instead of helping us diagnose the cause.
>>>
>>> I (respectfully) disagree. I think we shouldn't be so afraid to skip tests. We don't allow people to check in compiles which fail. We don't allow people to check in tests which fail on other platforms (without skipping them) or on every other run. Why should we allow people to check in tests which fail every 10 runs? Or worse, why should we leave a known flakey test checked in/un-attended which fails every 10 runs?
>>>
>>> If we can't easily roll out the failing tests (or the commit which caused them to start failing), we should skip them to keep the bots (a shared resource) green, so as not to block other work on the project. No?
>>>
>>> I very much like WebKit's "everyone is responsible for the whole project" culture, but I disagree that the burden of diagnosis should be on the person trying to make a completely unrelated checkin (as is the case when we leave flakey tests enabled in the tree).
>>>
>>> -eric
>>>
>>> p.s. I now have two "skipping flakey tests" changes up for review:
>>> https://bugs.webkit.org/show_bug.cgi?id=29322
>>> https://bugs.webkit.org/show_bug.cgi?id=29344
Re: [webkit-dev] Skipping Flakey Tests
On Sep 28, 2009, at 4:47 PM, David Levin wrote: I don't believe that the test was checked in a flaky state. It was solid for a long time and then something happened... What's "the test" in this context? The network / credentials one? I'll try to add more logging to this test this evening (after my turn at helping chromium stay up to date with WebKit is over). I'll ping Drew about the other test. It sounds like that test was always buggy, if Drew is right about the cause. - Macie Dave On Mon, Sep 28, 2009 at 4:40 PM, Eric Seidel wrote: On Fri, Sep 25, 2009 at 2:42 PM, Maciej Stachowiak wrote: > I like Dave Levin's idea that the first action should be to instrument the > tests so we can find out why they intermittently fail. Especially if the > failure is reproducible on the bots but not on developer systems. I like this idea too. I don't like the reality that flakey tests are a burden on all developers caused by one. > Using the > skip list should be a last resort, because that hides the failure instead of > helping us diangose the cause. I (respectfully) disagree. I think we shouldn't be so afraid to skip tests. We don't allow people to check in compiles which fail. We don't allow people to check in tests which fail on other platforms (without skipping them) or on every other run. Why should we allow people to check in tests which fail every 10 runs? Or worse, why should we leave a known flakey test checked in/un-attended which fails every 10 runs? If we can't easily roll-out the failing tests (or the commit which cause them to start failing), we should skip them to keep the bots (a shared resource) green, so as not to block other work on the project. No? I very much like WebKit's "everyone is responsible for the whole project" culture, but I disagree that the burden of diagnosis should be on the person trying to make a completely unrelated checkin (as is the case when we leave flakey tests enabled in the tree). -eric p.s. 
I now have two "skipping flakey tests" changes up for review: https://bugs.webkit.org/show_bug.cgi?id=29322 https://bugs.webkit.org/show_bug.cgi?id=29344
Re: [webkit-dev] Skipping Flakey Tests
On Sep 28, 2009, at 4:40 PM, Eric Seidel wrote: On Fri, Sep 25, 2009 at 2:42 PM, Maciej Stachowiak wrote: I like Dave Levin's idea that the first action should be to instrument the tests so we can find out why they intermittently fail. Especially if the failure is reproducible on the bots but not on developer systems. I like this idea too. I don't like the reality that flakey tests are a burden on all developers caused by one. Then in my opinion that's what we should do first, when a test is failing sporadically and the cause is unknown. Using the skip list should be a last resort, because that hides the failure instead of helping us diagnose the cause. I (respectfully) disagree. I think we shouldn't be so afraid to skip tests. We don't allow people to check in compiles which fail. We don't allow people to check in tests which fail on other platforms (without skipping them) or on every other run. Why should we allow people to check in tests which fail every 10 runs? Or worse, why should we leave a known flakey test checked in/unattended which fails every 10 runs? If a brand new test fails every 10 runs, then we should revert the patch that landed it, just as if it had caused a test to always fail. However, I get the impression that many "flaky test" issues appear after the fact - a test that has been running fine for a long time starts failing sporadically. In that kind of case, it seems likely that a subsequent code change and not the original test is at fault. The challenge is that it may be difficult to identify the code change that made the test start failing sporadically. But for example if a test newly started failing 100% of the time, we would not consider it an appropriate fix to disable that test. Or at least I wouldn't. If we can't easily roll out the failing tests (or the commit which caused them to start failing), we should skip them to keep the bots (a shared resource) green, so as not to block other work on the project. No? 
The reason for the "keep the bots green" rule is to prevent regressions. If we maintain the rule by disabling tests, we are sacrificing the actual purpose of the rule for the sake of pro forma adherence. That's why I'd like it to be a last resort, unless the test is new and it's clear it was flaky from the get-go. The first resort should be to get the person who made the test to investigate. I very much like WebKit's "everyone is responsible for the whole project" culture, but I disagree that the burden of diagnosis should be on the person trying to make a completely unrelated checkin (as is the case when we leave flakey tests enabled in the tree). This normally hasn't been a huge problem, other than for the commit queue script. -eric p.s. I now have two "skipping flakey tests" changes up for review: https://bugs.webkit.org/show_bug.cgi?id=29322 If Brady and Alexey are ok with disabling this test, then I'm fine with it. I like Alexey's suggestion to have a separate "flaky tests" list that the commit queue script can ignore, without preventing them from being run at all. https://bugs.webkit.org/show_bug.cgi?id=29344 This one seems like the test was always buggy, so OK to turn off. I do think the test can be fixed however, despite comments to the contrary in the bug. Regards, Maciej
Re: [webkit-dev] Skipping Flakey Tests
I don't believe that the test was checked in a flaky state. It was solid for a long time and then something happened... I'll try to add more logging to this test this evening (after my turn at helping Chromium stay up to date with WebKit is over). I'll ping Drew about the other test. Dave On Mon, Sep 28, 2009 at 4:40 PM, Eric Seidel wrote: > On Fri, Sep 25, 2009 at 2:42 PM, Maciej Stachowiak wrote: > > I like Dave Levin's idea that the first action should be to instrument > the > > tests so we can find out why they intermittently fail. Especially if the > > failure is reproducible on the bots but not on developer systems. > > I like this idea too. I don't like the reality that flakey tests are > a burden on all developers caused by one. > > > Using the > > skip list should be a last resort, because that hides the failure instead > of > > helping us diagnose the cause. > > I (respectfully) disagree. I think we shouldn't be so afraid to skip > tests. We don't allow people to check in compiles which fail. We > don't allow people to check in tests which fail on other platforms > (without skipping them) or on every other run. Why should we allow > people to check in tests which fail every 10 runs? Or worse, why > should we leave a known flakey test checked in/unattended which fails > every 10 runs? > > If we can't easily roll out the failing tests (or the commit which > caused them to start failing), we should skip them to keep the bots (a > shared resource) green, so as not to block other work on the project. > No? > > I very much like WebKit's "everyone is responsible for the whole > project" culture, but I disagree that the burden of diagnosis should > be on the person trying to make a completely unrelated checkin (as is > the case when we leave flakey tests enabled in the tree). > > -eric > > p.s. 
I now have two "skipping flakey tests" changes up for review: > https://bugs.webkit.org/show_bug.cgi?id=29322 > https://bugs.webkit.org/show_bug.cgi?id=29344
Re: [webkit-dev] Skipping Flakey Tests
On Fri, Sep 25, 2009 at 2:42 PM, Maciej Stachowiak wrote: > I like Dave Levin's idea that the first action should be to instrument the > tests so we can find out why they intermittently fail. Especially if the > failure is reproducible on the bots but not on developer systems. I like this idea too. I don't like the reality that flakey tests are a burden on all developers caused by one. > Using the > skip list should be a last resort, because that hides the failure instead of > helping us diagnose the cause. I (respectfully) disagree. I think we shouldn't be so afraid to skip tests. We don't allow people to check in compiles which fail. We don't allow people to check in tests which fail on other platforms (without skipping them) or on every other run. Why should we allow people to check in tests which fail every 10 runs? Or worse, why should we leave a known flakey test checked in/unattended which fails every 10 runs? If we can't easily roll out the failing tests (or the commit which caused them to start failing), we should skip them to keep the bots (a shared resource) green, so as not to block other work on the project. No? I very much like WebKit's "everyone is responsible for the whole project" culture, but I disagree that the burden of diagnosis should be on the person trying to make a completely unrelated checkin (as is the case when we leave flakey tests enabled in the tree). -eric p.s. I now have two "skipping flakey tests" changes up for review: https://bugs.webkit.org/show_bug.cgi?id=29322 https://bugs.webkit.org/show_bug.cgi?id=29344
Re: [webkit-dev] Skipping Flakey Tests
On Sep 25, 2009, at 1:49 PM, Eric Seidel wrote: Hum... Discussion kinda died. I'm interested in soliciting more input, particularly from Apple folks. Unless silence indicates that others agree with David Levin, Yaar and Dimitri that we should skip flakey tests? If that's the case, then I'd love a review on: https://bugs.webkit.org/show_bug.cgi?id=29322 :) I like Dave Levin's idea that the first action should be to instrument the tests so we can find out why they intermittently fail. Especially if the failure is reproducible on the bots but not on developer systems. Using the skip list should be a last resort, because that hides the failure instead of helping us diagnose the cause. Regards, Maciej
Re: [webkit-dev] Skipping Flakey Tests
On Fri, Sep 25, 2009 at 1:59 PM, Darin Adler wrote: > For tests that give intermittent and inconsistent results, the best we can > currently do is to skip the test. I think it would make sense to instead > allow multiple expected results. I gather that one of the tools used in the > Chromium project has this concept and I think there’s no real reason not to > add the concept to run-webkit-tests as long as we are conscientious about > not using it when it’s not needed. Not to derail the discussion, but to provide context for Darin's reply: Yes, Chromium's version of run-webkit-tests (called run_webkit_tests.py) has multiple-expected-results support (along with running tests in parallel and other goodness). But it's missing a bunch of the nifty flags WebKit's run-webkit-tests has. My hope is to eventually unify them, which we've filed some bugs on: http://code.google.com/p/chromium/issues/detail?id=23099 https://bugs.webkit.org/show_bug.cgi?id=10906 -eric On Fri, Sep 25, 2009 at 1:59 PM, Darin Adler wrote: > Green buildbots have a lot of value. > > I think it’s worthwhile finding a way to have them even when there are test > failures. > > For predictable failures, the best approach is to land the expected failure > as an expected result, and use a bug to track the fact that it’s wrong. To > me this does seem a bit like “sweeping something under the rug”, a bug > report is much easier to overlook than a red buildbot. We don’t have a great > system for keeping track of the most important bugs. > > For tests that give intermittent and inconsistent results, the best we can > currently do is to skip the test. I think it would make sense to instead > allow multiple expected results. I gather that one of the tools used in the > Chromium project has this concept and I think there’s no real reason not to > add the concept to run-webkit-tests as long as we are conscientious about > not using it when it’s not needed. 
And use a bug to track the fact that the > test gives insufficient results. This has the same downsides as landing the > expected failure results. > > For tests that have an adverse effect on other tests, the best we can > currently do is to skip the test. > > I think we are overusing the Skipped machinery at the moment for platform > differences. I think in many cases it would be better to instead land an > expected failure result. On the other hand, one really great thing about the > Skipped file is that there’s a complete list in the file, allowing everyone > to see the list. It makes a good to-do list, probably better than just a > list of bugs. This made Darin Fisher’s recent “why are so many tests > skipped, let’s fix it” message possible. > > -- Darin
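Darin's "multiple expected results" idea can be sketched in a few lines: instead of comparing a test's output against a single checked-in baseline, the harness would accept a match against any of a set of baselines. The function and file names below are hypothetical, for illustration only; this is not code from run-webkit-tests or the Chromium tool.

```python
def compare_result(actual, expected_results):
    """Return 'PASS' if the actual output matches any acceptable
    expectation, otherwise 'FAIL'.

    expected_results is a list of acceptable outputs, e.g. the text of
    foo-expected.txt plus any checked-in alternates."""
    for expected in expected_results:
        if actual == expected:
            return "PASS"
    return "FAIL"
```

A harness doing this would presumably still want to log which expectation matched on each run, so that a test alternating between results stays visible as flaky rather than silently green.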
Re: [webkit-dev] Skipping Flakey Tests
Green buildbots have a lot of value. I think it’s worthwhile finding a way to have them even when there are test failures. For predictable failures, the best approach is to land the expected failure as an expected result, and use a bug to track the fact that it’s wrong. To me this does seem a bit like “sweeping something under the rug”, a bug report is much easier to overlook than a red buildbot. We don’t have a great system for keeping track of the most important bugs. For tests that give intermittent and inconsistent results, the best we can currently do is to skip the test. I think it would make sense to instead allow multiple expected results. I gather that one of the tools used in the Chromium project has this concept and I think there’s no real reason not to add the concept to run-webkit-tests as long as we are conscientious about not using it when it’s not needed. And use a bug to track the fact that the test gives insufficient results. This has the same downsides as landing the expected failure results. For tests that have an adverse effect on other tests, the best we can currently do is to skip the test. I think we are overusing the Skipped machinery at the moment for platform differences. I think in many cases it would be better to instead land an expected failure result. On the other hand, one really great thing about the Skipped file is that there’s a complete list in the file, allowing everyone to see the list. It makes a good to-do list, probably better than just a list of bugs. This made Darin Fisher’s recent “why are so many tests skipped, let’s fix it” message possible. -- Darin
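For readers unfamiliar with the Skipped machinery Darin mentions: a platform Skipped file is a plain-text list of layout-test paths, one per line, that the harness will not run. The entries below are hypothetical examples (assuming, as is conventional, that a leading "#" starts a comment), not real skip entries from the tree:

```
# Flaky; tracked by https://bugs.webkit.org/show_bug.cgi?id=29322
http/tests/example/flaky-example-test.html
# Fails because of an OS bug
fast/forms/example-os-bug-test.html
```

This "complete list in one file" property is what Darin points to as its advantage over scattering the same information across bug reports.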
Re: [webkit-dev] Skipping Flakey Tests
Hum... Discussion kinda died. I'm interested in soliciting more input, particularly from Apple folks. Unless silence indicates that others agree with David Levin, Yaar and Dimitri that we should skip flakey tests? If that's the case, then I'd love a review on: https://bugs.webkit.org/show_bug.cgi?id=29322 :) Thanks for your time! -eric On Thu, Sep 24, 2009 at 1:40 PM, Eric Seidel wrote: > I think the question is most interesting in the abstract as a question > of general policy about flakey tests. I think handling the > circumstances of each bug is best left for discussion in the bugs > themselves. That said, I've attached a list of recently filed bugs > about flakey tests, per your request. > > -eric > > https://bugs.webkit.org/show_bug.cgi?id=29322 (mentioned in the original mail) > https://bugs.webkit.org/show_bug.cgi?id=29505 (same root bug, I suspect) > https://bugs.webkit.org/show_bug.cgi?id=29620 (being worked on, > skipping not yet proposed) > https://bugs.webkit.org/show_bug.cgi?id=28845 (OS bug, resulted in 2 > skips so far) > https://bugs.webkit.org/show_bug.cgi?id=28624 (same OS bug, not yet skipped) > https://bugs.webkit.org/show_bug.cgi?id=29035 (same OS bug, not yet skipped) > https://bugs.webkit.org/show_bug.cgi?id=29154 (still being > investigated, skipping not yet proposed) > > On Wed, Sep 23, 2009 at 11:23 PM, Maciej Stachowiak wrote: >> >> On Sep 23, 2009, at 11:09 PM, Eric Seidel wrote: >> >> Alexey and I have been discussing if WebKit should add flakey tests to >> Skipped lists: >> https://bugs.webkit.org/show_bug.cgi?id=29322 >> >> Alexey asked that I bring the discussion to a larger audience. >> >> Pros: >> - Buildbots stay green. >> - Red bots/tests means your change caused an error. >> >> Cons: >> - Skipped tests may be more likely to be forgotten (and thus never fixed). >> >> What does WebKit think? Should we skip flakey tests which can't be >> resolved in a timely manner (and instead track issues via bugs) as >> WebKit policy? 
Or should we reserve the skipped list for other >> conditions (like bugs in the OS)? >> >> Thoughts? >> >> I'm a little concerned about sweeping failures under the carpet, but on the >> other hand the buildbots are much less valuable if they are red too much. >> It's hard to think about this in the abstract. Do you have any concrete >> examples? >> - Maciej >>
Re: [webkit-dev] Skipping Flakey Tests
I think the question is most interesting in the abstract as a question of general policy about flakey tests. I think handling the circumstances of each bug is best left for discussion in the bugs themselves. That said, I've attached a list of recently filed bugs about flakey tests, per your request. -eric https://bugs.webkit.org/show_bug.cgi?id=29322 (mentioned in the original mail) https://bugs.webkit.org/show_bug.cgi?id=29505 (same root bug, I suspect) https://bugs.webkit.org/show_bug.cgi?id=29620 (being worked on, skipping not yet proposed) https://bugs.webkit.org/show_bug.cgi?id=28845 (OS bug, resulted in 2 skips so far) https://bugs.webkit.org/show_bug.cgi?id=28624 (same OS bug, not yet skipped) https://bugs.webkit.org/show_bug.cgi?id=29035 (same OS bug, not yet skipped) https://bugs.webkit.org/show_bug.cgi?id=29154 (still being investigated, skipping not yet proposed) On Wed, Sep 23, 2009 at 11:23 PM, Maciej Stachowiak wrote: > > On Sep 23, 2009, at 11:09 PM, Eric Seidel wrote: > > Alexey and I have been discussing if WebKit should add flakey tests to > Skipped lists: > https://bugs.webkit.org/show_bug.cgi?id=29322 > > Alexey asked that I bring the discussion to a larger audience. > > Pros: > - Buildbots stay green. > - Red bots/tests means your change caused an error. > > Cons: > - Skipped tests may be more likely to be forgotten (and thus never fixed). > > What does WebKit think? Should we skip flakey tests which can't be > resolved in a timely manner (and instead track issues via bugs) as > WebKit policy? Or should we reserve the skipped list for other > conditions (like bugs in the OS)? > > Thoughts? > > I'm a little concerned about sweeping failures under the carpet, but on the > other hand the buildbots are much less valuable if they are red too much. > It's hard to think about this in the abstract. Do you have any concrete > examples? 
> - Maciej >
Re: [webkit-dev] Skipping Flakey Tests
I wonder if there's interest in the WebKit community to improve the granularity (and detail) with which you can document test expectations, like we have done on Chromium: http://dev.chromium.org/developers/testing/webkit-layout-tests#TOC-Test-Expectations This way, you could monitor known failures with the awesomeness that is Layout Test Flakiness Dashboard: http://src.chromium.org/viewvc/chrome/trunk/src/webkit/tools/layout_tests/flakiness_dashboard.html Since we already have done this work on one port, we could definitely extend this to all other ports. :DG< On Wed, Sep 23, 2009 at 11:41 PM, David Levin wrote: > If a test is flaky, it doesn't seem good to keep in the mix because it will > turn red for no reason and people will think a check-in was bad. > OTOH, if a test is flaky but has enough logging in it to indicate an > underlying cause, then it may be worth keeping in the mix so that the root > issue may be determined. > IMO, a good solution in this case is to improve the test to print out more > information to help track down the underlying bug. > dave > > On Wed, Sep 23, 2009 at 11:23 PM, Maciej Stachowiak wrote: >> >> On Sep 23, 2009, at 11:09 PM, Eric Seidel wrote: >> >> Alexey and I have been discussing if WebKit should add flakey tests to >> Skipped lists: >> https://bugs.webkit.org/show_bug.cgi?id=29322 >> >> Alexey asked that I bring the discussion to a larger audience. >> >> Pros: >> - Buildbots stay green. >> - Red bots/tests means your change caused an error. >> >> Cons: >> - Skipped tests may be more likely to be forgotten (and thus never fixed). >> >> What does WebKit think? Should we skip flakey tests which can't be >> resolved in a timely manner (and instead track issues via bugs) as >> WebKit policy? Or should we reserve the skipped list for other >> conditions (like bugs in the OS)? >> >> Thoughts? 
>> >> I'm a little concerned about sweeping failures under the carpet, but on >> the other hand the buildbots are much less valuable if they are red too >> much. It's hard to think about this in the abstract. Do you have any >> concrete examples? >> - Maciej
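For context on Dmitri's suggestion: the Chromium test-expectations file he links to records, per line, a bug reference, optional platform/build modifiers, a test path, and one or more allowed outcomes, so a flaky test can be declared as legitimately producing several results instead of being skipped outright. The entries below are illustrative sketches of that shape, not real expectations from the Chromium tree:

```
BUG29322 WIN DEBUG : LayoutTests/http/tests/example-test.html = FAIL
BUG29344 MAC : LayoutTests/fast/example-flaky-test.html = PASS FAIL
```

A line allowing both PASS and FAIL keeps the bots green while the flakiness dashboard can still chart how often each outcome occurs.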
Re: [webkit-dev] Skipping Flakey Tests
If a test is flaky, it doesn't seem good to keep in the mix because it will turn red for no reason and people will think a check-in was bad. OTOH, if a test is flaky but has enough logging in it to indicate an underlying cause, then it may be worth keeping in the mix so that the root issue may be determined. IMO, a good solution in this case is to improve the test to print out more information to help track down the underlying bug. dave On Wed, Sep 23, 2009 at 11:23 PM, Maciej Stachowiak wrote: > > On Sep 23, 2009, at 11:09 PM, Eric Seidel wrote: > > Alexey and I have been discussing if WebKit should add flakey tests to > Skipped lists: > https://bugs.webkit.org/show_bug.cgi?id=29322 > > Alexey asked that I bring the discussion to a larger audience. > > Pros: > - Buildbots stay green. > - Red bots/tests means your change caused an error. > > Cons: > - Skipped tests may be more likely to be forgotten (and thus never fixed). > > What does WebKit think? Should we skip flakey tests which can't be > resolved in a timely manner (and instead track issues via bugs) as > WebKit policy? Or should we reserve the skipped list for other > conditions (like bugs in the OS)? > > Thoughts? > > > I'm a little concerned about sweeping failures under the carpet, but on the > other hand the buildbots are much less valuable if they are red too much. > It's hard to think about this in the abstract. Do you have any concrete > examples? > > - Maciej
Re: [webkit-dev] Skipping Flakey Tests
On Sep 23, 2009, at 11:09 PM, Eric Seidel wrote: Alexey and I have been discussing if WebKit should add flakey tests to Skipped lists: https://bugs.webkit.org/show_bug.cgi?id=29322 Alexey asked that I bring the discussion to a larger audience. Pros: - Buildbots stay green. - Red bots/tests means your change caused an error. Cons: - Skipped tests may be more likely to be forgotten (and thus never fixed). What does WebKit think? Should we skip flakey tests which can't be resolved in a timely manner (and instead track issues via bugs) as WebKit policy? Or should we reserve the skipped list for other conditions (like bugs in the OS)? Thoughts? I'm a little concerned about sweeping failures under the carpet, but on the other hand the buildbots are much less valuable if they are red too much. It's hard to think about this in the abstract. Do you have any concrete examples? - Maciej
[webkit-dev] Skipping Flakey Tests
Alexey and I have been discussing if WebKit should add flakey tests to Skipped lists: https://bugs.webkit.org/show_bug.cgi?id=29322 Alexey asked that I bring the discussion to a larger audience. Pros: - Buildbots stay green. - Red bots/tests means your change caused an error. Cons: - Skipped tests may be more likely to be forgotten (and thus never fixed). What does WebKit think? Should we skip flakey tests which can't be resolved in a timely manner (and instead track issues via bugs) as WebKit policy? Or should we reserve the skipped list for other conditions (like bugs in the OS)? Thoughts? -eric