On Mon, Apr 9, 2012 at 3:02 PM, James Robinson <jam...@google.com> wrote:
>> 3) Don't use test_expectations.txt to suppress failures across a
>> single cycle of the bot, just so you can gather updated baselines
>> without the tree going red. While it might seem that you're doing tree
>> maintainers a favor, in my experience this just makes things confusing
>> and it's better to let the tree go red until you can actually
>> rebaseline things. It's too easy to add a suppression "for now" and
>> then forget about it.
>
> How would you suggest someone contribute a patch that changes layout test
> results, if not by marking the test as failing?
Both this and Dana's response are good questions that I don't have great answers for, so let me explain my thinking in a little more detail.

In my ideal world, you would be able to get updated baselines *prior* to trying to land a patch. This is not really possible today for any test that fails on multiple ports with different results, since it's practically impossible to run more than a couple of ports by hand, and we don't have bot infrastructure to help you here. In my closer-to-the-real-world ideal, the test_expectations.txt file would be *empty*, all of the ports would be using it, and adding suppressions for the tests you expect to fail in your patch would be a good signal.

In practice, however, marking a test as failing has a few problems:

1) If you don't get the list of suppressions right, then what shows up on the bots can be confusing - you don't realize that more (or different) tests are failing than the ones you accounted for. You can also see failures on bots that don't use the expectations file at all (e.g., you suppressed Chromium correctly, but the Apple bots go red).

2) Adding suppressions increases contention on the test_expectations.txt file, confusing the gardeners / bot-watchers and making merges harder. It is practically difficult to tell the difference between "I just added these lines as a temporary suppression" and "I added these lines as a suppression two weeks ago and then forgot about them", and as Ojan will tell you, you end up with a bunch of lines in the file whose only comment says "just needs rebaselining", with no way to know whether that is still true.

3) The current tool-of-the-day for managing rebaselining, garden-o-matic, is much better suited to handling "unexpected" failures on the bots than the "expected" failures you've introduced.
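(For anyone following along who hasn't edited the file: a suppression in test_expectations.txt is a one-line entry naming a bug, optional port/configuration modifiers, a test path, and the expected result. The bug numbers and test paths below are made up for illustration; check the file's own header comment for the exact syntax your checkout uses.)

```
// Hypothetical examples only - bug IDs and paths are invented.
BUGWK12345 MAC : fast/dom/some-test.html = TEXT
BUGCR67890 WIN LINUX : svg/custom/another-test.svg = IMAGE+TEXT
```

The ambiguity I describe in point 2 is exactly that nothing in such a line says whether it is a deliberate temporary suppression or a stale one someone forgot to remove.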
That said, as Dana points out, my proposal makes it hard if not impossible for the EWS to work at all; I'm biased because I hardly ever post changes that are expected to introduce failures on the EWS bots but should still land. I don't know what to do about this, although perhaps moving to my "near-ideal" model might work. If there's consensus in the meantime that it is better on balance to check in suppressions, perhaps we can figure out a better way to do that. Maybe (shudder) a second test_expectations file? Or maybe it would be better to check in suppressions marked as REBASELINE (or something like that)?

-- Dirk

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev