My understanding of the current proposal is this:

1) This applies to tests that fail deterministically, for reasons other than a 
crash or hang.
2) If the test has a new result that you're confident is a progression (or 
neither better or worse), you simply update the -expected.txt file.
3) If the test has a new result that you're confident is a regression, you 
delete the -expected.txt file and check in the new results as -failing.txt.
4) Ditto points 2 and 3 with respect to -expected.png, for image diffs.
5) We would stop using all other ways of marking tests that fail 
deterministically, including Skipped and the many things you could enter in 
TestExpectations.

Is that correct?

If so, I'd like to suggest a minor modification. In place of point 3 above, I 
propose the following:

3) If the test has a new result that you're confident is a regression, you 
rename the -expected.txt file to -previous.txt (or maybe -correct.txt or 
-pre-expected.txt or something) and check in the new results as -expected.txt 
(unless there is already a -previous.txt, in which case just update -expected 
and leave -previous).

I propose this for the following reasons:

- It maintains the longstanding approach that -expected.txt reflects what is 
currently *expected* to happen, not whether it is right or wrong in some 
abstract sense. It is an expectation, not a reference.
- It still leaves a clear indication of tests that somebody needs to look at 
further, to determine if a regression occurred.
- It leaves both old and new result in place for easy comparison by an expert.

Regards,
Maciej


On Aug 17, 2012, at 6:06 PM, Ojan Vafai <o...@chromium.org> wrote:

> +1
> 
> 
> On Fri, Aug 17, 2012 at 5:36 PM, Ryosuke Niwa <rn...@webkit.org> wrote:
> That's my expectation although we probably can't do that for flaky tests :(
> 
> e.g. sometimes fails with image diff.
> 
> On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo <fpi...@apple.com> wrote:
> +1, contingent upon the following: are we agreeing that all current uses of 
> TEXT, IMAGE, and so forth in TestExpectations should be in the *very near 
> term* following Dirk's change be turned into -failing files?
> 
> -Filip
> 
> 
> On Aug 17, 2012, at 5:01 PM, Ryosuke Niwa <rn...@webkit.org> wrote:
> 
>> On Fri, Aug 17, 2012 at 4:55 PM, Ojan Vafai <o...@chromium.org> wrote:
>> Asserting a test case is 100% correct is nearly impossible for a large 
>> percentage of tests. The main advantage it gives us is the ability to have 
>> -expected mean "unsure".
>> 
>> Lets instead only add -failing (i.e. no -passing). Leaving -expected to mean 
>> roughly what it does today to Chromium folk (roughly, as best we can tell 
>> this test is passing). -failing means it's *probably* an incorrect result 
>> but needs an expert to look at it to either mark it correct (i.e. rename it 
>> to -expected) or figure out how the root cause of the bug.
>> 
>> This actually matches exactly what Chromium gardeners do today, except 
>> instead of putting a line in TestExpectations/Skipped to look at later, they 
>> checkin the -failing file to look at later, which has all the advantages 
>> Dirk listed in the other thread.
>> 
>> I'm much more comfortable with this proposal.
>> 
>> - Ryosuke
>> 
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev@lists.webkit.org
>> http://lists.webkit.org/mailman/listinfo/webkit-dev
> 
> 
> 
> _______________________________________________
> webkit-dev mailing list
> webkit-dev@lists.webkit.org
> http://lists.webkit.org/mailman/listinfo/webkit-dev

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo/webkit-dev

Reply via email to