[chromium-dev] Re: revising the output from run_webkit_tests

2009-11-03 Thread Dirk Pranke

For anyone who wants to follow along on this, I've filed
http://code.google.com/p/chromium/issues/detail?id=26659 to track it.

-- Dirk




[chromium-dev] Re: revising the output from run_webkit_tests

2009-10-24 Thread Dirk Pranke

Sure. I was floating the idea before doing any work, but I'll just
grab the output of an existing test run and hack it up for comparison ...

-- Dirk

On Fri, Oct 23, 2009 at 3:51 PM, Ojan Vafai  wrote:
> Can you give example outputs for the common cases? It would be easier to
> discuss those.



[chromium-dev] Re: revising the output from run_webkit_tests

2009-10-23 Thread Nicolas Sylvain
On Fri, Oct 23, 2009 at 3:43 PM, Dirk Pranke  wrote:

>
> If you've never run run_webkit_tests to run the layout test
> regression suite, or don't care about it, you can stop reading ...
>
> If you have run it, and you're like me, you've probably wondered a lot
> about the output ... questions like:
>
> 1) What do the numbers printed at the beginning of the test run mean?
> 2) What do all of these "test failed" messages mean, and are they bad?
> 3) What do the numbers printed at the end of the test run mean?
> 4) Why are the numbers at the end different from the numbers at the
> beginning?
> 5) Did my regression run cleanly, or not?
>
> You may have also wondered a few other things:
> 6) What do we expect this test to do?
> 7) Where is the baseline for this test?
> 8) What is the baseline search path for this test?
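Questions 7 and 8 refer to the baseline fallback mechanism: the harness looks
for a test's expected results in an ordered list of platform directories and
uses the first match it finds. A minimal sketch of that lookup, where the
directory names and their ordering are assumptions for illustration, not the
script's actual configuration:

import os

# Hypothetical fallback chain for a Windows build of the Chromium port.
BASELINE_SEARCH_PATH = [
    "LayoutTests/platform/chromium-win",
    "LayoutTests/platform/chromium",
    "LayoutTests/platform/win",
    "LayoutTests",  # generic baselines, checked last
]

def find_baseline(test_path, suffix="-expected.txt"):
    # test_path is relative to LayoutTests, e.g. "fast/css/foo.html".
    base = os.path.splitext(test_path)[0] + suffix
    for directory in BASELINE_SEARCH_PATH:
        candidate = os.path.join(directory, base)
        if os.path.exists(candidate):
            return candidate
    return None  # no baseline anywhere on the search path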
>
> Having just spent a week trying (again) to reconcile the numbers I'm
> getting on the LTTF dashboard with what we print out in the test, I'm
> thinking about drastically revising the output from the script,
> roughly as follows:
>
> * print the information needed to reproduce the test and look at the
> results
> * print the expected results in summary form (roughly the expanded
> version of the first table in the dashboard - # of tests by
> wontfix/fix/defer x pass/fail/flaky)
> * don't print out failure text to the screen during the run
> * print out any *unexpected* results at the end (like we do today)
>
> The goal would be that if all of your tests pass, you get at most a
> small screenful of output from running the tests.
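To illustrate the summary proposed in the second bullet above, here is a
minimal sketch that tallies tests into a (modifier x outcome) grid and prints
it compactly; the modifier and outcome names and the input format are
assumptions, not the real test_expectations.py API:

from collections import Counter

MODIFIERS = ("wontfix", "fix", "defer")
OUTCOMES = ("pass", "fail", "flaky")

def summarize(expectations):
    # expectations: iterable of (test_name, modifier, outcome) tuples.
    counts = Counter((mod, out) for _, mod, out in expectations)
    lines = ["%-10s" % "" + "".join("%8s" % out for out in OUTCOMES)]
    for mod in MODIFIERS:
        row = "%-10s" % mod
        row += "".join("%8d" % counts[(mod, out)] for out in OUTCOMES)
        lines.append(row)
    return "\n".join(lines)

A table like this fits in a few lines of output, which is what keeps a clean
run under a screenful.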
>
> In addition, we would record a full log of (test, expectation, result)
> tuples to the results directory (and this would also be available
> onscreen with --verbose).
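A sketch of writing that per-test log, with a hypothetical file name and a
simple CSV layout (the real format and location would be whatever the results
directory already uses):

import csv
import os

def write_results_log(results_dir, rows):
    # rows: iterable of (test_name, expectation, result) tuples.
    path = os.path.join(results_dir, "full_results.csv")
    with open(path, "w") as log_file:
        writer = csv.writer(log_file)
        writer.writerow(["test", "expectation", "result"])
        writer.writerows(rows)
    return path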
>
> Lastly, I'll add a flag to re-run the tests that just failed, so it's
> easy to check whether the failures were flaky.
>
This would be nice for the buildbots. We would also need to add a new
section in the results for Unexpected Flaky Tests (failed then passed).

Nicolas
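A sketch of the retry-and-classify idea: re-run only the tests that failed,
and count a test as unexpectedly flaky if it failed the first time but passed
on the retry. Here run_single_test() is a hypothetical stand-in for the
harness's real per-test driver, returning True on a pass:

def classify_with_retry(tests, run_single_test):
    unexpected_failures, unexpected_flaky = [], []
    failed = [t for t in tests if not run_single_test(t)]
    for test in failed:
        if run_single_test(test):    # passed on the second try
            unexpected_flaky.append(test)
        else:                        # failed both times
            unexpected_failures.append(test)
    return unexpected_failures, unexpected_flaky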


>
> Then I'll rip out as much of the set logic in test_expectations.py as
> we can possibly get away with, so that no one else has to spend a week
> the way I just did. I'll probably replace it with much of the logic I
> use to generate the dashboard, which is much more flexible for
> extracting different kinds of queries and numbers.
>
> I think the net result will be the same level of information that we
> get today, just in much more meaningful form.
>
> Thoughts? Comments? Is anyone particularly wedded to the existing
> output, or worried about losing a particular piece of info?
>
> -- Dirk




[chromium-dev] Re: revising the output from run_webkit_tests

2009-10-23 Thread Ojan Vafai
Can you give example outputs for the common cases? It would be easier to
discuss those.

Chromium Developers mailing list: chromium-dev@googlegroups.com
View archives, change email options, or unsubscribe:
http://groups.google.com/group/chromium-dev