Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.
Acked-by: Dylan Baker <baker.dyla...@gmail.com>

On Tue, Mar 03, 2015 at 02:47:59PM +, Jose Fonseca wrote:
> I recently tried the junit backend's ability to ignore expected
> failures/crashes and found it a godsend -- instead of having to look at
> test graph results periodically, I can just tell jenkins to email me
> when things go south.
>
> The only drawback is that by reporting the expected issues as passing it
> makes it too easy to forget about them and misinterpret the pass-rates.
> So this change modifies the junit backend to report the expected issues
> as skipped, making it more obvious when looking at the test graphs that
> these tests are not really passing, and that whatever functionality they
> target is not being fully covered.
>
> This change also makes use of the junit `message` attribute to explain
> the reason for the skip. (In fact, we could consider using the `message`
> attribute on other kinds of failures to inform the piglit result,
> instead of using the non-standard `type`.)
> ---
>  framework/backends/junit.py | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/framework/backends/junit.py b/framework/backends/junit.py
> index 82f9c29..53b6086 100644
> --- a/framework/backends/junit.py
> +++ b/framework/backends/junit.py
> @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
>          # Add relevant result value, if the result is pass then it doesn't
>          # need one of these statuses
>          if data['result'] == 'skip':
> -            etree.SubElement(element, 'skipped')
> +            res = etree.SubElement(element, 'skipped')
>
>          elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
>              if expected_result == "failure":
>                  err.text += "\n\nWARN: passing test as an expected failure"
> +                res = etree.SubElement(element, 'skipped', message='expected failure')
>              else:
>                  res = etree.SubElement(element, 'failure')
>
>          elif data['result'] == 'crash':
>              if expected_result == "error":
>                  err.text += "\n\nWARN: passing test as an expected crash"
> +                res = etree.SubElement(element, 'skipped', message='expected crash')
>              else:
>                  res = etree.SubElement(element, 'error')
>
> --
> 2.1.0

___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit
Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.
Thanks Jose! This is an improvement.

In my experience, broken tests are introduced and fixed in mesa on a
daily basis. This has a few consequences:

 - On a daily basis, I look at failures and update the expected
   pass/fails depending on whether it is a new test or a regression.
   Much of this process is automated.

 - Branches quickly diverge on the basis of passing/failing tests.
   Having separate pass/fail configs on release branches is
   unmanageable. To account for this, my automation records the relevant
   commit sha as the value in the config file (the key is the test
   name). I post-process the junit xml to filter out test failures with
   commits that occurred after the branch point.

 - For platforms that are too slow to build each checkin, I run an
   automated bisect which builds/tests in jenkins, then updates config
   files.

 - Our platform matrix generates over 350k unskipped tests for each
   build. We filter out skipped tests due to the memory consumption on
   jenkins when displaying this many tests.

I am interested in learning more about your test system, and sharing
lessons learned / techniques.

-Mark

Reviewed-by: Mark Janes <mark.a.ja...@intel.com>

Jose Fonseca <jfons...@vmware.com> writes:
> I recently tried the junit backend's ability to ignore expected
> failures/crashes and found it a godsend -- instead of having to look at
> test graph results periodically, I can just tell jenkins to email me
> when things go south.
>
> The only drawback is that by reporting the expected issues as passing it
> makes it too easy to forget about them and misinterpret the pass-rates.
> So this change modifies the junit backend to report the expected issues
> as skipped, making it more obvious when looking at the test graphs that
> these tests are not really passing, and that whatever functionality they
> target is not being fully covered.
>
> This change also makes use of the junit `message` attribute to explain
> the reason for the skip. (In fact, we could consider using the `message`
> attribute on other kinds of failures to inform the piglit result,
> instead of using the non-standard `type`.)
> ---
>  framework/backends/junit.py | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/framework/backends/junit.py b/framework/backends/junit.py
> index 82f9c29..53b6086 100644
> --- a/framework/backends/junit.py
> +++ b/framework/backends/junit.py
> @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
>          # Add relevant result value, if the result is pass then it doesn't
>          # need one of these statuses
>          if data['result'] == 'skip':
> -            etree.SubElement(element, 'skipped')
> +            res = etree.SubElement(element, 'skipped')
>
>          elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
>              if expected_result == "failure":
>                  err.text += "\n\nWARN: passing test as an expected failure"
> +                res = etree.SubElement(element, 'skipped', message='expected failure')
>              else:
>                  res = etree.SubElement(element, 'failure')
>
>          elif data['result'] == 'crash':
>              if expected_result == "error":
>                  err.text += "\n\nWARN: passing test as an expected crash"
> +                res = etree.SubElement(element, 'skipped', message='expected crash')
>              else:
>                  res = etree.SubElement(element, 'error')
>
> --
> 2.1.0
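The commit-based filtering Mark describes (test name as key, commit sha as value, failures dropped when the recorded commit is not on the branch) could look roughly like the sketch below. This is only one reading of that description, not his actual automation: the function name `filter_failures`, the test names, the shas, and the idea of passing the branch history in as a plain set are all hypothetical.

```python
import xml.etree.ElementTree as etree


def filter_failures(suite, recorded_sha, branch_commits):
    """Drop failing testcases whose recorded commit is not on the branch.

    suite          -- a <testsuite> element with <testcase> children
    recorded_sha   -- dict mapping test name to commit sha (hypothetical config)
    branch_commits -- set of shas reachable from the branch (assumed given)
    """
    for case in list(suite.findall('testcase')):
        failed = (case.find('failure') is not None
                  or case.find('error') is not None)
        sha = recorded_sha.get(case.get('name'))
        # Only failures tied to a commit made after the branch point go away.
        if failed and sha is not None and sha not in branch_commits:
            suite.remove(case)
    return suite


# Made-up data for illustration only.
suite = etree.fromstring(
    '<testsuite>'
    '<testcase name="old.failure"><failure/></testcase>'
    '<testcase name="new.failure"><failure/></testcase>'
    '<testcase name="new.pass"/>'
    '</testsuite>'
)
recorded_sha = {'old.failure': 'aaa111', 'new.failure': 'ccc333',
                'new.pass': 'ccc333'}
branch_commits = {'aaa111', 'bbb222'}

filter_failures(suite, recorded_sha, branch_commits)
print([case.get('name') for case in suite.findall('testcase')])
```

In a real setup the set of branch commits would have to come from the repository history; how that is obtained is left out here on purpose.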
Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.
Jose Fonseca <jfons...@vmware.com> writes:
> Thanks for the reviews.
>
> My setup is not as complicated as yours. Mostly because my main focus
> has been llvmpipe, and we're not actively adding support for new OpenGL
> extensions to it at this moment, so most of the new piglit tests either
> skip or pass. New failures are relatively rare.

One of the lessons I learned with Jenkins is that if the automation is
complex, it should be in git and not in Jenkins projects. Our Jenkins
jobs typically just execute a single script with a constrained set of
parameters. This facilitates testing/debugging without invoking builds
on the Jenkins instance. It is also much easier to handle development
branches by branching your scripts, as opposed to cloning Jenkins
projects.

We orchestrate multi-platform jobs with Python via the Jenkins Remote
Access API. Binaries and test results are communicated between builds
via a shared drive.

> Before, I used
> https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it
> allows controlling if/when emails are sent with great flexibility. In
> particular it allows sending emails for new regressions, but not for
> tests that were failing previously.
>
> Still, nothing beats being able to look at a bunch of test jobs and
> immediately tell all blue == all good. Although I have used jenkins for
> many years now, this is a lesson I only learned recently -- it's better
> to mask out expected failures somehow and get a boolean all pass or fail
> for the whole test suite, than trying to track pass/fail for individual
> tests. The latter just doesn't scale...
>
> We do have internal branches where we run piglit. I confess I don't
> have a good solution yet. Your trick of maintaining a database of which
> git commit tests were added in is quite neat.
>
> Another thing worth considering would be to branch or tag piglit
> whenever Mesa is branched, and keep using a matching (and unchanging)
> piglit commit.

Tagging piglit is a simpler solution than what I've done. The primary
use case, though, is for developers to test their branches before they
send them to the mailing list. Without a recent rebase, their branches
will typically report failures on tests which were fixed on master.

> We also run testsuites through different APIs (namely D3D9/10). These
> testsuites rarely get updated, and llvmpipe conformance is actually
> quite good to start with, so it's easy to get all pass there. Piglit,
> by being continuously updated/extended, is indeed more of a challenge
> than other testsuites.
>
> We also use piglit for testing our OpenGL guest driver, but we use an
> internal testing infrastructure to drive it, not Jenkins. So our
> experiences there don't apply.
>
> I also have a few benchmarks on jenkins. Again, I only keep track of
> performance metrics via Jenkins Plots and Measurements plugins, but I
> don't produce pass/fail based on those metrics. I am however
> considering doing something of the sort -- e.g., getting the history of
> the metrics via the jenkins JSON API, fitting it to a probability
> distribution, and failing when performance goes below a given
> percentile.

I'm curious to know which benchmarks you use. We experience variability
in a lot of benchmarks, especially if they are cpu-heavy and running on
an under-powered system. It would be great to have workloads that
produce reliable trends without having to run them repeatedly.

Thanks for the details!

-Mark

> Jose
>
> On 03/03/15 18:34, Mark Janes wrote:
>> Thanks Jose! This is an improvement.
>>
>> In my experience, broken tests are introduced and fixed in mesa on a
>> daily basis. This has a few consequences:
>>
>>  - On a daily basis, I look at failures and update the expected
>>    pass/fails depending on whether it is a new test or a regression.
>>    Much of this process is automated.
>>
>>  - Branches quickly diverge on the basis of passing/failing tests.
>>    Having separate pass/fail configs on release branches is
>>    unmanageable. To account for this, my automation records the
>>    relevant commit sha as the value in the config file (the key is the
>>    test name). I post-process the junit xml to filter out test
>>    failures with commits that occurred after the branch point.
>>
>>  - For platforms that are too slow to build each checkin, I run an
>>    automated bisect which builds/tests in jenkins, then updates config
>>    files.
>>
>>  - Our platform matrix generates over 350k unskipped tests for each
>>    build. We filter out skipped tests due to the memory consumption on
>>    jenkins when displaying this many tests.
>>
>> I am interested in learning more about your test system, and sharing
>> lessons learned / techniques.
>>
>> -Mark
>>
>> Reviewed-by: Mark Janes <mark.a.ja...@intel.com>
>>
>> Jose Fonseca <jfons...@vmware.com> writes:
>>> I recently tried the junit backend's ability to ignore expected
>>> failures/crashes and found it a godsend -- instead of having to look
>>> at test graph results periodically, I can just tell jenkins to email
>>> me when things go south.
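The percentile idea quoted in this message (take the metric history, fit a distribution, fail when a run falls below a given percentile) can be sketched as follows. It rests on several assumptions that the thread does not state: the history is already available as a list of numbers (fetching it from Jenkins is not shown), higher values are better, and a normal distribution is an acceptable fit. The function name `regressed` and the sample figures are made up.

```python
from statistics import NormalDist


def regressed(history, latest, percentile=0.05):
    """Return True when `latest` falls below the given percentile of history.

    Assumes higher is better (e.g. frames per second) and that past results
    are roughly normally distributed; both are assumptions of this sketch.
    """
    dist = NormalDist.from_samples(history)   # needs at least two samples
    threshold = dist.inv_cdf(percentile)      # value at that percentile
    return latest < threshold


# Made-up benchmark history for illustration only.
history = [100.0, 102.0, 98.0, 101.0, 99.0]
print(regressed(history, latest=90.0))
print(regressed(history, latest=99.0))
```

Noisy benchmarks, such as the cpu-heavy workloads on under-powered systems mentioned above, widen the fitted distribution and lower the threshold, so a check like this becomes less sensitive rather than failing spuriously.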
Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.
Thanks for the reviews.

My setup is not as complicated as yours. Mostly because my main focus
has been llvmpipe, and we're not actively adding support for new OpenGL
extensions to it at this moment, so most of the new piglit tests either
skip or pass. New failures are relatively rare.

Before, I used
https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it
allows controlling if/when emails are sent with great flexibility. In
particular it allows sending emails for new regressions, but not for
tests that were failing previously.

Still, nothing beats being able to look at a bunch of test jobs and
immediately tell all blue == all good. Although I have used jenkins for
many years now, this is a lesson I only learned recently -- it's better
to mask out expected failures somehow and get a boolean all pass or fail
for the whole test suite, than trying to track pass/fail for individual
tests. The latter just doesn't scale...

We do have internal branches where we run piglit. I confess I don't
have a good solution yet. Your trick of maintaining a database of which
git commit tests were added in is quite neat.

Another thing worth considering would be to branch or tag piglit
whenever Mesa is branched, and keep using a matching (and unchanging)
piglit commit.

We also run testsuites through different APIs (namely D3D9/10). These
testsuites rarely get updated, and llvmpipe conformance is actually
quite good to start with, so it's easy to get all pass there. Piglit,
by being continuously updated/extended, is indeed more of a challenge
than other testsuites.

We also use piglit for testing our OpenGL guest driver, but we use an
internal testing infrastructure to drive it, not Jenkins. So our
experiences there don't apply.

I also have a few benchmarks on jenkins. Again, I only keep track of
performance metrics via Jenkins Plots and Measurements plugins, but I
don't produce pass/fail based on those metrics. I am however considering
doing something of the sort -- e.g., getting the history of the metrics
via the jenkins JSON API, fitting it to a probability distribution, and
failing when performance goes below a given percentile.

Jose

On 03/03/15 18:34, Mark Janes wrote:
> Thanks Jose! This is an improvement.
>
> In my experience, broken tests are introduced and fixed in mesa on a
> daily basis. This has a few consequences:
>
>  - On a daily basis, I look at failures and update the expected
>    pass/fails depending on whether it is a new test or a regression.
>    Much of this process is automated.
>
>  - Branches quickly diverge on the basis of passing/failing tests.
>    Having separate pass/fail configs on release branches is
>    unmanageable. To account for this, my automation records the
>    relevant commit sha as the value in the config file (the key is the
>    test name). I post-process the junit xml to filter out test failures
>    with commits that occurred after the branch point.
>
>  - For platforms that are too slow to build each checkin, I run an
>    automated bisect which builds/tests in jenkins, then updates config
>    files.
>
>  - Our platform matrix generates over 350k unskipped tests for each
>    build. We filter out skipped tests due to the memory consumption on
>    jenkins when displaying this many tests.
>
> I am interested in learning more about your test system, and sharing
> lessons learned / techniques.
>
> -Mark
>
> Reviewed-by: Mark Janes <mark.a.ja...@intel.com>
>
> Jose Fonseca <jfons...@vmware.com> writes:
>> I recently tried the junit backend's ability to ignore expected
>> failures/crashes and found it a godsend -- instead of having to look
>> at test graph results periodically, I can just tell jenkins to email
>> me when things go south.
>>
>> The only drawback is that by reporting the expected issues as passing
>> it makes it too easy to forget about them and misinterpret the
>> pass-rates. So this change modifies the junit backend to report the
>> expected issues as skipped, making it more obvious when looking at the
>> test graphs that these tests are not really passing, and that whatever
>> functionality they target is not being fully covered.
>>
>> This change also makes use of the junit `message` attribute to explain
>> the reason for the skip. (In fact, we could consider using the
>> `message` attribute on other kinds of failures to inform the piglit
>> result, instead of using the non-standard `type`.)
>> ---
>>  framework/backends/junit.py | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/framework/backends/junit.py b/framework/backends/junit.py
>> index 82f9c29..53b6086 100644
>> --- a/framework/backends/junit.py
>> +++ b/framework/backends/junit.py
>> @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
>>          # Add relevant result value, if the result is pass then it doesn't
>>          # need one of these statuses
>>          if data['result'] == 'skip':
>> -            etree.SubElement(element, 'skipped')
>> +            res = etree.SubElement(element, 'skipped')
>>
>>          elif
[Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.
I recently tried the junit backend's ability to ignore expected
failures/crashes and found it a godsend -- instead of having to look at
test graph results periodically, I can just tell jenkins to email me
when things go south.

The only drawback is that by reporting the expected issues as passing it
makes it too easy to forget about them and misinterpret the pass-rates.
So this change modifies the junit backend to report the expected issues
as skipped, making it more obvious when looking at the test graphs that
these tests are not really passing, and that whatever functionality they
target is not being fully covered.

This change also makes use of the junit `message` attribute to explain
the reason for the skip. (In fact, we could consider using the `message`
attribute on other kinds of failures to inform the piglit result,
instead of using the non-standard `type`.)
---
 framework/backends/junit.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/framework/backends/junit.py b/framework/backends/junit.py
index 82f9c29..53b6086 100644
--- a/framework/backends/junit.py
+++ b/framework/backends/junit.py
@@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
         # Add relevant result value, if the result is pass then it doesn't
         # need one of these statuses
         if data['result'] == 'skip':
-            etree.SubElement(element, 'skipped')
+            res = etree.SubElement(element, 'skipped')

         elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
             if expected_result == "failure":
                 err.text += "\n\nWARN: passing test as an expected failure"
+                res = etree.SubElement(element, 'skipped', message='expected failure')
             else:
                 res = etree.SubElement(element, 'failure')

         elif data['result'] == 'crash':
             if expected_result == "error":
                 err.text += "\n\nWARN: passing test as an expected crash"
+                res = etree.SubElement(element, 'skipped', message='expected crash')
             else:
                 res = etree.SubElement(element, 'error')

--
2.1.0
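To see the effect of the patch outside piglit, the status mapping can be reproduced in a small standalone sketch. This is not the backend code itself: the helper name `add_status` and the test name are hypothetical, and it uses the standard library's `xml.etree.ElementTree` on the assumption that it behaves like the `etree` module in the patch for these calls.

```python
import xml.etree.ElementTree as etree


def add_status(element, result, expected_result=None):
    """Attach a JUnit status child to a <testcase> element.

    Mirrors the branching in the patch: expected failures and expected
    crashes become <skipped message="..."/>. Returns the new child, or
    None when the result needs no status element (a plain pass).
    """
    if result == 'skip':
        return etree.SubElement(element, 'skipped')
    if result in ('warn', 'fail', 'dmesg-warn', 'dmesg-fail'):
        if expected_result == 'failure':
            return etree.SubElement(element, 'skipped',
                                    message='expected failure')
        return etree.SubElement(element, 'failure')
    if result == 'crash':
        if expected_result == 'error':
            return etree.SubElement(element, 'skipped',
                                    message='expected crash')
        return etree.SubElement(element, 'error')
    return None


# Hypothetical test case that fails, but is listed as an expected failure.
case = etree.Element('testcase', name='hypothetical.test')
add_status(case, 'fail', expected_result='failure')
print(etree.tostring(case, encoding='unicode'))
```

The serialized test case carries a `skipped` child with the `message` attribute, which is what lets a JUnit consumer such as Jenkins count the test as skipped instead of passed.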