Jose Fonseca <jfons...@vmware.com> writes:

> Thanks for the reviews.
>
> My setup is not as complicated as yours. Mostly because my main focus
> has been llvmpipe, and we're not actively adding support for new OpenGL
> extensions for it at this moment, so most of the new piglit tests
> either skip or pass. New failures are relatively rare.

One of the lessons I learned with Jenkins is that if the automation is
complex, it should live in git and not in Jenkins projects.  Our Jenkins
jobs typically just execute a single script with a constrained set of
parameters.  This facilitates testing and debugging without invoking
builds on the Jenkins instance.  It is also much easier to handle
development branches by branching your scripts, as opposed to cloning
Jenkins projects.

We orchestrate multi-platform jobs with python via the Jenkins Remote
Access API.  Binaries and test results are communicated between builds
via a shared drive.
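
Roughly, that orchestration boils down to calls like the one below.
This is only a sketch of the general idea, not our actual scripts; the
job name, parameters, and credentials are placeholders:

#!/usr/bin/env python
# Sketch: trigger a parameterized Jenkins build through the Remote
# Access API and wait for it to leave the build queue.
# Job name, parameters and credentials are placeholders.

import time
import requests

JENKINS_URL = "https://jenkins.example.com"
AUTH = ("builder", "api-token")          # user name + API token

def trigger_build(job, params):
    """POST to buildWithParameters; returns the queue item URL."""
    resp = requests.post(
        "%s/job/%s/buildWithParameters" % (JENKINS_URL, job),
        params=params, auth=AUTH)
    resp.raise_for_status()
    return resp.headers["Location"]      # URL of the queue item

def wait_for_build(queue_url):
    """Poll the queue item until Jenkins assigns a build to it."""
    while True:
        item = requests.get(queue_url.rstrip("/") + "/api/json",
                            auth=AUTH).json()
        if item.get("executable"):
            return item["executable"]["url"]
        time.sleep(10)

if __name__ == "__main__":
    queue = trigger_build("piglit-quick-test",
                          {"platform": "hsw", "mesa_rev": "origin/master"})
    print("build started: " + wait_for_build(queue))

In practice you also want error handling and time-outs around this, and
a step that picks the results up from the shared drive afterwards.
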
> Before, I used
> https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it
> allows controlling if/when emails are sent with great flexibility. In
> particular it allows sending emails for new regressions, but not for
> tests that were failing previously.
>
> Still, nothing beats being able to look at a bunch of test jobs and
> immediately tell all blue == all good. Although I have used jenkins for
> many years now, this is a lesson I only learned recently -- it's better
> to mask out expected failures somehow and get a boolean "all pass" or
> "fail" for the whole test suite, than trying to track pass/fail for
> individual tests. The latter just doesn't scale...
>
> We do have internal branches where we run piglit. I confess I don't
> have a good solution yet. Your trick of maintaining a database of which
> git commit each test was added in is quite neat. Another thing worth
> considering would be to branch or tag piglit whenever Mesa is branched,
> and keep using a matching (and unchanging) piglit commit.

Tagging piglit is a simpler solution than what I've done.  The primary
use case, though, is for developers to test their branches before they
send them to the mailing list.  Without a recent rebase, their branches
will typically report failures on tests which were fixed on master.

> We also run testsuites through different APIs (namely D3D9/10). These
> testsuites rarely get updated, and llvmpipe conformance is actually
> quite good to start with, so it's easy to get "all pass" there.
>
> Piglit, by being continuously updated/extended, is indeed more of a
> challenge than other testsuites.
>
> We also use piglit for testing our OpenGL guest driver, but we use an
> internal testing infrastructure to drive it, not Jenkins. So our
> experiences there don't apply.
>
> I also have a few benchmarks on jenkins. Again, I only keep track of
> performance metrics via the Jenkins Plots and Measurements plugins, but
> I don't produce pass/fail based on those metrics. I am however
> considering doing something of the sort -- e.g., getting the history of
> the metrics via the jenkins JSON API, fitting it to a probability
> distribution, and failing when performance goes below a given
> percentile.

I'm curious to know which benchmarks you use.  We experience variability
in a lot of benchmarks, especially if they are cpu-heavy and running on
an under-powered system.  It would be great to have workloads that
produce reliable trends without having to run them repeatedly.
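
For what it's worth, the percentile idea you describe could look
something like the sketch below.  It skips the distribution fit and just
uses an empirical percentile, and it assumes each benchmark build
archives a small "metrics.json" artifact with the number of interest;
the job name and artifact layout are made up:

#!/usr/bin/env python
# Sketch: pull the metric history of a job via the Jenkins JSON API and
# fail the current run if it falls below a chosen percentile of recent
# successful builds.  Job name and artifact path are made up.

import sys
import requests

JENKINS_URL = "https://jenkins.example.com"
JOB = "llvmpipe-benchmarks"
PERCENTILE = 5      # fail if slower than 95% of recent history

def metric_history(limit=30):
    builds = requests.get(
        "%s/job/%s/api/json?tree=builds[number,result]"
        % (JENKINS_URL, JOB)).json()["builds"][:limit]
    history = []
    for build in builds:
        if build["result"] != "SUCCESS":
            continue
        art = requests.get("%s/job/%s/%d/artifact/metrics.json"
                           % (JENKINS_URL, JOB, build["number"]))
        if art.ok:
            history.append(art.json()["fps"])
    return history

def acceptable(current_fps):
    history = sorted(metric_history())
    if len(history) < 10:
        return True                      # not enough data to judge
    threshold = history[len(history) * PERCENTILE // 100]
    return current_fps >= threshold

if __name__ == "__main__":
    if not acceptable(float(sys.argv[1])):
        sys.exit("performance regression: below the %dth percentile"
                 % PERCENTILE)

The hard part, as noted above, is getting metrics stable enough that a
threshold like this doesn't fire on noise.
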
thanks for the details!

-Mark

> Jose
>
> On 03/03/15 18:34, Mark Janes wrote:
>> Thanks Jose! This is an improvement.
>>
>> In my experience, broken tests are introduced and fixed in mesa on a
>> daily basis. This has a few consequences:
>>
>>  - On a daily basis, I look at failures and update the expected
>>    pass/fails depending on whether it is a new test or a regression.
>>    Much of this process is automated.
>>
>>  - Branches quickly diverge on the basis of passing/failing tests.
>>    Having separate pass/fail configs on release branches is
>>    unmanageable. To account for this, my automation records the
>>    relevant commit sha as the value in the config file (the key is the
>>    test name). I post-process the junit xml to filter out test
>>    failures with commits that occurred after the branch point.
>>
>>  - For platforms that are too slow to build each checkin, I run an
>>    automated bisect which builds/tests in jenkins, then updates config
>>    files.
>>
>>  - Our platform matrix generates over 350k unskipped tests for each
>>    build. We filter out skipped tests due to the memory consumption on
>>    jenkins when displaying this many tests.
>>
>> I am interested in learning more about your test system, and sharing
>> lessons learned / techniques.
>>
>> -Mark
>>
>> Reviewed-by: Mark Janes <mark.a.ja...@intel.com>
>>
>> Jose Fonseca <jfons...@vmware.com> writes:
>>
>>> I recently tried the junit backend's ability to ignore expected
>>> failures/crashes and found it a godsend -- instead of having to look
>>> at test graph results periodically, I can just tell jenkins to email
>>> me when things go south.
>>>
>>> The only drawback is that by reporting the expected issues as passing
>>> it makes it too easy to forget about them and misinterpret the
>>> pass-rates.  So this change modifies the junit backend to report the
>>> expected issues as skipped, making it more obvious when looking at
>>> the test graphs that these tests are not really passing, and that
>>> whatever functionality they target is not being fully covered.
>>>
>>> This change also makes use of the junit `message` attribute to
>>> explain the reason for the skip. (In fact, we could consider using
>>> the `message` attribute on other kinds of failures to inform the
>>> piglit result, instead of using the non-standard `type`.)
>>> ---
>>>  framework/backends/junit.py | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/framework/backends/junit.py b/framework/backends/junit.py
>>> index 82f9c29..53b6086 100644
>>> --- a/framework/backends/junit.py
>>> +++ b/framework/backends/junit.py
>>> @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
>>>          # Add relevant result value, if the result is pass then it doesn't
>>>          # need one of these statuses
>>>          if data['result'] == 'skip':
>>> -            etree.SubElement(element, 'skipped')
>>> +            res = etree.SubElement(element, 'skipped')
>>>
>>>          elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
>>>              if expected_result == "failure":
>>>                  err.text += "\n\nWARN: passing test as an expected failure"
>>> +                res = etree.SubElement(element, 'skipped', message='expected failure')
>>>              else:
>>>                  res = etree.SubElement(element, 'failure')
>>>
>>>          elif data['result'] == 'crash':
>>>              if expected_result == "error":
>>>                  err.text += "\n\nWARN: passing test as an expected crash"
>>> +                res = etree.SubElement(element, 'skipped', message='expected crash')
>>>              else:
>>>                  res = etree.SubElement(element, 'error')
>>>
>>> --
>>> 2.1.0
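
P.S. Since the junit post-processing I mentioned in my earlier mail came
up: a minimal sketch of that step is below.  The config format (test
name -> sha), the helper names, and the git ancestry check are
simplifications for illustration, not the code we actually run:

#!/usr/bin/env python
# Sketch: demote junit failures whose relevant commit landed after the
# release-branch point, so a branch build is not flagged for failures
# that were only fixed (or introduced) on master.

import subprocess
import xml.etree.ElementTree as ET

def commit_in_branch(sha, branch_point, repo):
    """True if sha is already contained in the branch, i.e. is an
    ancestor of the branch point."""
    return subprocess.call(["git", "-C", repo, "merge-base",
                            "--is-ancestor", sha, branch_point]) == 0

def filter_failures(junit_path, expectations, branch_point, repo):
    """expectations maps test name -> sha of the relevant mesa commit."""
    tree = ET.parse(junit_path)
    for case in tree.iter("testcase"):
        failure = case.find("failure")
        sha = expectations.get(case.get("name"))
        if failure is None or sha is None:
            continue
        if not commit_in_branch(sha, branch_point, repo):
            # The commit postdates the branch point: not a regression on
            # this branch, so report the test as skipped instead.
            case.remove(failure)
            ET.SubElement(case, "skipped",
                          message="commit %s is after the branch point" % sha)
    tree.write(junit_path)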