Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.

2015-03-03 Thread Dylan Baker
Acked-by: Dylan Baker baker.dyla...@gmail.com

On Tue, Mar 03, 2015 at 02:47:59PM +, Jose Fonseca wrote:
 I recently tried the junit backend's ability to ignore expected
 failures/crashes and found it a godsend -- instead of having to look at
 test graph results periodically, I can just tell Jenkins to email me
 when things go south.
 
 The only drawback is that by reporting the expected issues as passing it
 makes it too easy to forget about them and misinterpret the pass-rates.
 So this change modifies the junit backend to report the expected issues
 as skipped, making it more obvious when looking at the test graphs that
 these tests are not really passing, and that whatever functionality they
 target is not being fully covered.
 
 This change also makes use of the junit `message` attribute to explain
 the reason for the skip.  (In fact, we could consider using the `message`
 attribute on other kinds of failures to convey the piglit result, instead
 of using the non-standard `type`.)
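 
 For illustration only (not part of the patch): a minimal sketch of what
 the backend would emit for an expected failure, using plain
 xml.etree.ElementTree and a made-up test name:
 
     import xml.etree.ElementTree as etree
 
     element = etree.Element('testcase', name='spec@example@some-test')
     etree.SubElement(element, 'skipped', message='expected failure')
     print(etree.tostring(element).decode())
     # -> <testcase name="spec@example@some-test"><skipped message="expected failure" /></testcase>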
 ---
  framework/backends/junit.py | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)
 
 diff --git a/framework/backends/junit.py b/framework/backends/junit.py
 index 82f9c29..53b6086 100644
 --- a/framework/backends/junit.py
 +++ b/framework/backends/junit.py
 @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
          # Add relevant result value, if the result is pass then it doesn't
          # need one of these statuses
          if data['result'] == 'skip':
 -            etree.SubElement(element, 'skipped')
 +            res = etree.SubElement(element, 'skipped')
  
          elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
              if expected_result == "failure":
                  err.text += "\n\nWARN: passing test as an expected failure"
 +                res = etree.SubElement(element, 'skipped', message='expected failure')
              else:
                  res = etree.SubElement(element, 'failure')
  
          elif data['result'] == 'crash':
              if expected_result == "error":
                  err.text += "\n\nWARN: passing test as an expected crash"
 +                res = etree.SubElement(element, 'skipped', message='expected crash')
              else:
                  res = etree.SubElement(element, 'error')
  
 -- 
 2.1.0
 




Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.

2015-03-03 Thread Mark Janes
Thanks Jose! This is an improvement.

In my experience, broken tests are introduced and fixed in Mesa on a
daily basis.  This has a few consequences:

 - On a daily basis, I look at failures and update the expected
   pass/fails depending on whether each is a new test or a regression.
   Much of this process is automated.

 - Branches quickly diverge on the basis of passing/failing tests.
   Having separate pass/fail configs on release branches is
   unmanageable.  To account for this, my automation records the
   relevant commit SHA as the value in the config file (the key is the
   test name).  I post-process the JUnit XML to filter out test failures
   whose commits occurred after the branch point (a rough sketch of this
   step follows after this list).

 - For platforms that are too slow to build each checkin, I run an
   automated bisect which builds/tests in Jenkins, then updates the
   config files.

 - Our platform matrix generates over 350k unskipped tests for each
   build.  We filter out skipped tests due to the memory consumption on
   Jenkins when displaying this many tests.
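
A rough sketch of the JUnit-filtering step mentioned in the second item
above -- the config format, helper names, and paths are assumptions for
illustration, not the actual tooling:

    import subprocess
    import xml.etree.ElementTree as etree

    def predates_branch(sha, branch_point, repo):
        # Exit status 0 means `sha` is already reachable from the branch point.
        return subprocess.call(['git', '-C', repo, 'merge-base',
                                '--is-ancestor', sha, branch_point]) == 0

    def filter_new_failures(junit_path, expected, branch_point, repo):
        # `expected` is the config mapping: {test name: commit SHA}.
        tree = etree.parse(junit_path)
        for case in tree.getroot().iter('testcase'):
            sha = expected.get(case.get('name'))
            if sha and not predates_branch(sha, branch_point, repo):
                # The commit landed after the branch point, so this failure
                # is not relevant to the release branch: drop the markers.
                for tag in ('failure', 'error'):
                    for elem in case.findall(tag):
                        case.remove(elem)
        tree.write(junit_path)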

I am interested in learning more about your test system, and sharing
lessons learned / techniques.

-Mark

Reviewed-by: Mark Janes mark.a.ja...@intel.com

Jose Fonseca jfons...@vmware.com writes:

 I recently tried the junit backend's ability to ignore expected
 failures/crashes and found it a godsend -- instead of having to look at
 test graph results periodically, I can just tell Jenkins to email me
 when things go south.

 The only drawback is that by reporting the expected issues as passing it
 makes it too easy to forget about them and misinterpret the pass-rates.
 So this change modifies the junit backend to report the expected issues
 as skipped, making it more obvious when looking at the test graphs that
 these tests are not really passing, and that whatever functionality they
 target is not being fully covered.

 This change also makes use of the junit `message` attribute to explain
 the reason for the skip.  (In fact, we could consider using the `message`
 attribute on other kinds of failures to convey the piglit result, instead
 of using the non-standard `type`.)
 ---
  framework/backends/junit.py | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

 diff --git a/framework/backends/junit.py b/framework/backends/junit.py
 index 82f9c29..53b6086 100644
 --- a/framework/backends/junit.py
 +++ b/framework/backends/junit.py
 @@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
          # Add relevant result value, if the result is pass then it doesn't
          # need one of these statuses
          if data['result'] == 'skip':
 -            etree.SubElement(element, 'skipped')
 +            res = etree.SubElement(element, 'skipped')
  
          elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
              if expected_result == "failure":
                  err.text += "\n\nWARN: passing test as an expected failure"
 +                res = etree.SubElement(element, 'skipped', message='expected failure')
              else:
                  res = etree.SubElement(element, 'failure')
  
          elif data['result'] == 'crash':
              if expected_result == "error":
                  err.text += "\n\nWARN: passing test as an expected crash"
 +                res = etree.SubElement(element, 'skipped', message='expected crash')
              else:
                  res = etree.SubElement(element, 'error')
  
 -- 
 2.1.0


Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.

2015-03-03 Thread Mark Janes
Jose Fonseca jfons...@vmware.com writes:

 Thanks for the reviews.


 My setup is not as complicated as yours, mostly because my main focus
 has been llvmpipe, and we're not actively adding support for new OpenGL
 extensions for it at the moment, so most of the new piglit tests either
 skip or pass.  New failures are relatively rare.

One of the lessons I learned with Jenkins is that if the automation is
complex, it should be in git and not in Jenkins projects.  Our Jenkins
jobs typically just execute a single script with a constrained set of
parameters.

This facilitates testing/debugging without invoking builds on the
Jenkins instance.  It is also much easier to handle development branches
by branching your scripts, as opposed to cloning Jenkins projects.

We orchestrate multi-platform jobs with Python via the Jenkins Remote
Access API.  Binaries and test results are communicated between builds
via a shared drive.
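
A minimal sketch of that kind of orchestration -- the server URL, job
names, parameters, and credentials are made up, and CSRF crumbs and
error handling are omitted:

    import requests

    JENKINS = 'https://jenkins.example.com'
    AUTH = ('builder', 'api-token')

    def trigger(job, **params):
        # POST to the Remote Access API; Jenkins queues a parameterized build.
        r = requests.post('%s/job/%s/buildWithParameters' % (JENKINS, job),
                          auth=AUTH, params=params)
        r.raise_for_status()

    # e.g. build once, then fan the test jobs out across platforms.
    trigger('mesa-build', revision='origin/master')
    for platform in ('platform-a', 'platform-b'):
        trigger('piglit-test', platform=platform, revision='origin/master')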

 Before, I used
 https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it
 allows controlling if/when emails are sent with great flexibility.  In
 particular, it can send emails for new regressions but not for tests
 that were already failing.

 Still, nothing beats being able to look at a bunch of test jobs and
 immediately tell "all blue == all good".  Although I have used Jenkins
 for many years now, this is a lesson I only learned recently -- it's
 better to mask out expected failures somehow and get a boolean "all
 pass" or "fail" for the whole test suite than to try to track pass/fail
 for individual tests.  The latter just doesn't scale...



 We do have internal branches where we run piglit.  I confess I don't
 have a good solution yet.  Your trick of maintaining a database of
 which git commit each test was added in is quite neat.  Another thing
 worth considering would be to branch or tag piglit whenever Mesa is
 branched, and keep using a matching (and unchanging) piglit commit.

Tagging piglit is a simpler solution than what I've done.  The primary
use case, though, is for developers to test their branches before they
send them to the mailing list.  Without a recent rebase, their branches
will typically report failures on tests which were fixed on master.

 We also run testsuites through different APIs (namely D3D9/10).  These
 testsuites rarely get updated, and llvmpipe conformance is actually
 quite good to start with, so it's easy to get an "all pass" there.


 Piglit, by being continuously updated/extended, is indeed more of a 
 challenge than other testsuites.


 We also use piglit for testing our OpenGL guest driver, but we use an
 internal testing infrastructure to drive it, not Jenkins.  So our
 experiences there don't apply.


 I also have a few benchmarks on Jenkins.  Again, I only keep track of
 performance metrics via the Jenkins Plots and Measurements plugins, but
 I don't produce pass/fail results based on those metrics.  I am,
 however, considering doing something of the sort -- e.g., getting the
 history of the metrics via the Jenkins JSON API, fitting it to a
 probability distribution, and failing when performance drops below a
 given percentile.

I'm curious to know which benchmarks you use.  We experience variability
in a lot of benchmarks, especially if they are cpu-heavy and running on
an under-powered system.  It would be great to have workloads that
produce reliable trends without having to run them repeatedly.

Thanks for the details!

-Mark

 Jose


 On 03/03/15 18:34, Mark Janes wrote:
 Thanks Jose! This is an improvement.

 In my experience, broken tests are introduced and fixed in Mesa on a
 daily basis.  This has a few consequences:

  - On a daily basis, I look at failures and update the expected
    pass/fails depending on whether each is a new test or a regression.
    Much of this process is automated.

  - Branches quickly diverge on the basis of passing/failing tests.
    Having separate pass/fail configs on release branches is
    unmanageable.  To account for this, my automation records the
    relevant commit SHA as the value in the config file (the key is the
    test name).  I post-process the JUnit XML to filter out test
    failures whose commits occurred after the branch point.

  - For platforms that are too slow to build each checkin, I run an
    automated bisect which builds/tests in Jenkins, then updates the
    config files.

  - Our platform matrix generates over 350k unskipped tests for each
    build.  We filter out skipped tests due to the memory consumption
    on Jenkins when displaying this many tests.

 I am interested in learning more about your test system, and sharing
 lessons learned / techniques.

 -Mark

 Reviewed-by: Mark Janes mark.a.ja...@intel.com

 Jose Fonseca jfons...@vmware.com writes:

 I recently tried the junit backend's ability to ignore expected
 failures/crashes and found it a godsend -- instead of having to look at
 test graph results periodically, I can just tell Jenkins to email me
 when things go south.


Re: [Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.

2015-03-03 Thread Jose Fonseca

Thanks for the reviews.


My setup is not as complicated as yours, mostly because my main focus
has been llvmpipe, and we're not actively adding support for new OpenGL
extensions for it at the moment, so most of the new piglit tests either
skip or pass.  New failures are relatively rare.


Before, I used
https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it
allows controlling if/when emails are sent with great flexibility.  In
particular, it can send emails for new regressions but not for tests
that were already failing.


Still, nothing beats being able to look at a bunch of test jobs and
immediately tell "all blue == all good".  Although I have used Jenkins
for many years now, this is a lesson I only learned recently -- it's
better to mask out expected failures somehow and get a boolean "all
pass" or "fail" for the whole test suite than to try to track pass/fail
for individual tests.  The latter just doesn't scale...




We do have internal branches where we run piglit.  I confess I don't
have a good solution yet.  Your trick of maintaining a database of
which git commit each test was added in is quite neat.  Another thing
worth considering would be to branch or tag piglit whenever Mesa is
branched, and keep using a matching (and unchanging) piglit commit.




We also run testsuites through different APIs (namely D3D9/10).  These
testsuites rarely get updated, and llvmpipe conformance is actually
quite good to start with, so it's easy to get an "all pass" there.



Piglit, by being continuously updated/extended, is indeed more of a 
challenge than other testsuites.



We also use piglit for testing our OpenGL guest driver, but we use an
internal testing infrastructure to drive it, not Jenkins.  So our
experiences there don't apply.



I also have a few benchmarks on Jenkins.  Again, I only keep track of
performance metrics via the Jenkins Plots and Measurements plugins, but
I don't produce pass/fail results based on those metrics.  I am,
however, considering doing something of the sort -- e.g., getting the
history of the metrics via the Jenkins JSON API, fitting it to a
probability distribution, and failing when performance drops below a
given percentile.
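
Something along these lines, as a rough sketch -- how each build's
metric is stored and fetched below is an assumption, and
statistics.NormalDist needs Python 3.8+:

    import statistics
    import requests

    def metric_history(job_url, metric, limit=50):
        # Assumes each build archives a small JSON artifact with the metric.
        builds = requests.get(job_url + '/api/json',
                              params={'tree': 'builds[number]'}).json()['builds']
        values = []
        for build in builds[:limit]:
            data = requests.get('%s/%d/artifact/%s.json'
                                % (job_url, build['number'], metric)).json()
            values.append(float(data['value']))
        return values

    def check(job_url, metric, current, percentile=5):
        history = metric_history(job_url, metric)
        dist = statistics.NormalDist.from_samples(history)
        threshold = dist.inv_cdf(percentile / 100.0)
        if current < threshold:
            raise SystemExit('%s regressed: %.2f is below the p%d threshold %.2f'
                             % (metric, current, percentile, threshold))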



Jose


On 03/03/15 18:34, Mark Janes wrote:

Thanks Jose! This is an improvement.

In my experience, broken tests are introduced and fixed in Mesa on a
daily basis.  This has a few consequences:

 - On a daily basis, I look at failures and update the expected
   pass/fails depending on whether each is a new test or a regression.
   Much of this process is automated.

 - Branches quickly diverge on the basis of passing/failing tests.
   Having separate pass/fail configs on release branches is
   unmanageable.  To account for this, my automation records the
   relevant commit SHA as the value in the config file (the key is the
   test name).  I post-process the JUnit XML to filter out test failures
   whose commits occurred after the branch point.

 - For platforms that are too slow to build each checkin, I run an
   automated bisect which builds/tests in Jenkins, then updates the
   config files.

 - Our platform matrix generates over 350k unskipped tests for each
   build.  We filter out skipped tests due to the memory consumption on
   Jenkins when displaying this many tests.

I am interested in learning more about your test system, and sharing
lessons learned / techniques.

-Mark

Reviewed-by: Mark Janes mark.a.ja...@intel.com

Jose Fonseca jfons...@vmware.com writes:


I recently tried the junit backend's ability to ignore expected
failures/crashes and found it a godsend -- instead of having to look at
test graph results periodically, I can just tell Jenkins to email me
when things go south.

The only drawback is that by reporting the expected issues as passing it
makes it too easy to forget about them and misinterpret the pass-rates.
So this change modifies the junit backend to report the expected issues
as skipped, making it more obvious when looking at the test graphs that
these tests are not really passing, and that whatever functionality they
target is not being fully covered.

This change also makes use of the junit `message` attribute to explain
the reason for the skip.  (In fact, we could consider using the `message`
attribute on other kinds of failures to convey the piglit result, instead
of using the non-standard `type`.)
---
  framework/backends/junit.py | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/framework/backends/junit.py b/framework/backends/junit.py
index 82f9c29..53b6086 100644
--- a/framework/backends/junit.py
+++ b/framework/backends/junit.py
@@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
         # Add relevant result value, if the result is pass then it doesn't
         # need one of these statuses
         if data['result'] == 'skip':
-            etree.SubElement(element, 'skipped')
+            res = etree.SubElement(element, 'skipped')
 
         elif 

[Piglit] [PATCH] framework/backends/junit: Report expected failures/crashes as skipped.

2015-03-03 Thread Jose Fonseca
I recently tried the junit backend's ability to ignore expected
failures/crashes and found it a godsend -- instead of having to look at
test graph results periodically, I can just tell Jenkins to email me
when things go south.

The only drawback is that by reporting the expected issues as passing it
makes it too easy to forget about them and misinterpret the pass-rates.
So this change modifies the junit backend to report the expected issues
as skipped, making it more obvious when looking at the test graphs that
these tests are not really passing, and that whatever functionality they
target is not being fully covered.

This change also makes use of the junit `message` attribute to explain
the reason for the skip.  (In fact, we could consider using the `message`
attribute on other kinds of failures to convey the piglit result, instead
of using the non-standard `type`.)
---
 framework/backends/junit.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/framework/backends/junit.py b/framework/backends/junit.py
index 82f9c29..53b6086 100644
--- a/framework/backends/junit.py
+++ b/framework/backends/junit.py
@@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
         # Add relevant result value, if the result is pass then it doesn't
         # need one of these statuses
         if data['result'] == 'skip':
-            etree.SubElement(element, 'skipped')
+            res = etree.SubElement(element, 'skipped')
 
         elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
             if expected_result == "failure":
                 err.text += "\n\nWARN: passing test as an expected failure"
+                res = etree.SubElement(element, 'skipped', message='expected failure')
             else:
                 res = etree.SubElement(element, 'failure')
 
         elif data['result'] == 'crash':
             if expected_result == "error":
                 err.text += "\n\nWARN: passing test as an expected crash"
+                res = etree.SubElement(element, 'skipped', message='expected crash')
             else:
                 res = etree.SubElement(element, 'error')
 
-- 
2.1.0

___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit