I agree that the increase in existing errors in the derby tests is a concern. To keep everyone productive it is important that we get
as close as possible to 0 errors.  Note that I believe some of these
failures are happening even though developers are running the tests,
because of:
1) intermittent timing/machine environment differences
2) jvm/environment differences
3) just a problem with a commit where something was missed.

If a diff persists for more than a day or so, I think the best thing to do
is to make sure an individual JIRA issue is filed, and committers
should make sure it is addressed by checking when the problem appeared
and which submission caused it. For instance, I thought the process worked well for DERBY-810, which was intermittent (I never saw it reproduce): a JIRA issue was filed, I realized it was a new test I had committed, and I followed up and found that the original submitter was working on it.

I wonder if there is any way JIRA can be used to make it easy to query
for known problems affecting test results.  Or maybe this is the job
of some entry on the web site?


Sometimes we are going to need help
from people with access to different environments if the problem is
specific to those environments.  For instance, I mostly only have access
to Windows.  I never see the Sun OS specific failures, and often don't see
ones that only reproduce on fast or multi-CPU machines.  I do have access to
a wide range of JVMs running on Windows.

I agree that any diff makes it hard for a submitter to know he is not breaking anything. It would be nice if test JIRA issues were treated as very high priority and picked up quickly. And if the fix looks hard, it may be reasonable to disable the test while fixing it, so as not to cause pain
to others in the community.  But there are exceptions. I know I have
sometimes left intermittently failing tests in, because I just could not
reproduce the failure and I hoped to gather some information from others seeing
the diff (sometimes putting extra debugging in to dump info out). In this case it should be advertised to the community and documented as
such in the JIRA entry.

Kristian Waagan wrote:
David W. Van Couvering wrote:

I got some test failures in derbynetclient mats, so I checked the tinderbox. We have quite a number of failures on the latest revision that ran tests, 370061, running on Solaris 10 x86:

http://www.multinet.no/~solberg/public/Apache/TinderBox_Derby/Limited/testSummary-370061.html

derbylang: 1 failure
derbytools: 1 failure
encryptionBlowfish: 1 failure
i18nTest: 1 failure
jdbcapi: 1 failure
derbynetclientmats: 3 failures
derbynetmats: 2 failures
encryptionAll: 2 failures
encryption: 1 failure

derbyall: 12 failures


I just had a quick look at the errors. There were multiple IOExceptions regarding denied access to a file. I saw an out of heap space error and some file not found exceptions. I remember that a filename was changed recently (wisconsin?). I am a bit concerned that something may have gone wrong in/with the test environment, so I would be reassured by another run to see if some of the errors disappear.

If the errors don't go away, I guess we have forgotten to update the policy file and some tests.

I agree with you David that derbyall should in principle run without errors. I tried to fix a test that had been failing for a long time on Solaris 10. I would appreciate comments on the issue: https://issues.apache.org/jira/browse/DERBY-788

I would also encourage people to close Jira issues after the fix has been verified. It helps a lot if we are able to describe how the bug can be reproduced when we create an issue. When it comes to derbyall, I expect that the committers have run it before committing, and hopefully that the developer has done so as well before the patch was submitted. Maybe it's just me, but in my previous work with Jira, an issue wasn't done until it was closed.



regards,
--
Kristian

This seems a bit much; it's hard for developers to know if their own changes are valid when they have to sift through all the existing failures. The last Solaris 10 x86 regression test run on revision 369861 had only 2 failures.

Looking at the derbyall history, at least on XP it seems to be getting worse and worse (from 7 failures on 1/6 to 12 failures on 1/17). On Solaris 10 it's gone from 1 failure to 5 failures, while on Linux it's stayed steady.

Can those of us who have checked in/contributed patches lately please look at the failures and see if you recognize what might be causing them?

Also: is anybody keeping an eye on these and raising a flag if the tests start failing? I thought we had a pretty strict rule that we should have 100% pass on derbyall.

Thanks,

David




