On 13 January 2015 at 21:26, Thomas Neidhart <thomas.neidh...@gmail.com> wrote:
> On 01/13/2015 09:01 PM, Phil Steitz wrote:
>> On 1/12/15 3:21 PM, Thomas Neidhart wrote:
>>> On 01/12/2015 11:17 PM, Phil Steitz wrote:
>>>> On 1/12/15 2:30 PM, Thomas Neidhart wrote:
>>>>> On 01/12/2015 10:26 PM, Thomas Neidhart wrote:
>>>>>> On 01/12/2015 08:09 PM, Phil Steitz wrote:
>>>>>>> On 1/12/15 11:37 AM, sebb wrote:
>>>>>>>> On 12 January 2015 at 18:11, Phil Steitz <phil.ste...@gmail.com> wrote:
>>>>>>>>> On 1/12/15 10:50 AM, sebb wrote:
>>>>>>>>>> On 11 January 2015 at 22:10, Phil Steitz <phil.ste...@gmail.com> wrote:
>>>>>>>>>>> On 1/11/15 11:19 AM, Phil Steitz wrote:
>>>>>>>>>>>> On 1/10/15 10:49 PM, Phil Steitz wrote:
>>>>>>>>>>>>> On 1/9/15 6:09 PM, sebb wrote:
>>>>>>>>>>>>>> On 10 January 2015 at 01:01, Phil Steitz <phil.ste...@gmail.com> wrote:
>>>>>>>>>>>>>>> On 1/9/15 5:32 PM, sebb wrote:
>>>>>>>>>>>>>>>> On 9 January 2015 at 23:48, sebb <seb...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> Of the last 6 runs, only 1 had a problem with unit test failures.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> All the builds ran on ubuntu3, apart from the failure which
>>>>>>>>>>>>>>>>> ran on H10.
>>>>>>>>>>>>>>>>> This may have some bearing on the result; I don't yet know.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I had a quick look at 2 tests that failed:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> SimpleRegressionTest.testPerfect
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> SimpleRegressionTest.testPerfectNegative
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Although the test case has some instance data, these particular tests
>>>>>>>>>>>>>>>>> do not use any, so it does not look like a concurrency issue in the
>>>>>>>>>>>>>>>>> unit test itself.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The SimpleRegression class has mutable instance data, but the test
>>>>>>>>>>>>>>>>> cases create their own instance.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I don't know anything about the math functions involved, but it looks
>>>>>>>>>>>>>>>>> as though Infinity might result from getSignificance() if
>>>>>>>>>>>>>>>>> getSlopeStdErr() returns 0, as the latter is used as a divisor. Or if
>>>>>>>>>>>>>>>>> the field sumXX is 0 because that is also used as a divisor.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Maybe the H10 host has different floating point hardware?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'll try running some more tests on H10.
>>>>>>>>>>>>>>>> the build failed again on H10; exactly the same tests failed
>>>>>>>>>>>>>>>> as before:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This test:
>>>>>>>>>>>>>>>> https://builds.apache.org/job/Commons%20Math%20H10/1/console
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Previous failure:
>>>>>>>>>>>>>>>> https://builds.apache.org/job/Commons%20Math/14/console
>>>>>>>>>>>>>>> This is actually a bug. Thanks, sebb (and Jenkins)!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Has been here since 1.x. What is going on is that the data sets
>>>>>>>>>>>>>>> used in the test cases are set up to be perfect linear
>>>>>>>>>>>>>>> relationships, which should in fact lead to mean square error (and
>>>>>>>>>>>>>>> hence slope standard error) equal to 0. The Jenkins box must be
>>>>>>>>>>>>>>> getting exact 0. The funny thing is the test is there to validate
>>>>>>>>>>>>>>> correct performance for models like this. Its success unfortunately
>>>>>>>>>>>>>>> depends on poor precision.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will open a JIRA for this. I don't think it is a release blocker
>>>>>>>>>>>>>>> for 3.4.1, as I am sure you would get the same thing in any earlier
>>>>>>>>>>>>>>> version of [math].
>>>>>>>>>>>>>> OK good to know.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'll leave the H10 Jenkins job for now to make it easy to retest.
>>>>>>>>>>>>> My first guess here was wrong. The infinities are being handled
>>>>>>>>>>>>> correctly for the JDKs I have. Something must be going awry in the
>>>>>>>>>>>>> t distribution cumulative probability computation for +INF on the
>>>>>>>>>>>>> box that is failing. Is there a way to find out exactly what JDK
>>>>>>>>>>>>> and OS version are being used?
>>>>>>>>>>>> I just committed a test that tests the t distribution computations
>>>>>>>>>>>> directly. It seems to have run clean; but the other test ran clean
>>>>>>>>>>>> too. Is there any way to force the build to use the host that fails?
>>>>>>>>>>> I can't make any sense of what is going on with the Jenkins builds.
>>>>>>>>>>> Clean runs and then lots of errors. This one explains the
>>>>>>>>>>> SimpleRegression "problem" (which is not a problem with that class
>>>>>>>>>>> at least)
>>>>>>>>>>>
>>>>>>>>>>> testCumulativeProbablilityExtremes(org.apache.commons.math3.distribution.TDistributionTest)
>>>>>>>>>>> Time elapsed: 0.001 sec <<< FAILURE!
>>>>>>>>>>> java.lang.AssertionError: expected:<1.0> but was:<-Infinity>
>>>>>>>>>>> at org.junit.Assert.fail(Assert.java:88)
>>>>>>>>>>> at org.junit.Assert.failNotEquals(Assert.java:743)
>>>>>>>>>>> at org.junit.Assert.assertEquals(Assert.java:494)
>>>>>>>>>>> at org.junit.Assert.assertEquals(Assert.java:592)
>>>>>>>>>>> at org.apache.commons.math3.distribution.TDistributionTest.testCumulativeProbablilityExtremes(TDistributionTest.java:109)
>>>>>>>>>>>
>>>>>>>>>>> Earlier runs this ran clean. There is nothing non-deterministic
>>>>>>>>>>> about this test (or quite a few of the others that randomly seem to
>>>>>>>>>>> fail).
>>>>>>>>>>>
>>>>>>>>>>> I wonder if we have a bad CPU or something somewhere.
>>>>>>>>>> AFAICS all the failed builds ran on H10.
>>>>>>>>>>
>>>>>>>>>> IMO it is consistent; the apparent randomness comes from the fact that
>>>>>>>>>> there are several Ubuntu hosts, including H10.
>>>>>>>>> Am I reading it / looking at the wrong one, or did this one succeed?
>>>>>>>>>
>>>>>>>>> https://builds.apache.org/view/All/job/Commons%20Math%20H10/6/
>>>>>>>>>
>>>>>>>>> That one was right after I added tests confirming that the t
>>>>>>>>> distribution cum prob handles INFs correctly.
>>>>>>>> That did run on H10 and did succeed; I'd not noticed that one before.
>>>>>>>>
>>>>>>>> I think it is still true that the failures have only occurred on H10.
>>>>>>>>
>>>>>>>> However, the latest one is failing:
>>>>>>>>
>>>>>>>> https://builds.apache.org/job/Commons%20Math/24/console
>>>>>>>>
>>>>>>>> This is on H11 - I think that's the first time H11 has been used.
>>>>>>>>
>>>>>>>> I suppose it's possible that H10 and H11 have a common failing, but it
>>>>>>>> seems less likely.
>>>>>>>>
>>>>>>>> I added a bit more debug - showing the value of sumXX - but that seems
>>>>>>>> OK on H11.
>>>>>>>>
>>>>>>>> I just added a bit more debug.
>>>>>>> I am pretty sure the SimpleRegressionTest failure is actually caused
>>>>>>> by the same thing causing the t-distribution test to fail (the
>>>>>>> reason I added that one).
>>>>>>>
>>>>>>> One that is more straightforward to chase is this one, which fails
>>>>>>> pretty consistently when "bad things happen"
>>>>>>>
>>>>>>> testExpInf(org.apache.commons.math3.complex.ComplexTest) Time elapsed:
>>>>>>> 0.001 sec <<< FAILURE!
>>>>>>> java.lang.AssertionError: expected:<0.0> but was:<Infinity>
>>>>>>> at org.junit.Assert.fail(Assert.java:88)
>>>>>>> at org.junit.Assert.failNotEquals(Assert.java:743)
>>>>>>> at org.junit.Assert.assertEquals(Assert.java:494)
>>>>>>> at org.junit.Assert.assertEquals(Assert.java:592)
>>>>>>> at org.apache.commons.math3.TestUtils.assertSame(TestUtils.java:76)
>>>>>>> at org.apache.commons.math3.TestUtils.assertSame(TestUtils.java:84)
>>>>>>> at org.apache.commons.math3.complex.ComplexTest.testExpInf(ComplexTest.java:788)
>>>>>>>
>>>>>>> I would wager that what is going on here is 0.0 * -INF = INF.
>>>>>> The output returned by the debug statements added by sebb is:
>>>>>>
>>>>>> expReal=Infinity
>>>>>> cosImag=0.5403023058681398
>>>>>> sinImag=0.8414709848078965
>>>>>> result=(Infinity, Infinity)
>>>>>>
>>>>>> while expReal should be -Infinity.
>>>>>>
>>>>>> of course, Math.exp(Infinity) = Infinity.
>>>>> oh stupid mistake, please forget my last post.
>>>>> I messed up expReal with the actual real value.
>>>> But it should be 0, since expReal should be exp(-INF)
>>> just added a bit more debug output to the test and the result is:
>>>
>>> real=-Infinity
>>> -real=2147483647
>>> expReal=Infinity
>>>
>>> according to FastMath.exp(), with these values, the code path should be
>>> as follows:
>>>
>>>     if (x < 0.0) {
>>>         intVal = (int) -x;
>>>
>>>         if (intVal > 746) {
>>>             if (hiPrec != null) {
>>>                 hiPrec[0] = 0.0;
>>>                 hiPrec[1] = 0.0;
>>>             }
>>> -->         return 0.0;
>>>         }
>>>
>>>
>>> but obviously it doesn't do this. I guess we can only inspect the
>>> generated class files for a potential compiler bug.
>>
>> I did a little more poking about last night in the failed tests and
>> the ones I spot-checked could all have had to do with incorrect
>> computations of exp(-INF). What is strange is that the cast you
>> show above is working correctly (compliant with JLS) and the code
>> path should be as you have it there. It seems very strange that
>> just this one code path is sporadically having problems.
>
> You can see the result of various test builds here:
>
> https://builds.apache.org/job/Commons%20Math%20H10/
>
> Every time I added more debug output to FastMath.exp(), the tests succeeded.
>
> I also set up a Jenkins instance with the same Maven / JDK version to
> build commons-math, but could never reproduce an error so far.
>
> Without direct access to one of the failing servers, I doubt that we
> will be able to find / fix this problem.
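For reference, the cast discussed in the quoted excerpt can be checked in isolation. This is only a minimal sketch, not the FastMath source: the class name ExpGuardSketch and the method guardedExp are made up for illustration, they mirror only the early-return branch quoted above, and everything else is assumed to fall through to Math.exp. Under the JLS narrowing rules, casting positive infinity to int gives Integer.MAX_VALUE, which agrees with the "-real=2147483647" debug line.

```java
public class ExpGuardSketch {

    /** Hypothetical stand-in for the quoted branch; not the real FastMath.exp. */
    static double guardedExp(double x) {
        if (x < 0.0) {
            // Narrowing cast: +Infinity becomes Integer.MAX_VALUE (JLS 5.1.3).
            int intVal = (int) -x;
            if (intVal > 746) {
                return 0.0; // the branch marked "-->" in the quoted excerpt
            }
        }
        // Assumption for this sketch: delegate all other inputs to the JDK.
        return Math.exp(x);
    }

    public static void main(String[] args) {
        double real = Double.NEGATIVE_INFINITY;
        System.out.println("-real as int = " + (int) -real);
        System.out.println("guardedExp   = " + guardedExp(real));
    }
}
```

On a JVM that follows the JLS, the guard is taken for -Infinity and the result is 0.0, so a result of Infinity would mean this branch was not executed as written.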
I wonder if it could be a runtime optimiser (JIT) issue?
That might explain why adding debug output affected the outcome.

It might be worth trying the non-debug code with Java 6 or 7.

If the issue is related to Java 5, that could explain why it was not
seen in developer testing - I don't think many devs use Java 5 (or the
incompatibility would not have crept into the release).

> Thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
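One way to probe the runtime optimiser idea would be to call the suspect branch often enough for the JIT to compile it and watch for the first wrong result. The sketch below is an assumption-laden illustration only: JitProbeSketch, guardedExp and firstBadIteration are hypothetical names, the guard is a hand-copied approximation of the quoted excerpt rather than FastMath.exp itself, and there is no guarantee that it reproduces the Jenkins failure.

```java
public class JitProbeSketch {

    /** Hypothetical copy of the quoted guard; not the real FastMath.exp. */
    static double guardedExp(double x) {
        if (x < 0.0) {
            int intVal = (int) -x;
            if (intVal > 746) {
                return 0.0;
            }
        }
        return Math.exp(x);
    }

    /** Returns the index of the first call whose result is not 0.0, or -1 if none. */
    static int firstBadIteration(int iterations) {
        for (int i = 0; i < iterations; i++) {
            if (guardedExp(Double.NEGATIVE_INFINITY) != 0.0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Many iterations, so a JIT-enabled JVM has the chance to compile the loop.
        int bad = firstBadIteration(1000000);
        System.out.println(bad < 0 ? "no bad result seen"
                                   : "first bad result at call " + bad);
    }
}
```

Running it once normally and once with the HotSpot option -Xint (interpreter only) on the failing host would show whether the outcome depends on compiled code.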