Albyn's reply is in line with an hypothesis I was beginning to formulate (without looking at the underlying FoRTRAN code), prompted by the hint in '?integrate':
Details: If one or both limits are infinite, the infinite range is mapped onto a finite interval. For a finite interval, globally adaptive interval subdivision is used in connection with extrapolation by the Epsilon algorithm. as a result of some numerical experiments. Following up on Harold Doran's original dnorm(x,mean.mean/10) pattern (and quoting a small subset of the experiments ... ): integrate(function(x) dnorm(x, 100,10), -Inf, Inf) # 1 with absolute error < 0.00012 integrate(function(x) dnorm(x, 200,20), -Inf, Inf) # 1.429508e-08 with absolute error < 2.8e-08 integrate(function(x) dnorm(x, 140,14), -Inf, Inf) # 1 with absolute error < 2.2e-06 integrate(function(x) dnorm(x, 150,15), -Inf, Inf) # 2.261582e-05 with absolute error < 4.4e-05 integrate(function(x) dnorm(x, 144,14.4), -Inf, Inf) # 1 with absolute error < 1.7e-06 integrate(function(x) dnorm(x, 145,14.5), -Inf, Inf) # 5.447138e-05 with absolute error < 0.00011 integrate(function(x) dnorm(x, 150,15), -33000, 33300) # 1 with absolute error < 0.00012 integrate(function(x) dnorm(x, 150,15), -34000, 34300) # 5.239671e-05 with absolute error < 0.00010 integrate(function(x) dnorm(x, 150,15), -33768.260234277, 34068.2602334277) # 0.5000307 with absolute error < 6.1e-05 integrate(function(x) dnorm(x, 150,15), -33768.260234278, 34068.2602334278) # 6.139415e-05 with absolute error < 0.00012 So it seems that, depending on how integrate() "maps to a finite interval", and on how the algorithm goes about its "adaptive interval subdivision", there are critical points where integrate() switches from one to another of the following: [A] The whole of a finite interval containing all but the extreme outer tails of dnorm() is integrated over; [B] The whole of a finite interval containing one half of the distribution of dnorm() is integrated over; [C] The whole of a finite interval lying in the extreme of one tail of the dnorm distribution is integrated over. with result: [A] close to 1; [B] close to 0.5; [C] close to 0. This is compatible with Albyn's conclusion that the integral is split into the sum of (-Inf,0) and (0,Inf), with the algorithm ignoring one, or the other, or both, or neither! This must be one of the nastiest exemplar's of "rounding error" ever!!! Ted. On 02-Dec-10 22:39:37, Albyn Jones wrote: > To really understaand it you will have to look at the fortran code > underlying integrate. I tracked it back through a couple of layers > (dqagi, dqagie, ... just use google, these are old netlib > subroutines) then decided I ought to get back to grading papers :-) > > It looks like the integral is split into the sum of two integrals, > (-Inf,0] and [0,Inf). > > >> integrate(function(x) dnorm(x, 350,50), 0, Inf) > 1 with absolute error < 1.5e-05 >> integrate(function(x) dnorm(x, 400,50), 0, Inf) > 1.068444e-05 with absolute error < 2.1e-05 >> integrate(function(x) dnorm(x, 500,50), 0, Inf) > 8.410947e-11 with absolute error < 1.6e-10 >> integrate(function(x) dnorm(x, 500,50), 0, 1000) > 1 with absolute error < 7.4e-05 > > The integral from 0 to Inf is the lim_{x -> Inf} of the integral > over [0,x]. I suspect that the algorithm is picking an interval > [0,x], evaluating the integral over that interval, and then performing > some test to decide whether to expand the interval. When the initial > interval doesn't contain much, expanding a little won't add enough to > trigger another iteration. > > albyn > > On Thu, Dec 02, 2010 at 04:21:34PM -0500, Doran, Harold wrote: >> The integral of any probability density from -Inf to Inf should equal >> 1, correct? I don't understand last result below. >> >> > integrate(function(x) dnorm(x, 0,1), -Inf, Inf) >> 1 with absolute error < 9.4e-05 >> >> > integrate(function(x) dnorm(x, 100,10), -Inf, Inf) >> 1 with absolute error < 0.00012 >> >> > integrate(function(x) dnorm(x, 500,50), -Inf, Inf) >> 8.410947e-11 with absolute error < 1.6e-10 >> >> > all.equal(integrate(function(x) dnorm(x, 500,50), -Inf, Inf)$value, >> > 0) >> [1] TRUE >> >> > sessionInfo() >> R version 2.10.1 (2009-12-14) >> i386-pc-mingw32 >> >> locale: >> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >> States.1252 >> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >> [5] LC_TIME=English_United States.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] statmod_1.4.6 mlmRev_0.99875-1 lme4_0.999375-35 >> Matrix_0.999375-33 lattice_0.17-26 >> >> loaded via a namespace (and not attached): >> [1] grid_2.10.1 nlme_3.1-96 stats4_2.10.1 tools_2.10.1 >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Albyn Jones > Reed College > jo...@reed.edu > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -------------------------------------------------------------------- E-Mail: (Ted Harding) <ted.hard...@wlandres.net> Fax-to-email: +44 (0)870 094 0861 Date: 02-Dec-10 Time: 23:26:44 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.