[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069196#comment-13069196 ] Mikkel Meyer Andersen commented on MATH-585: Fixed in revision 1149350. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch, > MATH585-MT2001-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052035#comment-13052035 ] Mikkel Meyer Andersen commented on MATH-585: Thanks for the comments. While working with the constants, I fell over a new and extremely interesting paper [1] "A Simple Method for Generating Gamma Variables" from 2001 by Marsaglia and Tsang (I'm tempted to ask: who else? :-)). It relies on a fast Gaussian random generator (which I believe we have). It is equivalent to the speed of the Gaussian random generator. Furthermore, it is really simple to implement. So in short, it looks very interesting. Actually, it also seems faster than Ahrens and Dieter (1982). A few samples (using JDKRandomGenerator): {quote} Generating 100 random gamma(1.0, 1.1) with mean 1.1 and var 1.2102 Marsaglia and Tsang (2001) took 269.0 ms. with mean 1.0997977261198317 and var 1.2113772336053434 Ahrens and Dieter (1982) took 358.0 ms. with mean 1.1003522016214415 and var 1.212513390495578 Generating 100 random gamma(1.5, 2.0) with mean 3.0 and var 6.0 Marsaglia and Tsang (2001) took 279.0 ms. with mean 2.998893072520475 and var 5.978837507498615 Ahrens and Dieter (1982) took 357.0 ms. with mean 2.9975071864627463 and var 5.999613004242815 Generating 100 random gamma(15.0, 1.0) with mean 15.0 and var 15.0 Marsaglia and Tsang (2001) took 259.0 ms. with mean 14.997425134380348 and var 15.006604469206364 Ahrens and Dieter (1982) took 278.0 ms. with mean 14.993230125478366 and var 14.984980114735736 {quote} It is implemented as follows: {code} final double d = shape - 0.33; final double c = 1.0 / (3*FastMath.sqrt(d)); do { final double x = generator.nextGaussian(); final double v = (1+c*x)*(1+c*x)*(1+c*x); if (v <= 0) { continue; } final double xx = x*x; final double u = this.nextUniform(0, 1); // Squeeze if (u < 1 - 0.0331*xx*xx) { return scale*d*v; } if (FastMath.log(u) < 0.5*xx + d*(1 - v + FastMath.log(v))) { return scale*d*v; } } while (true); {code} The algorithm is also usable for shape < 1, but Ahrens and Dieter (1974) seems better in that case. What do you think about using Marsaglia and Tsang (2001) instead of Ahrens and Dieter (1982)? I now prefer Marsaglia and Tsang (2001) :-). Phil, in regards to freely available online reference, I agree that it would be great, but I also think that it is second to choosing a good algorithm. [1]: http://dx.doi.org/10.1145/358407.358414 > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051708#comment-13051708 ] Phil Steitz commented on MATH-585: -- Wow, are having too much fun with this, Mikkel :)) Nice work! I agree with Luc's comments and wonder if it might actually be better to detach the inner class entirely and use composition instead - i.e. have RandomDataImpl (lazily) construct and hold a reference to one (or two) GammaRejectionSampler(s), passing itself as a constructor argument to the cached sampler instance(s). I don't follow exactly how you are planning to implement the caching otherwise, since a single RandomDataImpl may get requests that split the parameter space. Also, there is an invariant stated in the RandomDataImpl class javadoc that all of the variates that it sources come from the same (pluggble, reseedable) RandomGenerator. I also agree with Luc on the exception typing and advertising. We need to understand what exactly can cause the raw MathExceptions and type what we throw accordingly. Also, I agree with Luc on advertising the NotStrictlyPositiveExceptions (or the superclass, IAE) for bad parameters. One more thing. It would be great if there were a freely available online reference somewhere. Not a requirement, but it makes it easier for more people to join in the fun if the algorithm docs are freely available. Thanks for jumping on this. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051702#comment-13051702 ] Mikkel Meyer Andersen commented on MATH-585: Now I understand. Originally GammaRejectionSampler was not an inner class but its own, and after I made it inner I did not give it as much thought as I should have. Thanks for clarifying. I like the solution of removing the reference to the outer class in the constructor best. I'm working on the constants - it's not as easy as I thought, but the hunt is kind of fun :-). > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051663#comment-13051663 ] Luc Maisonobe commented on MATH-585: I was not suggesting a static field gammaRejectionSampler but a static inner class GammaRejectionSampler since you already pass the outer class as an argument to the constructor of the inner class. You could also keep the inner class non-static but in this case you should not pass the reference to the outer class as a constructor argument and store it by yourself, but you should rely on the fact a non-static class is automatically bound to the instance of the outer class it belongs to and it does already have access to the outer class private fields. As you use randomData, this is perhaps a better solution than having a static inner class. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051560#comment-13051560 ] Mikkel Meyer Andersen commented on MATH-585: The idea was that if I had to sample from both Gamma(1, 2) and Gamma(10, 34) I would instantiate two RandomDataImpl - one for sampling from Gamma(1, 2) and one for sampling from Gamma(10, 34). This would be efficient due to caching mechanisms in the inner class GammaRejectionSampler. If GammaRejectionSampler is made static, the two RandomDataImpl's would share the same GammaRejectionSampler, hence the caching would not be exploited, which is why it isn't static. I am not that used to using inner classes, so I might have misunderstood something - if so, please correct me. Some of the constants are taken directly from the paper. I'll look into generating them with a larger number of digits. Thanks for pointing out the exception-corrections. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051519#comment-13051519 ] Luc Maisonobe commented on MATH-585: Here are my comments: Shouldn't the inner class be static as well ? It seems independent from its upper class. The constants seem to have few digits, could they be precomputed with a larger number of digits ? The nextGamma should advertise throwing NotStrictlyPositiveException. The two "throw new MathException()" should really be replaced by something else. Either something related to nonconvergence or even a MathInternalError if the correspondng cases should never happeN > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch, MATH585-4-gamma.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047265#comment-13047265 ] Mikkel Meyer Andersen commented on MATH-585: Darren, exactly. I'm also thinking about an inner class, but it's quite easy to change so I'll wait to raise that question until I have written some tests and get closer to committing. Yes, it indeed took some time - but as you might imagine, it was good fun :-). Brent, thanks for the proposals. They actually improved the algorithm slightly - around 7% in the (too) small tests I made (5 samples on each with one choice of parameters). Tests from before the change was made: {quote} Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 502.0 ms. with mean 104.88707585478387 and var 471.92840503136534 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 483.0 ms. with mean 104.82077923605411 and var 470.1964562913125 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 542.0 ms. with mean 104.85678418999862 and var 471.6356319869918 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 471.0 ms. with mean 104.86904621863894 and var 472.21151252978103 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 491.0 ms. with mean 104.83920614619137 and var 473.1249434489607 {quote} and afterwards {quote} Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 447.0 ms. with mean 104.87373894447317 and var 471.39010480694253 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 455.0 ms. with mean 104.85845231810445 and var 472.42913996929894 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 482.0 ms. with mean 104.8721506023297 and var 471.1563267884988 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 460.0 ms. with mean 104.90256013379438 and var 472.7495176399378 Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 471.825005 nextGamma took 461.0 ms. with mean 104.82129306938253 and var 470.8630399950795 {quote} {code} > x1 <- c(502, 483, 542, 471, 491) > x2 <- c(447, 455, 482, 460, 461) > mean(x1)-mean(x2) [1] 36.8 > (mean(x2)-mean(x1))/mean(x1) [1] -0.07392527 {code} > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047223#comment-13047223 ] Brent Worden commented on MATH-585: --- Here is a performance improvement suggestion to the implementation in the patch: Since the loop in calculateQ is always a fixed number of iterations, eliminate it and compute the polynomial directly. Also, instead of computing is as a1*v + a2*v^2 + a3*v^3 + ... a9*v^9 use the equivalent form v*(a1 + v*(a2 + v*(a3 + ... + v*(a8 + a9*v. This method eliminates about half of the multiplications needed to compute the polynomial. Apply the same technique to the loops found in Step 4 (lines 244-250 of MATH585-1.patch) and in Step 11 (lines 307-318). > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047176#comment-13047176 ] Darren Wilkinson commented on MATH-585: --- So, I am not very good at reading patches, and I am not set up right now to be able to apply it, but if I understand correctly, I think it looks fine. The idea is that there is just a single instance of the gamma class associated with each random number stream, and the reason for having it is to keep all of the constants and variables out of the name space of the random number stream. This makes sense. It probably could be done with an inner class, which might be slightly cleaner, but I do not feel strongly about it. Caching stuff is always a worry for thread safety, but is clearly important for performance. I'm guessing that the random number streams are not thread safe anyway, and that it is assumed that threads each have their own random number streams (preferably seeded differently!). Thanks again for making this a priority. You say it was quite easy, but I'm sure it was a lot of work! > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047112#comment-13047112 ] Darren Wilkinson commented on MATH-585: --- That's great. Even if it makes little difference to the gamma generator, it is worth speeding up exponential generation - it is very important in many applications. In fact, in my own applications it is even more important than (general) gamma generation... > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047110#comment-13047110 ] Mikkel Meyer Andersen commented on MATH-585: Algorithm SA (Ahrens) from p. 876 in the above is faster than the inversion-based exponential sampling. A couple of tests: Generating 10 random Exp(1.0) with mean 1.0 and var 1.0 nextOldExponential took 120.0 ms. with mean 0.9962712206756855 and var 0.9983243442732745 nextExponential took 20.0 ms. with mean 0.9958227425469129 and var 0.9990279223847179 Generating 1000 random Exp(1.0) with mean 1.0 and var 1.0 nextOldExponential took 671.0 ms. with mean 1.0009038221005877 and var 1.001436791894477 nextExponential took 548.0 ms. with mean 1.246277478757 and var 0.9996828540541313 Generating 1000 random Exp(12.23) with mean 12.23 and var 149.5729 nextOldExponential took 671.0 ms. with mean 12.229128607001826 and var 149.4295066207314 nextExponential took 543.0 ms. with mean 12.233285138487334 and var 149.55112465876087 Generating 1000 random Exp(0.0013) with mean 0.0013 and var 1.69E-6 nextOldExponential took 669.0 ms. with mean 0.0012998098500627037 and var 1.6880018139800875E-6 nextExponential took 544.0 ms. with mean 0.0013003604434494178 and var 1.6893988569492528E-6 Generating 1000 random Exp(6.0) with mean 6.0 and var 36.0 nextOldExponential took 687.0 ms. with mean 5.58564653789 and var 36.008520922357725 nextExponential took 606.0 ms. with mean 5.997436087619412 and var 35.95429173708095 I'm not sure how much it will improve the new Gamma, because exponentials are rarely sampled (only in about 1-2% in average of the times, I guess). > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047015#comment-13047015 ] Mikkel Meyer Andersen commented on MATH-585: It is probably possible to gain a bit more by using (reference found in R's ?rexp) {quote} Ahrens, J. H. and Dieter, U. (1972). Computer methods for sampling from the exponential and normal distributions. _Communications of the ACM_, *15*, 873-882. {quote} instead of the currently simple inversion method to sample from the exponential distribution (step 8-9). > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson >Assignee: Mikkel Meyer Andersen > Labels: Gamma, Random > Attachments: MATH585-1.patch > > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046528#comment-13046528 ] Darren Wilkinson commented on MATH-585: --- That looks great. I will be happy to do a bit of testing when the new version makes it into the snapshot. Thanks! > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson > Labels: Gamma, Random > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046520#comment-13046520 ] Mikkel Meyer Andersen commented on MATH-585: To above is with shape = 13.3 and scale = 4.3 > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson > Labels: Gamma, Random > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046519#comment-13046519 ] Mikkel Meyer Andersen commented on MATH-585: Just for your info: I'm working on implementing the methods described in the papers, and they are not too difficult. I need to do some cleaning, and then I'll upload a patch within a few weeks. Current status of a test run: Generating 100 random gamma with mean 57.19 and var 245.916997 nextGamma done nextGamma took 23698.0 ms. with mean 57.17778720638469 and var 246.443482942482 nextNewGamma done nextNewGamma took 230.0 ms. with mean 57.20940243394036 and var 245.80250080092702 So the performance gain is around 100 which isn't too bad :-). > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson > Labels: Gamma, Random > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045326#comment-13045326 ] Mikkel Meyer Andersen commented on MATH-585: I know that it is indeed important, and I will look into it as soon as time allows me to. R is GPL, so using their implementation is a no-go. But the help file provides the methods they have implemented: {quote} ‘rgamma’ for ‘shape >= 1’ uses Ahrens, J. H. and Dieter, U. (1982). Generating gamma variates by a modified rejection technique. _Communications of the ACM_, *25*, 47-54, and for ‘0 < shape < 1’ uses Ahrens, J. H. and Dieter, U. (1974). Computer methods for sampling from gamma, beta, Poisson and binomial distributions. _Computing_, *12*, 223-246. {quote} Again, thanks for reporting this - we'll do our best to improve our implementation. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson > Labels: Gamma, Random > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045058#comment-13045058 ] Darren Wilkinson commented on MATH-585: --- It is not urgent in the sense that I can use Parallel COLT! :-/ But it is quite a big deal for anyone wanting to use Commons Math and doing serious Monte Carlo work. Not only is the gamma distribution itself very important, but the gamma generator is used as the basis for simulating variates from many other distributions (including inverse gamma, chi-square, inverse chi-square, beta, Dirichlet, ...), so a slow gamma generator has lots of knock-on effects. It is certainly the next most important non-uniform generator to provide after Gaussian. It is not trivial to implement well, so I recommend copying or porting another implementation if possible, though I'm not an expert on open source software licenses so I don't know how practical that is. The GSL and R are GNU projects which have good C implementations, which I think are based on slightly better algorithms than the Ahrens and Dieter algorithm you mention above. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson > Labels: Gamma, Random > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MATH-585) Very slow generation of gamma random variates
[ https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044789#comment-13044789 ] Mikkel Meyer Andersen commented on MATH-585: Thank you very much for reporting this. I have looked in the documentation of cern.jet.random.tdouble.Gamma in the Parallel COLT, and they write: {quote} Method: Acceptance Rejection combined with Acceptance Complement. High performance implementation. This is a port of RandGamma used in CLHEP 1.4.0 (C++). CLHEP's implementation, in turn, is based on gds.c from the C-RAND / WIN-RAND library. C-RAND's implementation, in turn, is based upon J.H. Ahrens, U. Dieter (1974): Computer methods for sampling from gamma, beta, Poisson and binomial distributions, Computing 12, 223-246. and J.H. Ahrens, U. Dieter (1982): Generating gamma variates by a modified rejection technique, Communications of the ACM 25, 47-54. {quote} Looking in the ChangeLog file in the source for version 2.1.0.1 of CLHEP, I found that "CLHEP now uses the LGPL license.". As far as I can tell, LGPL is not compatible with Apache License which Commons Math is under. This means that we must implement the method ourselves, e.g. based on the papers referred to above. How urgent is your need for this speed-up? Unfortunately I'm quite busy for the next months, but please submit a patch if possible. > Very slow generation of gamma random variates > - > > Key: MATH-585 > URL: https://issues.apache.org/jira/browse/MATH-585 > Project: Commons Math > Issue Type: Improvement >Affects Versions: 2.2, 3.0 > Environment: All >Reporter: Darren Wilkinson > Labels: Gamma, Random > Original Estimate: 6h > Remaining Estimate: 6h > > The current implementation of gamma random variate generation works, but uses > an inversion method. This is well-known to be a bad idea. Usually a carefully > constructed rejection procedure is used. To give an idea of the magnitude of > the problem, the Gamma variate generation in Parallel COLT is roughly 50 > times faster than in Commons Math. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira