[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-07-21 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069196#comment-13069196
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Fixed in revision 1149350.

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch, 
> MATH585-MT2001-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-20 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052035#comment-13052035
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Thanks for the comments.

While working with the constants, I fell over a new and extremely interesting 
paper [1] "A Simple Method for Generating Gamma Variables" from 2001 by 
Marsaglia and Tsang (I'm tempted to ask: who else? :-)). It relies on a fast 
Gaussian random generator (which I believe we have). It is equivalent to the 
speed of the Gaussian random generator. Furthermore, it is really simple to 
implement. So in short, it looks very interesting. Actually, it also seems 
faster than Ahrens and Dieter (1982). A few samples (using JDKRandomGenerator):
{quote}
Generating 100 random gamma(1.0, 1.1) with mean 1.1 and var 
1.2102
Marsaglia and Tsang (2001) took 269.0 ms. with mean 1.0997977261198317 and var 
1.2113772336053434
Ahrens and Dieter (1982)   took 358.0 ms. with mean 1.1003522016214415 and var 
1.212513390495578

Generating 100 random gamma(1.5, 2.0) with mean 3.0 and var 6.0
Marsaglia and Tsang (2001) took 279.0 ms. with mean 2.998893072520475 and var 
5.978837507498615
Ahrens and Dieter (1982)   took 357.0 ms. with mean 2.9975071864627463 and var 
5.999613004242815

Generating 100 random gamma(15.0, 1.0) with mean 15.0 and var 15.0
Marsaglia and Tsang (2001) took 259.0 ms. with mean 14.997425134380348 and var 
15.006604469206364
Ahrens and Dieter (1982)   took 278.0 ms. with mean 14.993230125478366 and var 
14.984980114735736
{quote}

It is implemented as follows:
{code}
final double d = shape - 0.33;
final double c = 1.0 / (3*FastMath.sqrt(d));

do {
final double x = generator.nextGaussian();
final double v = (1+c*x)*(1+c*x)*(1+c*x);

if (v <= 0) {
continue;
}

final double xx = x*x;
final double u = this.nextUniform(0, 1);

// Squeeze
if (u < 1 - 0.0331*xx*xx) {
return scale*d*v;
}

if (FastMath.log(u) < 0.5*xx + d*(1 - v + FastMath.log(v))) {
return scale*d*v;
}

} while (true);
{code}
The algorithm is also usable for shape < 1, but Ahrens and Dieter (1974) seems 
better in that case.

What do you think about using Marsaglia and Tsang (2001) instead of Ahrens and 
Dieter (1982)? I now prefer Marsaglia and Tsang (2001) :-).

Phil, in regards to freely available online reference, I agree that it would be 
great, but I also think that it is second to choosing a good algorithm.

[1]: http://dx.doi.org/10.1145/358407.358414

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-19 Thread Phil Steitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051708#comment-13051708
 ] 

Phil Steitz commented on MATH-585:
--

Wow, are having too much fun with this, Mikkel :))  Nice work!

I agree with Luc's comments and wonder if it might actually be better to detach 
the inner class entirely and use composition instead - i.e. have RandomDataImpl 
(lazily) construct and hold a reference to one (or two) 
GammaRejectionSampler(s), passing itself as a constructor argument to the 
cached sampler instance(s).  I don't follow exactly how you are planning to 
implement the caching otherwise, since a single RandomDataImpl may get requests 
that split the parameter space.  Also, there is an invariant stated in the 
RandomDataImpl class javadoc that all of the variates that it sources come from 
the same (pluggble, reseedable) RandomGenerator.

I also agree with Luc on the exception typing and advertising.  We need to 
understand what exactly can cause the raw MathExceptions and type what we throw 
accordingly.  Also, I agree with Luc on advertising the 
NotStrictlyPositiveExceptions (or the superclass, IAE) for bad parameters.

One more thing.  It would be great if there were a freely available online 
reference somewhere.  Not a requirement, but it makes it easier for more people 
to join in the fun if the algorithm docs are freely available.

Thanks for jumping on this. 


> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-19 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051702#comment-13051702
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Now I understand. Originally GammaRejectionSampler was not an inner class but 
its own, and after I made it inner I did not give it as much thought as I 
should have. Thanks for clarifying. I like the solution of removing the 
reference to the outer class in the constructor best.

I'm working on the constants - it's not as easy as I thought, but the hunt is 
kind of fun :-).

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-19 Thread Luc Maisonobe (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051663#comment-13051663
 ] 

Luc Maisonobe commented on MATH-585:


I was not suggesting a static field gammaRejectionSampler but a static inner 
class GammaRejectionSampler since you already pass the outer class as an 
argument to the constructor of the inner class.

You could also keep the inner class non-static but in this case you should not 
pass the reference to the outer class as a constructor argument and store it by 
yourself, but you should rely on the fact a non-static class is automatically 
bound to the instance of the outer class it belongs to and it does already have 
access to the outer class private fields. As you use randomData, this is 
perhaps a better solution than having a static inner class.

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-18 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051560#comment-13051560
 ] 

Mikkel Meyer Andersen commented on MATH-585:


The idea was that if I had to sample from both Gamma(1, 2) and Gamma(10, 34) I 
would instantiate two RandomDataImpl - one for sampling from Gamma(1, 2) and 
one for sampling from Gamma(10, 34). This would be efficient due to caching 
mechanisms in the inner class GammaRejectionSampler. If GammaRejectionSampler 
is made static, the two RandomDataImpl's would share the same 
GammaRejectionSampler, hence the caching would not be exploited, which is why 
it isn't static. I am not that used to using inner classes, so I might have 
misunderstood something - if so, please correct me.

Some of the constants are taken directly from the paper. I'll look into 
generating them with a larger number of digits.

Thanks for pointing out the exception-corrections.

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-18 Thread Luc Maisonobe (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051519#comment-13051519
 ] 

Luc Maisonobe commented on MATH-585:


Here are my comments:

Shouldn't the inner class be static as well ? It seems independent from its 
upper class.

The constants seem to have few digits, could they be precomputed with a larger 
number of digits ?

The nextGamma should advertise throwing NotStrictlyPositiveException.
The two "throw new MathException()" should really be replaced by something 
else. Either something related to nonconvergence or even a MathInternalError if 
the correspondng cases should never happeN

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch, MATH585-4-gamma.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-10 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047265#comment-13047265
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Darren, exactly. I'm also thinking about an inner class, but it's quite easy to 
change so I'll wait to raise that question until I have written some tests and 
get closer to committing. Yes, it indeed took some time - but as you might 
imagine, it was good fun :-).

Brent, thanks for the proposals. They actually improved the algorithm slightly 
- around 7% in the (too) small tests I made (5 samples on each with one choice 
of parameters). Tests from before the change was made:
{quote}
Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 502.0 ms. with mean 104.88707585478387 and var 471.92840503136534

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 483.0 ms. with mean 104.82077923605411 and var 470.1964562913125

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 542.0 ms. with mean 104.85678418999862 and var 471.6356319869918

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 471.0 ms. with mean 104.86904621863894 and var 472.21151252978103

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 491.0 ms. with mean 104.83920614619137 and var 473.1249434489607
{quote}
and afterwards
{quote}
Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 447.0 ms. with mean 104.87373894447317 and var 471.39010480694253

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 455.0 ms. with mean 104.85845231810445 and var 472.42913996929894

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 482.0 ms. with mean 104.8721506023297 and var 471.1563267884988

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 460.0 ms. with mean 104.90256013379438 and var 472.7495176399378

Generating 100 random gamma(23.3, 4.5) with mean 104.850001 and var 
471.825005
nextGamma took 461.0 ms. with mean 104.82129306938253 and var 470.8630399950795
{quote}

{code}
> x1 <- c(502, 483, 542, 471, 491)
> x2 <- c(447, 455, 482, 460, 461)
> mean(x1)-mean(x2)
[1] 36.8
> (mean(x2)-mean(x1))/mean(x1)
[1] -0.07392527
{code}

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-10 Thread Brent Worden (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047223#comment-13047223
 ] 

Brent Worden commented on MATH-585:
---

Here is a performance improvement suggestion to the implementation in the patch:

Since the loop in calculateQ is always a fixed number of iterations, eliminate 
it and compute the polynomial directly.  Also, instead of computing is as a1*v 
+ a2*v^2 + a3*v^3 + ... a9*v^9 use the equivalent form v*(a1 + v*(a2 + v*(a3 + 
... + v*(a8 + a9*v.  This method eliminates about half of the 
multiplications needed to compute the polynomial.

Apply the same technique to the loops found in Step 4 (lines 244-250 of 
MATH585-1.patch) and in Step 11 (lines 307-318).


> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-10 Thread Darren Wilkinson (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047176#comment-13047176
 ] 

Darren Wilkinson commented on MATH-585:
---

So, I am not very good at reading patches, and I am not set up right now to be 
able to apply it, but if I understand correctly, I think it looks fine. The 
idea is that there is just a single instance of the gamma class associated with 
each random number stream, and the reason for having it is to keep all of the 
constants and variables out of the name space of the random number stream. This 
makes sense. It probably could be done with an inner class, which might be 
slightly cleaner, but I do not feel strongly about it. Caching stuff is always 
a worry for thread safety, but is clearly important for performance. I'm 
guessing that the random number streams are not thread safe anyway, and that it 
is assumed that threads each have their own random number streams (preferably 
seeded differently!). Thanks again for making this a priority. You say it was 
quite easy, but I'm sure it was a lot of work!

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-10 Thread Darren Wilkinson (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047112#comment-13047112
 ] 

Darren Wilkinson commented on MATH-585:
---

That's great. Even if it makes little difference to the gamma generator, it is 
worth speeding up exponential generation - it is very important in many 
applications. In fact, in my own applications it is even more important than 
(general) gamma generation...

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-10 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047110#comment-13047110
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Algorithm SA (Ahrens) from p. 876 in the above is faster than the 
inversion-based exponential sampling. A couple of tests:

Generating 10 random Exp(1.0) with mean 1.0 and var 1.0
nextOldExponential took 120.0 ms. with mean 0.9962712206756855 and var 
0.9983243442732745
nextExponential took 20.0 ms. with mean 0.9958227425469129 and var 
0.9990279223847179

Generating 1000 random Exp(1.0) with mean 1.0 and var 1.0
nextOldExponential took 671.0 ms. with mean 1.0009038221005877 and var 
1.001436791894477
nextExponential took 548.0 ms. with mean 1.246277478757 and var 
0.9996828540541313

Generating 1000 random Exp(12.23) with mean 12.23 and var 149.5729
nextOldExponential took 671.0 ms. with mean 12.229128607001826 and var 
149.4295066207314
nextExponential took 543.0 ms. with mean 12.233285138487334 and var 
149.55112465876087

Generating 1000 random Exp(0.0013) with mean 0.0013 and var 1.69E-6
nextOldExponential took 669.0 ms. with mean 0.0012998098500627037 and var 
1.6880018139800875E-6
nextExponential took 544.0 ms. with mean 0.0013003604434494178 and var 
1.6893988569492528E-6

Generating 1000 random Exp(6.0) with mean 6.0 and var 36.0
nextOldExponential took 687.0 ms. with mean 5.58564653789 and var 
36.008520922357725
nextExponential took 606.0 ms. with mean 5.997436087619412 and var 
35.95429173708095

I'm not sure how much it will improve the new Gamma, because exponentials are 
rarely sampled (only in about 1-2% in average of the times, I guess).

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-09 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047015#comment-13047015
 ] 

Mikkel Meyer Andersen commented on MATH-585:


It is probably possible to gain a bit more by using (reference found in R's 
?rexp)
{quote}
Ahrens, J. H. and Dieter, U. (1972).  Computer methods for
sampling from the exponential and normal distributions.
_Communications of the ACM_, *15*, 873-882.
{quote}
instead of the currently simple inversion method to sample from the exponential 
distribution (step 8-9).

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>Assignee: Mikkel Meyer Andersen
>  Labels: Gamma, Random
> Attachments: MATH585-1.patch
>
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-09 Thread Darren Wilkinson (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046528#comment-13046528
 ] 

Darren Wilkinson commented on MATH-585:
---

That looks great. I will be happy to do a bit of testing when the new version 
makes it into the snapshot. Thanks!

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>  Labels: Gamma, Random
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-09 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046520#comment-13046520
 ] 

Mikkel Meyer Andersen commented on MATH-585:


To above is with shape = 13.3 and scale = 4.3

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>  Labels: Gamma, Random
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-09 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046519#comment-13046519
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Just for your info: I'm working on implementing the methods described in the 
papers, and they are not too difficult. I need to do some cleaning, and then 
I'll upload a patch within a few weeks. Current status of a test run:

Generating 100 random gamma with mean 57.19 and var 245.916997
nextGamma done
nextGamma took 23698.0 ms. with mean 57.17778720638469 and var 246.443482942482
nextNewGamma done
nextNewGamma took 230.0 ms. with mean 57.20940243394036 and var 
245.80250080092702

So the performance gain is around 100 which isn't too bad :-).

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>  Labels: Gamma, Random
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-07 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045326#comment-13045326
 ] 

Mikkel Meyer Andersen commented on MATH-585:


I know that it is indeed important, and I will look into it as soon as time 
allows me to. R is GPL, so using their implementation is a no-go. But the help 
file provides the methods they have implemented:
{quote}
‘rgamma’ for ‘shape >= 1’ uses

Ahrens, J. H. and Dieter, U. (1982).  Generating gamma variates by
a modified rejection technique.  _Communications of the ACM_,
*25*, 47-54,

and for ‘0 < shape < 1’ uses

Ahrens, J. H. and Dieter, U. (1974).  Computer methods for
sampling from gamma, beta, Poisson and binomial distributions.
_Computing_, *12*, 223-246.
{quote}

Again, thanks for reporting this - we'll do our best to improve our 
implementation.

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>  Labels: Gamma, Random
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-06 Thread Darren Wilkinson (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045058#comment-13045058
 ] 

Darren Wilkinson commented on MATH-585:
---

It is not urgent in the sense that I can use Parallel COLT! :-/ But it is quite 
a big deal for anyone wanting to use Commons Math and doing serious Monte Carlo 
work. Not only is the gamma distribution itself very important, but the gamma 
generator is used as the basis for simulating variates from many other 
distributions (including inverse gamma, chi-square, inverse chi-square, beta, 
Dirichlet, ...), so a slow gamma generator has lots of knock-on effects. It is 
certainly the next most important non-uniform generator to provide after 
Gaussian. It is not trivial to implement well, so I recommend copying or 
porting another implementation if possible, though I'm not an expert on open 
source software licenses so I don't know how practical that is. The GSL and R 
are GNU projects which have good C implementations, which I think are based on 
slightly better algorithms than the Ahrens and Dieter algorithm you mention 
above.

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>  Labels: Gamma, Random
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MATH-585) Very slow generation of gamma random variates

2011-06-06 Thread Mikkel Meyer Andersen (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044789#comment-13044789
 ] 

Mikkel Meyer Andersen commented on MATH-585:


Thank you very much for reporting this. I have looked in the documentation of 
cern.jet.random.tdouble.Gamma in the Parallel COLT, and they write:
{quote}
Method: Acceptance Rejection combined with Acceptance Complement.
High performance implementation. This is a port of RandGamma used in CLHEP 
1.4.0 (C++). CLHEP's implementation, in turn, is based on gds.c from the C-RAND 
/ WIN-RAND library. C-RAND's implementation, in turn, is based upon
J.H. Ahrens, U. Dieter (1974): Computer methods for sampling from gamma, beta, 
Poisson and binomial distributions, Computing 12, 223-246.

and

J.H. Ahrens, U. Dieter (1982): Generating gamma variates by a modified 
rejection technique, Communications of the ACM 25, 47-54.
{quote}

Looking in the ChangeLog file in the source for version 2.1.0.1 of CLHEP, I 
found that "CLHEP now uses the LGPL license.". As far as I can tell, LGPL is 
not compatible with Apache License which Commons Math is under. 

This means that we must implement the method ourselves, e.g. based on the 
papers referred to above.

How urgent is your need for this speed-up? Unfortunately I'm quite busy for the 
next months, but please submit a patch if possible.

> Very slow generation of gamma random variates
> -
>
> Key: MATH-585
> URL: https://issues.apache.org/jira/browse/MATH-585
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 2.2, 3.0
> Environment: All
>Reporter: Darren Wilkinson
>  Labels: Gamma, Random
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The current implementation of gamma random variate generation works, but uses 
> an inversion method. This is well-known to be a bad idea. Usually a carefully 
> constructed rejection procedure is used. To give an idea of the magnitude of 
> the problem, the Gamma variate generation in Parallel COLT is roughly 50 
> times faster than in Commons Math. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira