Re: Hypergeometric distribution

2006-01-05 Thread Robert Kern
Bengt Richter wrote: On 4 Jan 2006 12:46:47 -0800, Raven [EMAIL PROTECTED] wrote: The problem with Stirling's approximation is that I need to calculate the hypergeometric hence the factorial for numbers within a large range e.g. choose(14000,170) or choose(5,2) It seems you are hinting at

Re: Hypergeometric distribution

2006-01-05 Thread Bengt Richter
On Thu, 05 Jan 2006 09:47:02 -0600, Robert Kern [EMAIL PROTECTED] wrote: Bengt Richter wrote: On 4 Jan 2006 12:46:47 -0800, Raven [EMAIL PROTECTED] wrote: The problem with Stirling's approximation is that I need to calculate the hypergeometric hence the factorial for numbers within a large

Re: Hypergeometric distribution

2006-01-04 Thread Raven
? Hi Cameron, my real goal was to calculate the hypergeometric distribution. The problem was that the function for hypergeometric calculation from scipy uses the scipy.comb function which by default uses floats so for large numbers comb(n,r) returns inf. and hence the hypergeometric returns nan

Re: Hypergeometric distribution

2006-01-04 Thread Paul Rubin
Raven [EMAIL PROTECTED] writes: The problem with Stirling's approximation is that I need to calculate the hypergeometric hence the factorial for numbers within a large range e.g. choose(14000,170) or choose(5,2) Stirling's approximation to second order is fairly accurate even at low values:

Re: Hypergeometric distribution

2006-01-04 Thread Bengt Richter
, are you *sure* the suggestions already offered aren't adequate? Hi Cameron, my real goal was to calculate the hypergeometric distribution. The problem was that the function for hypergeometric ISTM that can't have been your real goal -- unless you are e.g. preparing numeric tables for publication

Re: Hypergeometric distribution

2006-01-03 Thread Raven
Travis E. Oliphant wrote: Notice the keyword for the comb function (in scipy) lets you use it to compute exact values. SciPy does not just automatically use the long integer because this will always slow you down. comb(N, k, exact=0) Combinations of N things taken k at a time. If

Re: Hypergeometric distribution

2006-01-03 Thread Cameron Laird
In article [EMAIL PROTECTED], Raven [EMAIL PROTECTED] wrote: Well, what to say? I am very happy for all the solutions you guys have posted :-) For Paul: I would prefer not to use Stirling's approximation The problem with long integers is that to calculate the hypergeometric I need to do float

Re: Hypergeometric distribution

2006-01-02 Thread Raven
Well, what to say? I am very happy for all the solutions you guys have posted :-) For Paul: I would prefer not to use Stirling's approximation The problem with long integers is that to calculate the hypergeometric I need to do float division and multiplication because integer division returns 0.

Re: Hypergeometric distribution

2006-01-02 Thread Scott David Daniels
Raven wrote: ... def main(): t = time.time() for i in range(1000): r = hypergeometric(6,6,30,6) print time.time() - t t = time.time() for i in range(1000): r = hypergeometric_gamma(6,6,30,6) print time.time() - t and the result is: 0.0386447906494

Re: Hypergeometric distribution

2006-01-02 Thread Bengt Richter
On 2 Jan 2006 03:35:33 -0800, Raven [EMAIL PROTECTED] wrote: [...] The problem with long integers is that to calculate the hypergeometric I need to do float division and multiplication because integer division returns 0. A solution could be to calculate log(Long_Factorial_Integer) ISTM you

Re: Hypergeometric distribution

2006-01-02 Thread Raven
Scott David Daniels ha scritto: You should really look into the timeit module -- you'll get nice solid timings slightly easier to tweak. This seems a very interesting module, I will give it a try as soon as possible. Thanks Scott. Ale -- http://mail.python.org/mailman/listinfo/python-list

Re: Hypergeometric distribution

2006-01-02 Thread Raven
Bengt Richter wrote: ISTM you wouldn't get zero if you scaled by 10**significant_digits (however many you require) before dividing. E.g., expected hits per trillion (or septillion or whatever) expresses probability too. Perhaps that could work in your calculation? Regards, Bengt

Re: Hypergeometric distribution

2006-01-02 Thread Raven
Bengt Richter wrote: ISTM you wouldn't get zero if you scaled by 10**significant_digits (however many you require) before dividing. E.g., expected hits per trillion (or septillion or whatever) expresses probability too. Perhaps that could work in your calculation? Regards, Bengt

Re: Hypergeometric distribution

2006-01-02 Thread Travis E. Oliphant
Raven wrote: Thanks Steven for your very interesting post. This was a critical instance from my problem: from scipy import comb comb(14354,174) inf The scipy.stats.distributions.hypergeom function uses the scipy.comb function, so it returned nan since it tries to divide an

Re: Hypergeometric distribution

2006-01-01 Thread Steven D'Aprano
On Sat, 31 Dec 2005 16:24:02 -0800, Raven wrote: Thanks to all of you guys, I could resolve my problem using the logarithms as proposed by Robert. I needed to calculate the factorial for genomic data, more specifically for the number of genes in the human genome i.e. about 30.000 and that is

Re: Hypergeometric distribution

2006-01-01 Thread Raven
Thanks Steven for your very interesting post. This was a critical instance from my problem: from scipy import comb comb(14354,174) inf The scipy.stats.distributions.hypergeom function uses the scipy.comb function, so it returned nan since it tries to divide an infinite. I did not tried to

Re: Hypergeometric distribution

2006-01-01 Thread Paul Rubin
Raven [EMAIL PROTECTED] writes: Yes I am calculating hundreds of hypergeometric probabilities so I need fast calculations Can you use Stirling's approximation to get the logs of the factorials? -- http://mail.python.org/mailman/listinfo/python-list

Re: Hypergeometric distribution

2006-01-01 Thread Steven D'Aprano
On Sun, 01 Jan 2006 14:24:39 -0800, Raven wrote: Thanks Steven for your very interesting post. This was a critical instance from my problem: from scipy import comb comb(14354,174) inf Curious. It wouldn't surprise me if scipy was using floats, because 'inf' is usually a floating point

Re: Hypergeometric distribution

2005-12-31 Thread Raven
Thanks to all of you guys, I could resolve my problem using the logarithms as proposed by Robert. I needed to calculate the factorial for genomic data, more specifically for the number of genes in the human genome i.e. about 30.000 and that is a big number :-) I didn't know gmpy Thanks a lot,

Hypergeometric distribution

2005-12-26 Thread Raven
to calculate the hypergeometric distribution? The statistical package R can handle such calculations but I don't want to use python R binding since I want a standalone app. Thanks a lot Ale -- http://mail.python.org/mailman/listinfo/python-list

Re: Hypergeometric distribution

2005-12-26 Thread Robert Kern
other libray or an algorithm to calculate the hypergeometric distribution? Use logarithms. Specifically, from scipy import special def logchoose(n, k): lgn1 = special.gammaln(n+1) lgk1 = special.gammaln(k+1) lgnk1 = special.gammaln(n-k+1) return lgn1 - (lgnk1 + lgk1) def

Re: Hypergeometric distribution

2005-12-26 Thread Gerard Flanagan
other libray or an algorithm to calculate the hypergeometric distribution? The statistical package R can handle such calculations but I don't want to use python R binding since I want a standalone app. Thanks a lot Ale Ale I had this code lying about if it helps. I don't know if it's even

Re: Hypergeometric distribution

2005-12-26 Thread Steven D'Aprano
On Mon, 26 Dec 2005 12:18:55 -0800, Raven wrote: Hi to all, I need to calculate the hpergeometric distribution: choose(r, x) * choose(b, n-x) p(x; r,b,n) = - choose(r+b, n) choose(r,x) is the

Re: Hypergeometric distribution

2005-12-26 Thread Alex Martelli
subsist. Is there any other libray or an algorithm to calculate the hypergeometric distribution? The statistical package R can handle such calculations but I don't want to use python R binding since I want a standalone app. Have you tried with gmpy? Alex -- http://mail.python.org/mailman/listinfo