[Numpy-discussion] Random int64 and float64 numbers
Hi,

I'm trying to generate random 64-bit integer and floating-point values with Numpy, covering the entire range of valid values for each type. To generate random 32-bit floats, I can use:

np.random.uniform(low=np.finfo(np.float32).min, high=np.finfo(np.float32).max, size=10)

which gives for example

array([ 1.47351436e+37,  9.93620693e+37,  2.22893053e+38,
       -3.33828977e+38,  1.08247781e+37, -8.37481260e+37,
        2.64176554e+38, -2.72207226e+37,  2.54790459e+38,
       -2.47883866e+38])

but if I try to use this for 64-bit numbers, i.e.

np.random.uniform(low=np.finfo(np.float64).min, high=np.finfo(np.float64).max, size=10)

I get

array([ Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf])

Similarly, for integers, I can successfully generate random 32-bit integers:

np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)

which gives

array([-1506183689,   662982379, -1616890435, -1519456789,  1489753527,
        -604311122,  2034533014,   449680073,  -444302414, -1924170329])

but am unsuccessful for 64-bit integers, i.e.

np.random.random_integers(np.iinfo(np.int64).min, high=np.iinfo(np.int64).max, size=10)

which produces the following error:

OverflowError: long int too large to convert to int

Is this expected behavior, or are these bugs?

Thanks for any help,

Thomas

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Random int64 and float64 numbers
Thomas Robitaille skrev:
> np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)
>
> which gives
>
> array([-1506183689,   662982379, -1616890435, -1519456789,  1489753527,
>         -604311122,  2034533014,   449680073,  -444302414, -1924170329])

This fails on my computer (Python 2.6.4, NumPy 1.3.0 on Win32):

>>> np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 950, in mtrand.RandomState.random_integers
  File "mtrand.pyx", line 746, in mtrand.RandomState.randint
OverflowError: long int too large to convert to int

It might have something to do with this:

>>> 2**31-1
2147483647L
>>> -2**31
-2147483648L

In light of this annoying behaviour, a workaround that builds int64 values from four 16-bit draws:

def random_int64(size):
    a0 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
    a1 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
    a2 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
    a3 = np.random.random_integers(0, 0xFFFF, size=size).astype(np.uint64)
    a = a0 + (a1 << 16) + (a2 << 32) + (a3 << 48)
    return a.view(dtype=np.int64)

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
Thomas Robitaille wrote:
> Hi,
>
> I'm trying to generate random 64-bit integer values for integers and
> floats using Numpy, within the entire range of valid values for that
> type. To generate random 32-bit floats, I can use:
>
> np.random.uniform(low=np.finfo(np.float32).min, high=np.finfo(np.float32).max, size=10)
>
> which gives for example
>
> array([ 1.47351436e+37,  9.93620693e+37,  2.22893053e+38,
>        -3.33828977e+38,  1.08247781e+37, -8.37481260e+37,
>         2.64176554e+38, -2.72207226e+37,  2.54790459e+38,
>        -2.47883866e+38])
>
> but if I try and use this for 64-bit numbers, i.e.
>
> np.random.uniform(low=np.finfo(np.float64).min, high=np.finfo(np.float64).max, size=10)
>
> I get
>
> array([ Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf])
>
> Similarly, for integers, I can successfully generate random 32-bit integers:
>
> np.random.random_integers(np.iinfo(np.int32).min, high=np.iinfo(np.int32).max, size=10)
>
> which gives
>
> array([-1506183689,   662982379, -1616890435, -1519456789,  1489753527,
>         -604311122,  2034533014,   449680073,  -444302414, -1924170329])
>
> but am unsuccessful for 64-bit integers, i.e.
>
> np.random.random_integers(np.iinfo(np.int64).min, high=np.iinfo(np.int64).max, size=10)
>
> which produces the following error:
>
> OverflowError: long int too large to convert to int
>
> Is this expected behavior, or are these bugs?

I think those are bugs, but they may be difficult to fix.

You can check that if you restrict your interval a tiny bit, you get better results:

import numpy as np
# max/min for double precision is ~ 1.8e308
low, high = -1e308, 1e308
np.random.uniform(low, high, 100)  # bunch of inf
low, high = -1e307, 1e307
np.random.uniform(low, high, 100)  # much more reasonable

It may be that you are pushing the limits of the random generator. Your min and max may be border cases: if you use the min/max representable numbers, and the random generator needs to add any positive number, you will 'overflow' your float (Robert will have a better answer to this). The problem is that it may be difficult to detect this in advance.

David
Re: [Numpy-discussion] Random int64 and float64 numbers
On Sun, Nov 1, 2009 at 10:20 PM, David Cournapeau wrote:
> Thomas Robitaille wrote:
>> [snip]
>>
>> but if I try and use this for 64-bit numbers, i.e.
>>
>> np.random.uniform(low=np.finfo(np.float64).min, high=np.finfo(np.float64).max, size=10)
>>
>> I get
>>
>> array([ Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf])
>>
>> [snip]
>
> I think those are bugs, but they may be difficult to fix.
>
> You can check that if you restrict your interval a tiny bit, you get
> better results:
>
> import numpy as np
> # max/min for double precision is ~ 1.8e308
> low, high = -1e308, 1e308
> np.random.uniform(low, high, 100)  # bunch of inf
> low, high = -1e307, 1e307
> np.random.uniform(low, high, 100)  # much more reasonable
>
> It may be that you are pushing the limits of the random generator. Your
> min and max may be border cases: if you use the min/max representable
> numbers, and the random generator needs to add any positive number, you
> will 'overflow' your float (Robert will have a better answer to this).
> The problem is that it may be difficult to detect this in advance.
>
> David

array([ Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf])

might actually be the right answer if you want a uniform distribution on the real line. I never realized how many numbers are out there until I saw that most numbers in the example are e+37 or e+38.

Josef
Re: [Numpy-discussion] Random int64 and float64 numbers
On Sun, Nov 1, 2009 at 20:57, Thomas Robitaille wrote:
> Hi,
>
> I'm trying to generate random 64-bit integer values for integers and
> floats using Numpy, within the entire range of valid values for that
> type.

64-bit and larger integers could be done, but it requires modification. The integer distributions were written to support C longs, not anything larger. You could also use .bytes() and np.fromstring().

> To generate random 32-bit floats, I can use:
> np.random.uniform(low=np.finfo(np.float32).min, high=np.finfo(np.float32).max, size=10)

What is the use case here? I know of none. Floating point is a bit weird and will cause you many problems over such an extended range.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
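Robert's .bytes() suggestion can be sketched as follows. This is a hypothetical helper, not code from the thread; np.frombuffer is used here as the modern spelling of the np.fromstring call he mentions:

```python
import numpy as np

def random_int64_from_bytes(size):
    # Draw 8 raw bytes per value and reinterpret them as signed 64-bit
    # integers. Every 64-bit pattern is a valid int64, so this covers
    # the full range uniformly, with no C-long bottleneck in between.
    raw = np.random.bytes(8 * size)
    return np.frombuffer(raw, dtype=np.int64)

vals = random_int64_from_bytes(1000)
```

This sidesteps the randint code path entirely, which is why it works even where random_integers overflows.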
Re: [Numpy-discussion] Random int64 and float64 numbers
josef.p...@gmail.com wrote:
>
> array([ Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf, Inf])
>
> might actually be the right answer if you want a uniform distribution
> on the real line.

Does it make sense to define a uniform random variable whose range is the extended real line? It would not have a density w.r.t. the Lebesgue measure, right?

I must confess I get lost quickly in the mathematics of statistics, though.

David
Re: [Numpy-discussion] Random int64 and float64 numbers
Robert Kern skrev:
> 64-bit and larger integers could be done, but it requires
> modification. The integer distributions were written to support C
> longs, not anything larger. You could also use .bytes() and
> np.fromstring().

But as of Python 2.6.4, even 32-bit integers fail, at least on Windows.

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
On Sun, Nov 1, 2009 at 22:17, Sturla Molden wrote:
> But as of Python 2.6.4, even 32-bit integers fail, at least on Windows.

Then let me clarify: it was written to support integer ranges up to sys.maxint. Absolutely, it would be desirable to extend it.

--
Robert Kern
Re: [Numpy-discussion] Random int64 and float64 numbers
Sturla Molden wrote:
> But as of Python 2.6.4, even 32-bit integers fail, at least on Windows.

It fails on Linux as well - I think it is a 32- vs 64-bit issue, not a Windows vs Linux one. I don't know what happens on 64-bit Windows, though: we may have issues there if we use long.

cheers,

David
Re: [Numpy-discussion] Random int64 and float64 numbers
Robert Kern skrev:
> Then let me clarify: it was written to support integer ranges up to
> sys.maxint. Absolutely, it would be desirable to extend it.

I know, but look at this:

>>> import sys
>>> sys.maxint
2147483647
>>> 2**31-1
2147483647L

The same value, computed as 2**31-1, becomes a long, which is what confuses mtrand.

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
On Sun, Nov 1, 2009 at 10:55 PM, David Cournapeau wrote:
> Does it make sense to define a uniform random variable whose range is
> the extended real line? It would not have a density w.r.t. the
> Lebesgue measure, right?

No, it wouldn't be a proper distribution. However, in Bayesian analysis it is used as an improper (diffuse) prior, which often replicates frequentist results. But it's a theoretical device; I don't think anyone tries to simulate from it.

To simulate huge uniform integers, I think it should be possible to use the floating point random numbers and rescale and round them.

Josef
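Josef's rescale-and-round idea might look like the sketch below. The helper name is hypothetical, and it carries exactly the caveat David raises in his reply: a double has only 53 significand bits, so most of the 2**64 target integers can never be produced, and the output sits on a coarse grid rather than being truly uniform over all int64 values.

```python
import numpy as np

def coarse_random_int64(size):
    # Hypothetical sketch: scale uniform doubles in [0, 1) up to the
    # signed 64-bit range. A double carries only 53 significand bits,
    # so the result lands on a coarse grid: most int64 values are
    # simply unreachable this way.
    u = np.random.random_sample(size)        # uniform in [0, 1)
    scaled = np.floor(u * 2.0**64) - 2.0**63
    return scaled   # kept as float64: every value is exactly an integer

v = coarse_random_int64(100)
```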
Re: [Numpy-discussion] Random int64 and float64 numbers
Robert Kern skrev:
> Then let me clarify: it was written to support integer ranges up to
> sys.maxint. Absolutely, it would be desirable to extend it.

Actually it only supports integers up to sys.maxint - 1, as random_integers calls randint. random_integers includes the upper bound, but randint excludes it. Thus, this happens on line 1153 in mtrand.pyx:

return self.randint(low, high+1, size)

The main source of the problem is that a number no larger than sys.maxint can still become a Python long. (I have asked why on python-dev; it does not make any sense.) So when random_integers passes "high+1" to randint, it is unnecessarily converted to a long. Then there is an exception on line 847:

hi = high

With hi declared as a C long, Cython refuses the conversion. Now, we could try a downcast to int like this:

hi = int(high)

which would make Cython raise an exception only in case of an actual integer overflow:

>>> int(2**31)
2147483648L
>>> int(2**31-1)
2147483647

If there is no overflow, high becomes an int and conversion to a C long is allowed. Still, this will only support integer ranges up to sys.maxint - 1. We would thus have to swap the roles of randint and random_integers: the one with the inclusive upper bound should call rk_interval.

Sturla
Re: [Numpy-discussion] Random int64 and float64 numbers
Sturla Molden skrev:
> Actually it only supports integers up to sys.maxint - 1, as
> random_integers calls randint. random_integers includes the upper
> bound, but randint excludes it. Thus, this happens on line 1153 in
> mtrand.pyx:
>
> return self.randint(low, high+1, size)
>
> the one with the inclusive upper bound should call rk_interval

I love this one:

cdef long lo, hi, diff
[...]
diff = hi - lo - 1

which silently overflows, and is the reason for this strange exception:

>>> np.random.random_integers(-2147483648, high=2147483646, size=10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 950, in mtrand.RandomState.random_integers
  File "mtrand.pyx", line 750, in mtrand.RandomState.randint
ValueError: low >= high

I'll call this a bug.

Sturla
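The wraparound Sturla points to can be reproduced in pure NumPy by doing the subtraction in a fixed-width type, with int32 arrays standing in for a 32-bit C long:

```python
import numpy as np

# hi - lo is mathematically 4294967294, which needs 33 bits, so in
# 32-bit arithmetic it wraps around to a negative value. randint then
# sees diff < 0 and reports the misleading "low >= high".
lo = np.array([-2147483648], dtype=np.int32)
hi = np.array([2147483646], dtype=np.int32)
diff = hi - lo - 1   # wraps silently for integer arrays
```

Integer overflow in NumPy array arithmetic is silent, which is exactly why the bug surfaces only as the confusing ValueError downstream.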
Re: [Numpy-discussion] Random int64 and float64 numbers
On Sun, Nov 1, 2009 at 23:14, Sturla Molden wrote:
> I'll call this a bug.

Yes.

--
Robert Kern
Re: [Numpy-discussion] Random int64 and float64 numbers
josef.p...@gmail.com wrote:
>
> No, it wouldn't be a proper distribution. However, in Bayesian analysis
> it is used as an improper (diffuse) prior

Ah, right - I wonder how this is handled rigorously, though. I know some basics of Bayesian statistics, but I don't know much about Bayesian statistics from a theoretical POV (i.e. a rigorous mathematical development).

> To simulate huge uniform integers, I think it should be possible to use
> the floating point random numbers and rescale and round them.

Rescaling and especially rounding may bias the distribution, no? The best (but long-term) strategy would be to support arbitrary-precision integers, as mentioned by Robert.

David
Re: [Numpy-discussion] Random int64 and float64 numbers
Sturla Molden wrote:
> I love this one:
>
> cdef long lo, hi, diff
> [...]
> diff = hi - lo - 1
>
> which silently overflows [...]
>
> I'll call this a bug.

Yep, I was bitten by it as well:

http://projects.scipy.org/numpy/ticket/965

David
Re: [Numpy-discussion] Random int64 and float64 numbers
2009/11/1 Thomas Robitaille:
> Hi,
>
> I'm trying to generate random 64-bit integer values for integers and
> floats using Numpy, within the entire range of valid values for that
> type. To generate random 32-bit floats, I can use:

Others have addressed why this is giving bogus results. But it's worth thinking about what you're actually trying to do. If it's "fuzz tests", that is, producing unrestricted random floats to feed to a function, then even if this worked it wouldn't produce what you want: it will never produce floats of order unity, for example.

If you want to do what you described, you could produce floats uniformly from -1 to 1 and then multiply the results by the largest representable float.

If you want "random floats", I suggest generating an array of random bytes and reinterpreting them as floats. You'll get a fair number of NaNs and infinities, so you may want to take only the finite values and regenerate the rest. This will give you numbers from all over the range of floats, including tiny values (and denormalized values, which are a potentially important special case).

Anne
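Anne's bytes-reinterpretation suggestion might be sketched like this. The helper name is hypothetical, and np.frombuffer stands in for the np.fromstring of the era:

```python
import numpy as np

def random_float64(size):
    # Reinterpret raw random bytes as doubles, then redraw any slots
    # that came out NaN or infinite (the all-ones exponent patterns,
    # roughly 1 in 2048 draws).
    out = np.frombuffer(np.random.bytes(8 * size), dtype=np.float64).copy()
    bad = ~np.isfinite(out)
    while bad.any():
        redraw = np.frombuffer(np.random.bytes(8 * int(bad.sum())),
                               dtype=np.float64)
        out[bad] = redraw
        bad = ~np.isfinite(out)
    return out

v = random_float64(1000)
```

Because every bit pattern is equally likely, small magnitudes (including denormals) show up as often as huge ones, which is what makes this useful for fuzzing.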
Re: [Numpy-discussion] Random int64 and float64 numbers
Interesting thread, which leaves me wondering two things: is it documented somewhere (e.g., at the IEEE site) precisely how many *decimal* mantissae are representable using the 64-bit IEEE standard for float representation (if that makes sense); and are such decimal mantissae uniformly distributed, i.e., is the distance between each the same, or does the incommensurability of two and five (and thus of binary and decimal) make them irregularly, or bi-periodically, spaced, i.e., every other mantissa regularly spaced?

DG

On Mon, Nov 2, 2009 at 7:09 AM, Anne Archibald wrote:
> Others have addressed why this is giving bogus results. But it's worth
> thinking about what you're actually trying to do. [...]
>
> If you want "random floats", I suggest generating an array of random
> bytes and reinterpreting them as floats. [...]
Re: [Numpy-discussion] Random int64 and float64 numbers
On 5-Nov-09, at 4:54 PM, David Goldsmith wrote:

> Interesting thread, which leaves me wondering two things: is it
> documented somewhere (e.g., at the IEEE site) precisely how many
> *decimal* mantissae are representable using the 64-bit IEEE standard
> for float representation (if that makes sense);

IEEE-754 says nothing about decimal representations aside from how to round when converting to and from strings. You have to provide/accept *at least* 9 decimal digits in the significand for single precision and 17 for double precision (section 5.6). AFAIK implementations will vary in how they handle cases where a binary significand would yield more digits than that.

David
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 4:26 PM, David Warde-Farley wrote:
> IEEE-754 says nothing about decimal representations aside from how to
> round when converting to and from strings. You have to provide/accept
> *at least* 9 decimal digits in the significand for single precision
> and 17 for double precision (section 5.6). AFAIK implementations will
> vary in how they handle cases where a binary significand would yield
> more digits than that.

I believe that was the argument for the extended precision formats. The given number of decimal digits is sufficient to recover the same float that produced them if a slightly higher precision is used in the conversion.

Chuck
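The round-trip property Chuck describes can be checked directly: 17 significant decimal digits recover a double exactly, while 15 generally do not.

```python
import numpy as np

x = np.float64(0.1) + np.float64(0.2)   # not exactly 0.3 in binary
assert np.float64("%.17g" % x) == x     # 17 digits: exact round-trip
assert np.float64("%.15g" % x) != x     # 15 digits: information lost
```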
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 6:36 PM, Charles R Harris wrote:
> I believe that was the argument for the extended precision formats. The
> given number of decimal digits is sufficient to recover the same float
> that produced them if a slightly higher precision is used in the
> conversion.

From the discussion of the floating point representation, it seems that a uniform random number generator would have a very coarse grid in a range like -1e30 to +1e30 compared to the interval [-0.5, 0.5].

How many points can be represented by a float in [-0.5, 0.5] compared to [1e30, 1e30+1]? If I interpret this correctly, there are as many floating point numbers in [0, 1] as in [1, inf), or am I misinterpreting this?

So how does a PRNG handle a huge interval of uniform numbers?

Josef
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 7:04 PM, josef.p...@gmail.com wrote:
> From the discussion of the floating point representation, it seems that
> a uniform random number generator would have a very coarse grid in a
> range like -1e30 to +1e30 compared to the interval [-0.5, 0.5].
>
> So how does a PRNG handle a huge interval of uniform numbers?

There are several implementations, but the ones I'm familiar with reduce to scaling. If the rng produces random unsigned integers, then the range of integers is scaled to the interval [0, 1). The variations involve explicit scaling (portable) or bit twiddling of the IEEE formats. In straightforward scaling some ranges of the random integers may not map 1-1, so the unused bits are masked off first; if you want doubles you only need 52 bits, etc. For bit twiddling there is an implicit 1 in the mantissa, so the basic range works out to [1, 2), but that can be fixed by subtracting 1 from the result. Handling larger ranges than [0, 1) just involves another scaling.

Chuck
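Both variants Chuck describes can be sketched in NumPy. These are hypothetical helpers, with np.frombuffer supplying the raw 64-bit draws:

```python
import numpy as np

def uniform01_scaled(size):
    # Explicit scaling: keep 53 random bits (a double's full precision,
    # counting the implicit leading bit) and scale them into [0, 1).
    bits = np.frombuffer(np.random.bytes(8 * size), dtype=np.uint64)
    return (bits >> np.uint64(11)) * 2.0**-53

def uniform01_twiddled(size):
    # Bit twiddling: 52 random mantissa bits with exponent 0x3FF form a
    # double in [1, 2); subtracting 1 shifts the range to [0, 1).
    bits = np.frombuffer(np.random.bytes(8 * size), dtype=np.uint64)
    pattern = (bits >> np.uint64(12)) | np.uint64(0x3FF0000000000000)
    return pattern.view(np.float64) - 1.0

# A larger range is just another scaling, e.g. low + (high - low) * u.
u1 = uniform01_scaled(1000)
u2 = uniform01_twiddled(1000)
```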
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 9:23 PM, Charles R Harris wrote: > > > On Thu, Nov 5, 2009 at 7:04 PM, wrote: >> >> On Thu, Nov 5, 2009 at 6:36 PM, Charles R Harris >> wrote: >> > >> > >> > On Thu, Nov 5, 2009 at 4:26 PM, David Warde-Farley >> > wrote: >> >> >> >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: >> >> >> >> > Interesting thread, which leaves me wondering two things: is it >> >> > documented >> >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* >> >> > mantissae >> >> > are representable using the 64-bit IEEE standard for float >> >> > representation >> >> > (if that makes sense); >> >> >> >> IEEE-754 says nothing about decimal representations aside from how to >> >> round when converting to and from strings. You have to provide/accept >> >> *at least* 9 decimal digits in the significand for single-precision >> >> and 17 for double-precision (section 5.6). AFAIK implementations will >> >> vary in how they handle cases where a binary significand would yield >> >> more digits than that. >> >> >> > >> > I believe that was the argument for the extended precision formats. The >> > givien number of decimal digits is sufficient to recover the same float >> > that >> > produced them if a slightly higher precision is used in the conversion. >> > >> > Chuck >> >> >From the discussion for the floating point representation, it seems that >> a uniform random number generator would have a very coarse grid >> in the range for example -1e30 to +1e30 compared to interval -0.5,0.5. >> >> How many points can be represented by a float in [-0.5,0.5] compared >> to [1e30, 1e30+1.]? >> If I interpret this correctly, then there are as many floating point >> numbers >> in [0,1] as in [1,inf), or am I misinterpreting this. >> >> So how does a PRNG handle a huge interval of uniform numbers? >> > > There are several implementations, but the ones I'm familiar with reduce to > scaling. 
If the rng produces random unsigned integers, then the range of > integers is scaled to the interval [0,1). The variations involve explicit > scaling (portable) or bit twiddling of the IEEE formats. In straightforward > scaling some ranges of the random integers may not map 1-1, so the unused > bits are masked off first; if you want doubles you only need 52 bits, etc. > For bit twiddling there is an implicit 1 in the mantissa, so the basic range > works out to [1,2), but that can be fixed by subtracting 1 from the result. > Handling larger ranges than [0,1) just involves another scaling. So, since this is then a discrete distribution, what is the number of points in the support? (My guess would be 2**52, but I don't know much about numerical representations.) This would be the largest set of integers that could be generated without gaps in the distribution, and would determine the grid size for floating point random variables. (?) for David's example: low, high = -1e307, 1e307 np.random.uniform(low, high, 100) # much more reasonable this would imply a grid size of >>> 2*1e307/2.**52 4.4408920985006261e+291 or something similar. (floating points are not very dense in the real line.) Josef > > Chuck
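josef's grid-size estimate can be checked directly, and compared with the float format's own local resolution via `np.spacing` (a sketch, assuming the generator effectively draws one of 2**52 equally spaced points):

```python
import numpy as np

low, high = -1e307, 1e307

# If uniform draws land on 2**52 equally spaced points, neighbouring
# output values differ by roughly:
grid = (high - low) / 2.0**52
print(grid)  # ~4.44e+291, matching the figure quoted above

# The float format itself is far finer than that near zero and only
# becomes comparably coarse at the top of the range:
print(np.spacing(1.0))    # ~2.2e-16
print(np.spacing(1e307))  # ~1.2e+291
```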
Re: [Numpy-discussion] Random int64 and float64 numbers
On Fri, Nov 6, 2009 at 6:54 AM, David Goldsmith wrote: > Interesting thread, which leaves me wondering two things: is it documented > somewhere (e.g., at the IEEE site) precisely how many *decimal* mantissae > are representable using the 64-bit IEEE standard for float representation > (if that makes sense); and are such decimal mantissae uniformly distributed They are definitely not uniformly distributed: that's why two numbers are close around 1 when they have only a few EPS difference, but around 1e100, you have to add quite a few EPS to even get a different number at all. That may be my audio processing background, but I like to think about float as numbers which have the same relative precision at any "level" - a kind of dB scale. If you want numbers with a fixed number of decimals, you need a fixed point representation. David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
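David's "same relative precision at any level" picture is easy to demonstrate with `np.spacing`, which returns the distance from a value to the next representable float:

```python
import numpy as np

# The absolute spacing between adjacent doubles grows with magnitude:
print(np.spacing(1.0))     # ~2.2e-16 (machine epsilon)
print(np.spacing(1e100))   # ~1.9e+84

# ...but the relative spacing stays within a factor of two of machine
# epsilon at every normal magnitude, the "dB scale" view of floats:
eps = np.finfo(np.float64).eps
for x in (1.0, 1e-100, 1e100):
    assert eps / 2 < np.spacing(x) / x <= eps
```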
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 7:53 PM, wrote: > On Thu, Nov 5, 2009 at 9:23 PM, Charles R Harris > wrote: > > > > > > On Thu, Nov 5, 2009 at 7:04 PM, wrote: > >> > >> On Thu, Nov 5, 2009 at 6:36 PM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Thu, Nov 5, 2009 at 4:26 PM, David Warde-Farley < > d...@cs.toronto.edu> > >> > wrote: > >> >> > >> >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: > >> >> > >> >> > Interesting thread, which leaves me wondering two things: is it > >> >> > documented > >> >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* > >> >> > mantissae > >> >> > are representable using the 64-bit IEEE standard for float > >> >> > representation > >> >> > (if that makes sense); > >> >> > >> >> IEEE-754 says nothing about decimal representations aside from how to > >> >> round when converting to and from strings. You have to provide/accept > >> >> *at least* 9 decimal digits in the significand for single-precision > >> >> and 17 for double-precision (section 5.6). AFAIK implementations will > >> >> vary in how they handle cases where a binary significand would yield > >> >> more digits than that. > >> >> > >> > > >> > I believe that was the argument for the extended precision formats. > The > >> > givien number of decimal digits is sufficient to recover the same > float > >> > that > >> > produced them if a slightly higher precision is used in the > conversion. > >> > > >> > Chuck > >> > >> >From the discussion for the floating point representation, it seems > that > >> a uniform random number generator would have a very coarse grid > >> in the range for example -1e30 to +1e30 compared to interval -0.5,0.5. > >> > >> How many points can be represented by a float in [-0.5,0.5] compared > >> to [1e30, 1e30+1.]? > >> If I interpret this correctly, then there are as many floating point > >> numbers > >> in [0,1] as in [1,inf), or am I misinterpreting this. > >> > >> So how does a PRNG handle a huge interval of uniform numbers? 
> >> > > > > There are several implementations, but the ones I'm familiar with reduce > to > > scaling. If the rng produces random unsigned integers, then the range of > > integers is scaled to the interval [0,1). The variations involve explicit > > scaling (portable) or bit twiddling of the IEEE formats. In straight > forward > > scaling some ranges of the random integers may not map 1-1, so the unused > > bits are masked off first; if you want doubles you only need 52 bits, > etc. > > For bit twiddling there is an implicit 1 in the mantissa, so the basic > range > > works out to [1,2), but that can be fixed by subtracting 1 from the > result. > > Handling larger ranges than [0,1) just involves another scaling. > > So, since this is then a discrete distribution, what is the number of > points > in the support? (My guess would be 2**52, but I don't know much about > numerical representations.) > > This would be the largest set of integers that could be generated > without gaps in the distribution, and would determine the grid size > for floating point random variables. (?) > > Yes. > for Davids example: > low, high = -1e307, 1e307 > np.random.uniform(low, high, 100) # much more reasonable > > this would imply a grid size of > >>> 2*1e307/2.**52 > 4.4408920985006261e+291 > > or something similar. (floating points are not very dense in the real > line.) > > Yes, or rather, they are more or less logarithmically distributed. Floats are basically logarithms base 2 with a mantissa of fixed precision. Actual logarithm code uses the exponent and does a correction to the mantissa, i.e., takes the logarithm base two of numbers in the range [1,2). Random integers are treated differently. To get random integers in a given range, say [0,100], the bitstream produced by the rng would be broken up into 7 bit chunks [0, 127] that are used in a sampling algorithm that loops until a value in the range is produced. So any range of integers can be produced. 
Code for that comes with the Mersenne Twister, but I don't know if numpy uses it. Chuck
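The masked rejection loop Chuck describes can be sketched as follows (illustrative; numpy's actual randint implementation may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def randint_rejection(low, high):
    # Draw uniformly from [low, high] by masking the bit stream down to
    # the smallest power of two covering the range, retrying until the
    # masked value falls inside it.  For [0, 100] that means 7-bit
    # chunks in [0, 127], rejected about 21% of the time.
    span = high - low + 1
    mask = (1 << span.bit_length()) - 1
    while True:
        r = int(rng.integers(0, 1 << 32)) & mask
        if r < span:
            return low + r

samples = [randint_rejection(0, 100) for _ in range(2000)]
```

Because only the masked low bits are kept and out-of-range values are rejected rather than folded back, every value in the range is equally likely, for any range.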
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley wrote: > On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: > > > Interesting thread, which leaves me wondering two things: is it > > documented > > somewhere (e.g., at the IEEE site) precisely how many *decimal* > > mantissae > > are representable using the 64-bit IEEE standard for float > > representation > > (if that makes sense); > > IEEE-754 says nothing about decimal representations aside from how to > round when converting to and from strings. You have to provide/accept > *at least* 9 decimal digits in the significand for single-precision > and 17 for double-precision (section 5.6). AFAIK implementations will > vary in how they handle cases where a binary significand would yield > more digits than that. > I was actually more interested in the opposite situation, where the decimal representation (which is what a user would most likely provide) doesn't have a finite binary expansion: what happens then, something analogous to the decimal "rule of fives"? DG ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Random int64 and float64 numbers
2009/11/5 David Goldsmith : > On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley > wrote: >> >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: >> >> > Interesting thread, which leaves me wondering two things: is it >> > documented >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* >> > mantissae >> > are representable using the 64-bit IEEE standard for float >> > representation >> > (if that makes sense); >> >> IEEE-754 says nothing about decimal representations aside from how to >> round when converting to and from strings. You have to provide/accept >> *at least* 9 decimal digits in the significand for single-precision >> and 17 for double-precision (section 5.6). AFAIK implementations will >> vary in how they handle cases where a binary significand would yield >> more digits than that. > > I was actually more interested in the opposite situation, where the decimal > representation (which is what a user would most likely provide) doesn't have > a finite binary expansion: what happens then, something analogous to the > decimal "rule of fives"? If you interpret "0.1" as 1/10, then this is a very general floating-point issue: how do you round off numbers you can't represent exactly? The usual answer (leaving aside internal extended-precision shenanigans) is to round, with the rule that when you're exactly between two floating-point numbers you round to the one that's even, rather than always up or always down (the numerical analysis wizards tell us that this is more numerically stable). Anne > DG
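The rounding Anne describes is visible from Python (a sketch; `Fraction` recovers a float's exact binary value):

```python
from fractions import Fraction

# "0.1" has no finite binary expansion, so the parser rounds it to the
# nearest double, a nearby dyadic rational rather than 1/10:
exact = Fraction(0.1)
assert exact == Fraction(3602879701896397, 2**55)
assert exact != Fraction(1, 10)
# The rounding error is below one float spacing (2**-56) at 0.1:
assert abs(exact - Fraction(1, 10)) < Fraction(1, 2**56)

# Exact ties go to the even neighbour ("banker's rounding"), the same
# rule Python's round() uses for halves:
assert round(0.5) == 0 and round(1.5) == 2 and round(2.5) == 2
```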
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 10:42 PM, David Goldsmith wrote: > On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley > wrote: >> >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: >> >> > Interesting thread, which leaves me wondering two things: is it >> > documented >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* >> > mantissae >> > are representable using the 64-bit IEEE standard for float >> > representation >> > (if that makes sense); >> >> IEEE-754 says nothing about decimal representations aside from how to >> round when converting to and from strings. You have to provide/accept >> *at least* 9 decimal digits in the significand for single-precision >> and 17 for double-precision (section 5.6). AFAIK implementations will >> vary in how they handle cases where a binary significand would yield >> more digits than that. > > I was actually more interested in the opposite situation, where the decimal > representation (which is what a user would most likely provide) doesn't have > a finite binary expansion: what happens then, something analogous to the > decimal "rule of fives"? Since according to my calculations there are only about >>> 4* 10**17 * 308 123200000000000000000L double-precision floats, there are huge gaps in the floating point representation of the real line. Any user input or calculation result just gets converted to the closest float. Josef > > DG
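josef's back-of-the-envelope count can be pinned down exactly from the bit layout (a sketch; "patterns" here counts +0 and -0 separately):

```python
# A double is sign(1) | exponent(11) | fraction(52).  The exponent
# field 0x7FF is reserved for +-inf and NaNs, so of the 2**64 bit
# patterns, 2 * 2**52 = 2**53 do not encode finite values:
finite_patterns = 2**64 - 2**53
assert finite_patterns == 2 * 2047 * 2**52

# The estimate above (roughly 4e17 values per decade times ~308
# decades) lands within an order of magnitude of the exact count:
estimate = 4 * 10**17 * 308
assert estimate == 123_200_000_000_000_000_000
assert 1 < estimate / finite_patterns < 10
```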
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 10:28 PM, Charles R Harris wrote: > > > On Thu, Nov 5, 2009 at 7:53 PM, wrote: >> >> On Thu, Nov 5, 2009 at 9:23 PM, Charles R Harris >> wrote: >> > >> > >> > On Thu, Nov 5, 2009 at 7:04 PM, wrote: >> >> >> >> On Thu, Nov 5, 2009 at 6:36 PM, Charles R Harris >> >> wrote: >> >> > >> >> > >> >> > On Thu, Nov 5, 2009 at 4:26 PM, David Warde-Farley >> >> > >> >> > wrote: >> >> >> >> >> >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: >> >> >> >> >> >> > Interesting thread, which leaves me wondering two things: is it >> >> >> > documented >> >> >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* >> >> >> > mantissae >> >> >> > are representable using the 64-bit IEEE standard for float >> >> >> > representation >> >> >> > (if that makes sense); >> >> >> >> >> >> IEEE-754 says nothing about decimal representations aside from how >> >> >> to >> >> >> round when converting to and from strings. You have to >> >> >> provide/accept >> >> >> *at least* 9 decimal digits in the significand for single-precision >> >> >> and 17 for double-precision (section 5.6). AFAIK implementations >> >> >> will >> >> >> vary in how they handle cases where a binary significand would yield >> >> >> more digits than that. >> >> >> >> >> > >> >> > I believe that was the argument for the extended precision formats. >> >> > The >> >> > givien number of decimal digits is sufficient to recover the same >> >> > float >> >> > that >> >> > produced them if a slightly higher precision is used in the >> >> > conversion. >> >> > >> >> > Chuck >> >> >> >> >From the discussion for the floating point representation, it seems >> >> that >> >> a uniform random number generator would have a very coarse grid >> >> in the range for example -1e30 to +1e30 compared to interval -0.5,0.5. >> >> >> >> How many points can be represented by a float in [-0.5,0.5] compared >> >> to [1e30, 1e30+1.]? 
>> >> If I interpret this correctly, then there are as many floating point >> >> numbers >> >> in [0,1] as in [1,inf), or am I misinterpreting this. >> >> >> >> So how does a PRNG handle a huge interval of uniform numbers? >> >> >> > >> > There are several implementations, but the ones I'm familiar with reduce >> > to >> > scaling. If the rng produces random unsigned integers, then the range of >> > integers is scaled to the interval [0,1). The variations involve >> > explicit >> > scaling (portable) or bit twiddling of the IEEE formats. In straight >> > forward >> > scaling some ranges of the random integers may not map 1-1, so the >> > unused >> > bits are masked off first; if you want doubles you only need 52 bits, >> > etc. >> > For bit twiddling there is an implicit 1 in the mantissa, so the basic >> > range >> > works out to [1,2), but that can be fixed by subtracting 1 from the >> > result. >> > Handling larger ranges than [0,1) just involves another scaling. >> >> So, since this is then a discrete distribution, what is the number of >> points >> in the support? (My guess would be 2**52, but I don't know much about >> numerical representations.) >> >> This would be the largest set of integers that could be generated >> without gaps in the distribution, and would determine the grid size >> for floating point random variables. (?) >> > > Yes. > >> >> for Davids example: >> low, high = -1e307, 1e307 >> np.random.uniform(low, high, 100) # much more reasonable >> >> this would imply a grid size of >> >>> 2*1e307/2.**52 >> 4.4408920985006261e+291 >> >> or something similar. (floating points are not very dense in the real >> line.) >> > > Yes, or rather, they are more or less logarithmically distributed. Floats > are basically logarithms base 2 with a mantissa of fixed precision. Actual > logarithm code uses the exponent and does a correction to the mantissa, > i.e., takes the logarithm base two of numbers in the range [1,2). 
> > Random integers are treated differently. To get random integers in a given > range, say [0,100], the bitstream produced by the rng would be broken up > into 7 bit chunks [0, 127] that are used in a sampling algorithm that loops > until a value in the range is produced. So any range of integers can be > produced. Code for that comes with the Mersenne Twister, but I don't know if > numpy uses it. > > Chuck Thanks for the explanation. It's good to understand some of the limitations, but I don't think in any Monte Carlo, I will have a distribution with tails that are fat enough that the discretization becomes a problem. Josef > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Random int64 and float64 numbers
2009/11/5 : > On Thu, Nov 5, 2009 at 10:42 PM, David Goldsmith > wrote: >> On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley >> wrote: >>> >>> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: >>> >>> > Interesting thread, which leaves me wondering two things: is it >>> > documented >>> > somewhere (e.g., at the IEEE site) precisely how many *decimal* >>> > mantissae >>> > are representable using the 64-bit IEEE standard for float >>> > representation >>> > (if that makes sense); >>> >>> IEEE-754 says nothing about decimal representations aside from how to >>> round when converting to and from strings. You have to provide/accept >>> *at least* 9 decimal digits in the significand for single-precision >>> and 17 for double-precision (section 5.6). AFAIK implementations will >>> vary in how they handle cases where a binary significand would yield >>> more digits than that. >> >> I was actually more interested in the opposite situation, where the decimal >> representation (which is what a user would most likely provide) doesn't have >> a finite binary expansion: what happens then, something analogous to the >> decimal "rule of fives"? > > Since according to my calculations there are only about > 4* 10**17 * 308 > 12320L More straightforwardly, it's not too far below 2**64. > double-precision floats, there are huge gaps in the floating point > representation of the real line. > Any user input or calculation result just gets converted to the > closest float. Yes. But the "huge" gaps are only huge in absolute size; in fractional error they're always about the same size. They are usually only an issue if you're representing numbers in a small range with a huge offset. To take a not-so-random example, you could be representing times in days since November 17 1858, when what you care about are the microsecond-scale differences in photon arrival times. 
Even then you're probably okay as long as you compute directly (t[i] = i*dt/86400 + start_t) rather than having some sort of running accumulator (t[i]=t; t += dt/86400). Professor Kahan's reasoning for using doubles for most everything is that they generally have so much more precision than you actually need that you can get away with being sloppy. Anne > Josef > > >> >> DG >> >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
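Anne's direct-computation advice guards against accumulated rounding; even the smallest example shows the drift:

```python
# Repeatedly adding a value with no finite binary expansion compounds
# its representation error; computing from the index keeps the error
# down to a single rounding step:
acc = 0.0
for _ in range(10):
    acc += 0.1
assert acc != 1.0                  # ten rounded additions drift
assert abs(acc - 1.0) < 1e-15      # ...though only slightly here

direct = 10 * 0.1                  # one multiply rounds back to 1.0
assert direct == 1.0
```

Over millions of accumulator steps, as in the photon-timing example, the drift grows correspondingly.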
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 9:14 PM, wrote: > On Thu, Nov 5, 2009 at 10:42 PM, David Goldsmith > wrote: > > On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley > > wrote: > >> > >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: > >> > >> > Interesting thread, which leaves me wondering two things: is it > >> > documented > >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* > >> > mantissae > >> > are representable using the 64-bit IEEE standard for float > >> > representation > >> > (if that makes sense); > >> > >> IEEE-754 says nothing about decimal representations aside from how to > >> round when converting to and from strings. You have to provide/accept > >> *at least* 9 decimal digits in the significand for single-precision > >> and 17 for double-precision (section 5.6). AFAIK implementations will > >> vary in how they handle cases where a binary significand would yield > >> more digits than that. > > > > I was actually more interested in the opposite situation, where the > decimal > > representation (which is what a user would most likely provide) doesn't > have > > a finite binary expansion: what happens then, something analogous to the > > decimal "rule of fives"? > > Since according to my calculations there are only about > > >>> 4* 10**17 * 308 > 12320L > > double-precision floats, there are huge gaps in the floating point > representation of the real line. > 2**64 minus some flags for nans and such. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Random int64 and float64 numbers
On Thu, Nov 5, 2009 at 8:07 PM, Anne Archibald wrote: > 2009/11/5 David Goldsmith : > > On Thu, Nov 5, 2009 at 3:26 PM, David Warde-Farley > > wrote: > >> > >> On 5-Nov-09, at 4:54 PM, David Goldsmith wrote: > >> > >> > Interesting thread, which leaves me wondering two things: is it > >> > documented > >> > somewhere (e.g., at the IEEE site) precisely how many *decimal* > >> > mantissae > >> > are representable using the 64-bit IEEE standard for float > >> > representation > >> > (if that makes sense); > >> > >> IEEE-754 says nothing about decimal representations aside from how to > >> round when converting to and from strings. You have to provide/accept > >> *at least* 9 decimal digits in the significand for single-precision > >> and 17 for double-precision (section 5.6). AFAIK implementations will > >> vary in how they handle cases where a binary significand would yield > >> more digits than that. > > > > I was actually more interested in the opposite situation, where the > decimal > > representation (which is what a user would most likely provide) doesn't > have > > a finite binary expansion: what happens then, something analogous to the > > decimal "rule of fives"? > > If you interpret "0.1" as 1/10, then this is a very general > floating-point issue: how do you round off numbers you can't > represent exactly? The usual answer (leaving aside internal > extended-precision shenanigans) is to round, with the rule that when > you're exactly between two floating-point numbers you round to the one > that's even, rather than always up or always down (the numerical > analysis wizards tell us that this is more numerically stable). > More numerically stable, or simply unbiased, since half the time you're rounding up, and the other half rounding down (that's the rationale as it was explained to me).
DG > > Anne
Re: [Numpy-discussion] Random int64 and float64 numbers
David Cournapeau wrote: > On Fri, Nov 6, 2009 at 6:54 AM, David Goldsmith > wrote: > >> Interesting thread, which leaves me wondering two things: is it documented >> somewhere (e.g., at the IEEE site) precisely how many *decimal* mantissae >> are representable using the 64-bit IEEE standard for float representation >> (if that makes sense); and are such decimal mantissae uniformly distributed >> > > They are definitely not uniformly distributed: that's why two numbers > are close around 1 when they have only a few EPS difference, but > around 1e100, you have to add quite a few EPS to even get a different > number at all. > > That may be my audio processing background, but I like to think about > float as numbers which have the same relative precision at any "level" > - a kind of dB scale. If you want numbers with a fixed number of > decimals, you need a fixed point representation. > David Goldsmith was asking about the mantissae. For a double, that is a 53 bit signed integer. I.e. you have a 52 bit fractional part (bits 0-51), an 11 bit exponent (bits 52-62), and one sign bit (bit 63). The mantissa is uniformly distributed like any signed integer. The mantissa of a double has 2**53 different integer values: -2**52 to 2**52-1. But the value of a floating point number is value = (-1)**signbit * 2**(exponent - bias) * (1 + fraction) with bias = 1023 for a double. Thus, floating point numbers are not uniformly distributed, but the mantissa is. For numerical illiterates this might come as a surprise. But in numerical mathematics, the resolution is in the number of "significant digits", not in "the number of decimals". 101 and .00201 have the same numerical precision. A decimal, on the other hand, can be thought of as a floating point number using base-10 instead of base-2 for the exponent: value = (-1)**signbit * 10**(exponent - bias) * (1 + fraction) Decimals and floats are not fundamentally different.
There are numbers exactly representable with a decimal that cannot be exactly represented with a float. But numerical computations do not become more precise with a decimal than with a float.
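The layout Sturla describes can be verified by pulling a double apart with `struct` and rebuilding its value from sign, exponent, and fraction (the standard significand of a normal number being 1 + fraction/2**52):

```python
import struct

def decode(x):
    # Reinterpret a double's 64 bits as sign(1)|exponent(11)|fraction(52)
    # and rebuild the value.  Valid for normal numbers only (exponent
    # field neither 0 nor 0x7FF).
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    value = (-1.0) ** sign * 2.0 ** (exponent - 1023) * (1 + fraction / 2**52)
    return sign, exponent, fraction, value

# 101 and .00201 carry the same number of significant bits; only the
# exponent field differs:
for x in (1.0, -101.0, 0.00201):
    assert decode(x)[3] == x
```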
Re: [Numpy-discussion] Random int64 and float64 numbers
Thanks, Sturla, you've confirmed what I thought was the case (and explained more thoroughly the answer others gave more succinctly, but also more opaquely). :-) DG On Sat, Nov 7, 2009 at 11:40 AM, Sturla Molden wrote: > David Cournapeau wrote: > > On Fri, Nov 6, 2009 at 6:54 AM, David Goldsmith > wrote: > > > >> Interesting thread, which leaves me wondering two things: is it > documented > >> somewhere (e.g., at the IEEE site) precisely how many *decimal* > mantissae > >> are representable using the 64-bit IEEE standard for float > representation > >> (if that makes sense); and are such decimal mantissae uniformly > distributed > >> > > > > They are definitely not uniformly distributed: that's why two numbers > > are close around 1 when they have only a few EPS difference, but > > around 1e100, you have to add quite a few EPS to even get a different > > number at all. > > > > That may be my audio processing background, but I like to think about > > float as numbers which have the same relative precision at any "level" > > - a kind of dB scale. If you want numbers with a fixed number of > > decimals, you need a fixed point representation. > > > > David Goldsmith was asking about the mantissae. For a double, that is a > 53 bit signed integer. I.e. you have a 52 bit fractional part (bits 0-51), > an 11 bit exponent (bits 52-62), and one sign bit (bit 63). The mantissa is > uniformly distributed like any signed integer. The mantissa of a double > has 2**53 different integer values: -2**52 to 2**52-1. > > But the value of a floating point number is > > value = (-1)**signbit * 2**(exponent - bias) * (1 + fraction) > > with bias = 1023 for a double. Thus, floating point numbers are not > uniformly distributed, but the mantissa is. > > For numerical illiterates this might come as a surprise. But in > numerical mathematics, the resolution is in the number of "significant > digits", not in "the number of decimals". 101 and .00201 have the same > numerical precision.
> > A decimal, on the other hand, can be thought of as a floating point > number using base-10 instead of base-2 for the exponent: > > value = (-1)**signbit * 10**(exponent - bias) * (1 + fraction) > > Decimals and floats are not fundamentally different. There are numbers > exactly representable with a decimal that cannot be exactly represented > with a float. But numerical computations do not become more precise with > a decimal than with a float.