On 9/19/06, Charles R Harris <[EMAIL PROTECTED]> wrote:


On 9/19/06, Charles R Harris <[EMAIL PROTECTED]> wrote:


On 9/19/06, Sebastian Haase <[EMAIL PROTECTED]> wrote:
Travis Oliphant wrote:
> Sebastian Haase wrote:
>> I still would argue that getting a "good" (smaller rounding errors) answer
>> should be the default -- if speed is wanted, then *that* could still be
>> specified explicitly with dtype=float32 (which would also be a possible
>> choice for int32 input).
>>
> So you are arguing for using long double then.... ;-)
>
>> In image processing we always want means to be calculated in float64, even
>> though the input data is always float32 (if not uint16).
>>
>> Also, it is simpler to say "float64 is the default" (full stop) -- instead of
>> "float64 is the default unless you have float32".
>>
> "the type you have is the default except for integers".  Do you really
> want float64 to be the default for float96?
>
> Unless we are going to use long double as the default, then I'm not
> convinced that we should special-case the "double" type.
>
I guess I'm not really aware of the float96 type ...
Is that a "machine type" on any system?  I always thought that -- e.g.,
coming from C -- double is "as good as it gets" ...
Who uses float96?  I heard somewhere that (some) CPUs use 80 bits
internally when calculating 64-bit double precision ...

Isn't this turning into an academic argument!?
For all I know, calculating mean()s (and sum()s, ...) is always done in
double precision -- never in single precision, even when the data is in
float32.
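
For example, a quick interactive sketch (my own, just to make the rounding
point concrete; the values assume IEEE single precision):

>>> import numpy as np
>>> a = np.ones(2**24 + 10, dtype=np.float32)
>>> # beyond 2**24, adding 1.0 to a float32 running sum no longer changes it
>>> a.sum(dtype=np.float32)   # the last 10 ones are simply lost
16777216.0
>>> a.sum(dtype=np.float64)   # a double-precision accumulator gets it right
16777226.0
>>> a.mean(dtype=np.float64)
1.0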

Having float32 be the default for float32 data just requires more typing
and more explaining ... it would compromise numpy's usability as a
day-to-day replacement for other systems.

Sorry if I'm being ignorant ...

I'm going to side with Travis here. It is only a default and easily overridden. And yes, there are precisions greater than double. I was using quad precision back in the eighties on a VAX for some inherently ill conditioned problems. And on my machine long double is 12 bytes.
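
You can ask numpy directly, by the way -- a quick sketch (the output below is
from a 32-bit Linux box, where the 12-byte storage shows up under the float96
name):

>>> import numpy as np
>>> np.dtype(np.longdouble)            # the platform's long double, 12 bytes here
dtype('float96')
>>> np.dtype(np.longdouble).itemsize   # storage size in bytes, padding included
12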

Here is the 754r (revision) spec: http://en.wikipedia.org/wiki/IEEE_754r

It includes quads (128 bits) and half-precision (16 bit) floats. I believe the latter are used for some imaging stuff, radar for instance, and are also available in some high-end GPUs from Nvidia and other companies. The 80-bit numbers you refer to were defined as extended precision in the original 754 spec and were mostly intended for temporaries in internal FPU computations. They have various alignment requirements for efficient use, which is why they show up as 96 bits (4-byte alignment) and sometimes 128 bits (8-byte alignment). So actually, float128 would not always distinguish between extended precision and quad precision.   I see more work for Travis in the future ;)
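
The padding is visible from numpy itself -- a quick sketch (my output, from a
32-bit Linux box; the numbers will differ per platform):

>>> import numpy as np
>>> ld = np.dtype(np.longdouble)
>>> ld.itemsize    # storage in bytes: 80 data bits padded out to 96 bits here
12
>>> ld.alignment   # the padding exists only to satisfy this alignment
4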

I just checked this out. On amd64, 32-bit Linux gives 12 bytes for long double and 64-bit Linux gives 16 bytes, but both have 64-bit mantissas, i.e., both are 80-bit extended precision. Those sizes are the defaults and can be overridden by compiler flags. Anyway, we may need some way to tell the difference between float128 and quads, since they will both have the same length on 64-bit architectures. But that is a problem for the future.
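
Until then, finfo can already tell them apart, since it reports the actual
precision rather than the storage size -- a sketch (values assume an x86 long
double; a true IEEE quad would report nmant=112):

>>> import numpy as np
>>> fi = np.finfo(np.longdouble)
>>> fi.nmant    # 63 fraction bits + explicit integer bit = 64-bit mantissa
63
>>> fi.machep   # i.e. eps = 2**-63, whether stored in 12 or 16 bytes
-63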
 
Chuck
