On Fri, Jun 10, 2011 at 9:10 PM, Benjamin Root <ben.r...@ou.edu> wrote:
> On Fri, Jun 10, 2011 at 9:29 PM, Olivier Delalleau <sh...@keba.be> wrote:
>
>> 2011/6/10 Olivier Delalleau <sh...@keba.be>
>>
>>> 2011/6/10 Charles R Harris <charlesr.har...@gmail.com>
>>>
>>>> On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau <sh...@keba.be> wrote:
>>>>
>>>>> 2011/6/10 Charles R Harris <charlesr.har...@gmail.com>
>>>>>
>>>>>> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>>>>>>
>>>>>>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>>>>>>>
>>>>>>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>>>>>>>>
>>>>>>>>> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root <ben.r...@ou.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> Came across an odd error while using numpy master. Note, my
>>>>>>>>>>> system is 32-bit.
>>>>>>>>>>>
>>>>>>>>>>> >>> import numpy as np
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>>>>>>>>>> False
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>>>>>>>>>> True
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>>>>>>>>>> True
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>>>>>>>>>> True
>>>>>>>>>>>
>>>>>>>>>>> So, only the summation performed with a np.int32 accumulator
>>>>>>>>>>> results in a type that doesn't match the expected type. Now, for
>>>>>>>>>>> even more strangeness:
>>>>>>>>>>>
>>>>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>>>>>>>>>> <type 'numpy.int32'>
>>>>>>>>>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32))))
>>>>>>>>>>> '0x9599a0'
>>>>>>>>>>> >>> hex(id(np.int32))
>>>>>>>>>>> '0x959a80'
>>>>>>>>>>>
>>>>>>>>>>> So, the type from the sum() reports itself as a numpy int, but
>>>>>>>>>>> its memory address is different from the memory address for
>>>>>>>>>>> np.int32.
>>>>>>>>>>
>>>>>>>>>> One of them is probably a long, print out the typecode,
>>>>>>>>>> dtype.char.
>>>>>>>>>>
>>>>>>>>>> Chuck
>>>>>>>>>
>>>>>>>>> Good intuition, but odd result...
>>>>>>>>>
>>>>>>>>> >>> import numpy as np
>>>>>>>>> >>> a = np.sum([1, 2, 3], dtype=np.int32)
>>>>>>>>> >>> b = np.int32(6)
>>>>>>>>> >>> type(a)
>>>>>>>>> <type 'numpy.int32'>
>>>>>>>>> >>> type(b)
>>>>>>>>> <type 'numpy.int32'>
>>>>>>>>> >>> a.dtype.char
>>>>>>>>> 'i'
>>>>>>>>> >>> b.dtype.char
>>>>>>>>> 'l'
>>>>>>>>>
>>>>>>>>> So, the standard np.int32 is getting listed as a long somehow? To
>>>>>>>>> further investigate:
>>>>>>>>
>>>>>>>> Yes, long shifts around from int32 to int64 depending on the OS. For
>>>>>>>> instance, in 64 bit Windows it's 32 bits while in 64 bit Linux it's
>>>>>>>> 64 bits. On 32 bit systems it is 32 bits.
>>>>>>>>
>>>>>>>> Chuck
>>>>>>>
>>>>>>> Right, that makes sense. But, the question is why does sum() put out
>>>>>>> a result dtype that is not identical to the dtype that I requested, or
>>>>>>> even the dtype of the input array? Could this be an indication of a
>>>>>>> bug somewhere? Even if the bug is harmless (it was only noticed within
>>>>>>> the test suite of larry), is this unexpected?
>>>>>>
>>>>>> I expect sum is using a ufunc and it acts differently on account of
>>>>>> the cleanup of the ufunc casting rules. And yes, a long *is* int32 on
>>>>>> your machine. On mine
>>>>>>
>>>>>> In [4]: dtype('q') # long long
>>>>>> Out[4]: dtype('int64')
>>>>>>
>>>>>> In [5]: dtype('l') # long
>>>>>> Out[5]: dtype('int64')
>>>>>>
>>>>>> The mapping from C types to numpy width types isn't 1-1. Personally, I
>>>>>> think we should drop long ;) But it used to be the standard Python type
>>>>>> in the C API. Mark has also pointed out the problems/confusion this
>>>>>> ambiguity causes and someday we should probably think it out and fix
>>>>>> it. But I don't think it is the most pressing problem.
>>>>>>
>>>>>> Chuck
>>>>>
>>>>> But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32 bit
>>>>> computer where both are int32?
>>>>
>>>> Maybe yes, maybe no ;) They have different descriptors, so from numpy's
>>>> perspective they are different, but at the hardware/precision level they
>>>> are the same. It's more of a decision as to what != means in this case.
>>>> Since numpy started as Numeric with only the c types the current
>>>> behavior is consistent, but that doesn't mean it shouldn't change at
>>>> some point.
>>>>
>>>> Chuck
>>>
>>> Well apparently it was actually changed recently, since in Numpy 1.5.1 on
>>> a Windows 32 bit machine, they are considered equal with '=='.
>>> Personally I think if the string representation of two dtypes is "int32",
>>> then they should be ==, otherwise it wouldn't make much sense given that
>>> you can directly test the equality of a dtype with a string like "int32"
>>> (like dtype('i') == "int32" and dtype('l') == "int32").
>>
>> I also just checked on a fresh install of numpy 1.6.0 on python 3.2, and
>> both types are equal as well.
>
> Are you talking about the release of 1.6, or the continued development
> branch? This is happening to me on the master branch, but I have not tried
> earlier versions. Again, I think this bolsters the evidence that this is
> from a (very) recent change.
>
>> I've been playing quite a bit with numpy dtypes and it's the first time I
>> hear two dtypes representing the exact same kind of data do not compare
>> equal, so I'm still inclined to believe it should be considered a bug.
>
> Quite honestly, I really don't care that the dtypes aren't equal. I
> usually work at a purely python level and performing actions based on types
> is generally bad practice anyway. Anytime that I (rarely) check types, I
> would use isinstance() against one of the core numerical types rather than a
> numpy type. The fact that I even found this issue was completely by
> accident while investigating a test failure in larry.
>
> What concerns me more is that the type coming from the ufunc is not the
> same type that went in, or even requested through the dtype argument. I
> think *that* should be the main concern here, and should probably be tested
> for in the unit tests.

To be a bit more explicit:

In [3]: np.sum([1, 2, 3], dtype='q').dtype.char
Out[3]: 'l'

In [4]: np.sum([1, 2, 3], dtype='l').dtype.char
Out[4]: 'l'

Note that there were previous oddities, for instance the returned type for
a + b would not necessarily be the same as for b + a, even though the
precisions would be the same.

Chuck
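
As a rough illustration of the point that C long changes width between platforms, here is a minimal sketch that uses only the standard library's struct module rather than numpy. It reports the native widths of int, long and long long, then lists which type shares a width with long on the current machine; that shared width is where the 'i' versus 'l' (or 'l' versus 'q') ambiguity can arise. The helper name c_int_widths is hypothetical, chosen only for this example.

```python
import struct


def c_int_widths():
    """Return the native width in bits of C int, long and long long."""
    # The '@' prefix asks struct for this platform's native sizes.
    return {
        "int": struct.calcsize("@i") * 8,
        "long": struct.calcsize("@l") * 8,
        "long long": struct.calcsize("@q") * 8,
    }


widths = c_int_widths()
for name, bits in widths.items():
    print(f"C {name}: {bits} bits")

# Any other C type with the same width as long is a distinct C type that
# describes exactly the same kind of data on this machine.
same_as_long = [
    name for name, bits in widths.items()
    if name != "long" and bits == widths["long"]
]
print("same width as long:", same_as_long)
```

The output depends on the platform, which is the point: on a machine where long is 32 bits it pairs with int, and where long is 64 bits it pairs with long long.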
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion