On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root <ben.r...@ou.edu> wrote:

Came across an odd error while using numpy master. Note, my system is 32-bit.

>>> import numpy as np
>>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
False
>>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
True
>>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
True
>>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
True

So, only the summation performed with a np.int32 accumulator results in a type that doesn't match the expected type. Now, for even more strangeness:

>>> type(np.sum([1, 2, 3], dtype=np.int32))
<type 'numpy.int32'>
>>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32))))
'0x9599a0'
>>> hex(id(np.int32))
'0x959a80'

So, the type from sum() reports itself as a numpy int, but its memory address is different from the memory address of np.int32.

On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:

One of them is probably a long; print out the typecode, dtype.char.

Chuck

On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root <ben.r...@ou.edu> wrote:

Good intuition, but odd result...

>>> import numpy as np
>>> a = np.sum([1, 2, 3], dtype=np.int32)
>>> b = np.int32(6)
>>> type(a)
<type 'numpy.int32'>
>>> type(b)
<type 'numpy.int32'>
>>> a.dtype.char
'i'
>>> b.dtype.char
'l'

So, the standard np.int32 is getting listed as a long somehow? To further investigate:

On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:

Yes, long shifts around between int32 and int64 depending on the OS. For instance, on 64-bit Windows it is 32 bits, while on 64-bit Linux it is 64 bits. On 32-bit systems it is 32 bits.

Chuck
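For readers who want to see the aliasing Chuck describes on their own machine, here is a minimal sketch using plain NumPy; the names and widths it prints will differ between 32-bit and 64-bit platforms:

import numpy as np

# 'i' is C int, 'l' is C long, 'q' is C long long.  Each type code keeps
# its own type number even when two of them end up with the same width.
for code in ('i', 'l', 'q'):
    dt = np.dtype(code)
    print(code, dt.name, 8 * dt.itemsize, 'bits, typenum', dt.num)

On a 32-bit machine 'i' and 'l' both print as int32 but keep distinct type numbers, which is why the two descriptors can still be told apart even though the precision is identical.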
On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root <ben.r...@ou.edu> wrote:

Right, that makes sense. But the question is: why does sum() produce a result dtype that is not identical to the dtype I requested, or even to the dtype of the input array? Could this be an indication of a bug somewhere? Even if the bug is harmless (it was only noticed within the test suite of larry), is this unexpected?

2011/6/10 Charles R Harris <charlesr.har...@gmail.com>:

I expect sum is using a ufunc, and it acts differently on account of the cleanup of the ufunc casting rules. And yes, a long *is* int32 on your machine. On mine:

In [4]: dtype('q')  # long long
Out[4]: dtype('int64')

In [5]: dtype('l')  # long
Out[5]: dtype('int64')

The mapping from C types to numpy width types isn't 1-1. Personally, I think we should drop long ;) But it used to be the standard Python type in the C API. Mark has also pointed out the problems/confusion this ambiguity causes, and someday we should probably think it out and fix it. But I don't think it is the most pressing problem.

Chuck

On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau <sh...@keba.be> wrote:

But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32-bit computer where both are int32?

2011/6/10 Charles R Harris <charlesr.har...@gmail.com>:

Maybe yes, maybe no ;) They have different descriptors, so from numpy's perspective they are different, but at the hardware/precision level they are the same. It's more a decision about what != means in this case. Since numpy started as Numeric with only the C types, the current behavior is consistent, but that doesn't mean it shouldn't change at some point.

Chuck

2011/6/10 Olivier Delalleau <sh...@keba.be>:

Well, apparently it was actually changed recently, since in NumPy 1.5.1 on a 32-bit Windows machine they are considered equal with '=='. Personally, I think that if the string representation of two dtypes is "int32", then they should compare equal; otherwise it wouldn't make much sense, given that you can directly test the equality of a dtype against a string like "int32" (e.g. dtype('i') == "int32" and dtype('l') == "int32").

On Fri, Jun 10, 2011 at 9:29 PM, Olivier Delalleau <sh...@keba.be> wrote:

I also just checked on a fresh install of numpy 1.6.0 on Python 3.2, and both types are equal there as well.

I've been playing quite a bit with numpy dtypes, and this is the first time I have heard of two dtypes representing the exact same kind of data that do not compare equal, so I'm still inclined to believe it should be considered a bug.

2011/6/10 Benjamin Root <ben.r...@ou.edu>:

Are you talking about the release of 1.6, or the continued development branch? This is happening to me on the master branch, but I have not tried earlier versions. Again, I think this bolsters the evidence that this comes from a (very) recent change.

Quite honestly, I really don't care that the dtypes aren't equal. I usually work at a purely Python level, and performing actions based on types is generally bad practice anyway. Any time that I (rarely) check types, I would use isinstance() against one of the core numerical types rather than a numpy type. The fact that I even found this issue was completely by accident, while investigating a test failure in larry.

What concerns me more is that the type coming from the ufunc is not the same type that went in, or even the type requested through the dtype argument. I think *that* should be the main concern here, and it should probably be tested for in the unit tests.

Ben Root
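A minimal sketch of the kind of regression check Ben is suggesting might look like the following; this is only an illustration, not code from the numpy test suite, and the function name is made up:

import numpy as np

def check_sum_respects_requested_dtype():
    # The scalar returned by np.sum should carry exactly the dtype that
    # was requested via the dtype argument.
    for requested in (np.int32, np.int64, np.float32, np.float64):
        result = np.sum([1, 2, 3], dtype=requested)
        assert result.dtype == np.dtype(requested), \
            "sum returned %s, expected %s" % (result.dtype, np.dtype(requested))

check_sum_respects_requested_dtype()

On released NumPy versions these assertions pass; on the 32-bit master snapshot described above, the int32 case would be expected to fail, since the returned scalar carries the 'i' descriptor while np.int32 maps to 'l' on that machine.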
On Fri, Jun 10, 2011 at 10:34 PM, Olivier Delalleau <sh...@keba.be> wrote:

The project I'm working on (http://deeplearning.net/software/theano/) relies heavily on dtype.__eq__, because it uses typed objects associated with data of, e.g., int32 or float64 types, and it needs to know whether the provided numpy arrays are of the proper type. So we do a lot of comparisons like:

array.dtype == "int32"

I'd be curious to know, in your case, what the output of the following lines is:

numpy.dtype('i') == "int32"
numpy.dtype('l') == "int32"
str(numpy.dtype('i'))
str(numpy.dtype('l'))

My output on a 32-bit, Ubuntu 11.04 machine with the latest numpy from master is:

True
True
'int32'
'int32'

I will have a new 64-bit machine up and running on Monday (yay!) to do some further tests on, but I suspect I am currently in the minority here for architecture type.

Ben Root
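As a closing illustration of the comparison style Olivier describes, the sketch below compares a dtype against its string name; the array here is hypothetical and this is not Theano's actual code. Comparing by name sidesteps the 'i'-versus-'l' descriptor question, since (as the output above shows) both stringify to 'int32' on a 32-bit machine:

import numpy as np

arr = np.arange(3, dtype='l')   # C long: 32 or 64 bits depending on the platform

# Theano-style check: compare the dtype against the string name of the
# type we expect, rather than against a specific scalar type object.
if arr.dtype == "int32":
    print("array holds 32-bit integers")

# Other name-based spellings of the same check:
print(str(arr.dtype))     # e.g. 'int32' or 'int64'
print(arr.dtype.name)     # same string, as an attribute

Checking arr.dtype.name avoids relying on dtype.__eq__ altogether, which may be preferable while the descriptor comparison behavior is in flux.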