[Numpy-discussion] type 'numpy.int64' unhashable

2009-10-30 Thread Sebastian Haase
Hi,
I get this error:
>>> set(chainsA[0,:,0])
TypeError: unhashable type: 'numpy.ndarray'
>>> list(chainsA[0,:,0])
[2636, 2590, 2619, 2590]
>>> list(chainsA[0,:,0])[0]
2636
>>> type(_)
<type 'numpy.int64'>

I understand where this error comes from; however, what I was trying to
do seems so "intuitive" that I would like to ask for suggestions:
"What should I do if the "number" 2636 becomes unhashable?"

Thanks,

Sebastian Haase
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] type 'numpy.int64' unhashable

2009-10-30 Thread David Cournapeau
On Fri, Oct 30, 2009 at 8:04 PM, Sebastian Haase  wrote:

> I understand where this error comes from, however what I was trying to
> do seems to "intuitive" that I would like to ask for suggestions:
> "What should I do if the "number" 2636 becomes unhashable ?"

In your example, it is the array that is unhashable; the numbers
themselves should be hashable. Arrays are mutable, so I don't think you
can easily make them hashable. You could transform everything into
a tuple of tuples of... if you need to use set, though.
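
A minimal sketch of the tuple-of-tuples idea, assuming a reasonably recent
NumPy. The helper name to_hashable is hypothetical and not part of NumPy; it
relies only on ndarray.tolist(), which converts elements to plain Python
objects.

```python
import numpy as np

def to_hashable(arr):
    """Recursively convert an array into nested tuples of Python scalars."""
    def freeze(obj):
        # tolist() yields nested lists; turn each list level into a tuple.
        if isinstance(obj, list):
            return tuple(freeze(item) for item in obj)
        return obj
    return freeze(np.asarray(arr).tolist())

rows = np.array([[1, 2], [3, 4], [1, 2]])
# Each row becomes a tuple, so duplicate rows collapse inside the set.
unique_rows = {to_hashable(row) for row in rows}
```

The result is a snapshot: later changes to the array are not reflected in the
tuples, which is what makes them safe to hash.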

David


Re: [Numpy-discussion] type 'numpy.int64' unhashable

2009-10-30 Thread Gael Varoquaux
On Fri, Oct 30, 2009 at 08:21:16PM +0900, David Cournapeau wrote:
> On Fri, Oct 30, 2009 at 8:04 PM, Sebastian Haase  wrote:

> > I understand where this error comes from, however what I was trying to
> > do seems to "intuitive" that I would like to ask for suggestions:
> > "What should I do if the "number" 2636 becomes unhashable ?"

> In your example, that's the array which is unhashable, the numbers
> itself should be hashable. Arrays are mutable, so I don't think you
> can easily make them hashable. You could transform everything into
> tuple of tuple of... if you need to use set, though.

Use md5's of their .data attribute. This works quite well (you might want
to hash a pickled string of the dtype in addition).
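
One way this could look is sketched below. It is an illustration, not the
exact recipe meant above: the function name array_digest is hypothetical, it
feeds dtype.str into the hash instead of a pickled dtype, and it uses
tobytes() rather than the .data attribute so that it also works for views.

```python
import hashlib
import numpy as np

def array_digest(arr):
    """Return an MD5 hex digest covering dtype, shape, and element bytes."""
    arr = np.asarray(arr)
    md5 = hashlib.md5()
    # dtype.str (e.g. '<i8') and the shape distinguish arrays whose
    # raw bytes happen to coincide.
    md5.update(arr.dtype.str.encode("ascii"))
    md5.update(repr(arr.shape).encode("ascii"))
    # tobytes() returns the elements as a C-ordered copy.
    md5.update(arr.tobytes())
    return md5.hexdigest()

a = np.arange(6, dtype=np.int64)
digests = {array_digest(a), array_digest(a.copy())}  # a single entry
```

The digest is an ordinary string, so it can serve as a set element or dict
key standing in for the array.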

Gaël


Re: [Numpy-discussion] type 'numpy.int64' unhashable

2009-10-30 Thread James Bergstra
On Fri, Oct 30, 2009 at 7:23 AM, Gael Varoquaux
 wrote:
> On Fri, Oct 30, 2009 at 08:21:16PM +0900, David Cournapeau wrote:
>> On Fri, Oct 30, 2009 at 8:04 PM, Sebastian Haase  wrote:
>
>> > I understand where this error comes from, however what I was trying to
>> > do seems to "intuitive" that I would like to ask for suggestions:
>> > "What should I do if the "number" 2636 becomes unhashable ?"
>
>> In your example, that's the array which is unhashable, the numbers
>> itself should be hashable. Arrays are mutable, so I don't think you
>> can easily make them hashable. You could transform everything into
>> tuple of tuple of... if you need to use set, though.
>
> Use md5's of their .data attribute. This works quite well (you might want
> to hash a pickled string of the dtype in addition).
>
> Gaël

Careful... if your data is not contiguous in memory then you could be
adding lots of random noise to your hash key by doing this.  This
could cause equal ndarrays to hash to different values -- not good.
Make sure memory is contiguous before hashing the .data.  flatten()
does this, I think, as do copy(), array(), and many others.
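
A small sketch of how contiguity can be checked and enforced, assuming a
current NumPy. It uses np.ascontiguousarray, which is not one of the functions
named above, because its only job is to return a C-contiguous array.

```python
import numpy as np

base = np.arange(10, dtype=np.int64)
view = base[::2]                      # strided view: every other element
packed = np.ascontiguousarray(view)   # copies into one contiguous block

view_is_contiguous = view.flags["C_CONTIGUOUS"]      # False
packed_is_contiguous = packed.flags["C_CONTIGUOUS"]  # True

# After packing, the bytes match those of an equal array built from scratch.
fresh = np.array([0, 2, 4, 6, 8], dtype=np.int64)
same_bytes = packed.tobytes() == fresh.tobytes()
```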

James
-- 
http://www-etud.iro.umontreal.ca/~bergstrj


Re: [Numpy-discussion] type 'numpy.int64' unhashable

2009-10-30 Thread Robert Kern
On Fri, Oct 30, 2009 at 08:11, James Bergstra  wrote:
> On Fri, Oct 30, 2009 at 7:23 AM, Gael Varoquaux
>  wrote:
>> On Fri, Oct 30, 2009 at 08:21:16PM +0900, David Cournapeau wrote:
>>> On Fri, Oct 30, 2009 at 8:04 PM, Sebastian Haase  
>>> wrote:
>>
>>> > I understand where this error comes from, however what I was trying to
>>> > do seems to "intuitive" that I would like to ask for suggestions:
>>> > "What should I do if the "number" 2636 becomes unhashable ?"
>>
>>> In your example, that's the array which is unhashable, the numbers
>>> itself should be hashable. Arrays are mutable, so I don't think you
>>> can easily make them hashable. You could transform everything into
>>> tuple of tuple of... if you need to use set, though.
>>
>> Use md5's of their .data attribute. This works quite well (you might want
>> to hash a pickled string of the dtype in addition).
>>
>> Gaël
>
> Careful... if your data is not contiguous in memory then you could be
> adding lots of random noise to your hash key by doing this.  This
> could cause equal ndarrays to hash to different values -- not good.
> Make sure memory is contiguous before hashing the .data.  Flatten()
> does this i think, as does copy(), array(), and many others.

.data doesn't work for non-contiguous arrays anyways. :-)

But all of this is irrelevant to the OP. First, I cannot replicate his problem.

In [12]: chainsA = np.arange(10, dtype=np.int64)

In [13]: set(chainsA)
Out[13]: set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


Second, he seems to be interested in scalar objects, not arrays. The
scalar objects should all be hashable and comparable out of the box and
ready to be used in sets and as dict keys. We will need a complete,
self-contained example that demonstrates the problem to get any
further with this.
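
For illustration, a short sketch of scalars used as set elements and dict
keys. It reuses the values from the original post; the variable names are
made up for the example.

```python
import numpy as np

value = np.int64(2636)

seen = {value}               # NumPy scalar as a set element
lookup = {value: "chain"}    # NumPy scalar as a dict key

# A NumPy integer scalar compares and hashes like the equal Python int,
# so the plain int finds it.
found_in_set = 2636 in seen
found_in_dict = lookup[2636]

# Building a set from a 1-D integer array iterates over its scalars.
unique = set(np.array([2636, 2590, 2619, 2590], dtype=np.int64))
```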

Third, even if he wanted to use arrays as set elements, he couldn't
because such objects not only need to have __hash__ defined, they also
need __eq__ to return a bool. We return boolean arrays that cannot be
used as a truth value.
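
A sketch of the comparison problem, assuming arrays with more than one
element. np.array_equal is shown as one way to get a single bool instead.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

elementwise = a == b   # a boolean array, one entry per element

try:
    bool(elementwise)
    ambiguous = False
except ValueError:
    # NumPy refuses to reduce a multi-element array to one truth value.
    ambiguous = True

# array_equal collapses the comparison to a single True/False.
whole_array_equal = np.array_equal(a, b)
```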

Fourth, even if arrays could be compared, you couldn't replace their
__hash__ method or tell set to use a different function in place of
the __hash__ method.

Fifth, even if you could tell set to use a different hash function,
you wouldn't use cryptographic hashes. You would just
hash(buffer(arr)) for contiguous arrays and hash(arr.tostring()) for
the rest.
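
Since set cannot be given a custom hash function, one workaround is to wrap
each array in an object that supplies __hash__ and __eq__ itself. The sketch
below is an assumption-laden illustration: the class name HashableArray is
hypothetical, it targets Python 3 (where buffer() no longer exists), and it
uses tobytes(), the current name for tostring().

```python
import numpy as np

class HashableArray:
    """Hypothetical wrapper giving an array value-based hash and equality."""

    def __init__(self, arr):
        # Keep a private read-only copy so the hash cannot change later.
        self._arr = np.array(arr, copy=True)
        self._arr.setflags(write=False)

    def _key(self):
        # Shape, dtype, and raw bytes together identify the contents.
        return (self._arr.shape, self._arr.dtype.str, self._arr.tobytes())

    def __hash__(self):
        # Plain built-in hash; no cryptographic digest is needed here.
        return hash(self._key())

    def __eq__(self, other):
        if not isinstance(other, HashableArray):
            return NotImplemented
        # Bytewise comparison keeps __eq__ consistent with __hash__
        # (so 0.0 and -0.0 count as different).
        return self._key() == other._key()

items = {HashableArray([1, 2]), HashableArray([1, 2]), HashableArray([3, 4])}
```

Comparing bytes rather than values is a deliberate simplification: it
guarantees that objects that compare equal also hash equal.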

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] type 'numpy.int64' unhashable

2009-10-30 Thread Sebastian Haase
On Fri, Oct 30, 2009 at 5:44 PM, Robert Kern  wrote:
> On Fri, Oct 30, 2009 at 08:11, James Bergstra  
> wrote:
>> On Fri, Oct 30, 2009 at 7:23 AM, Gael Varoquaux
>>  wrote:
>>> On Fri, Oct 30, 2009 at 08:21:16PM +0900, David Cournapeau wrote:
>>>> On Fri, Oct 30, 2009 at 8:04 PM, Sebastian Haase
>>>> wrote:
>>>
>>>> > I understand where this error comes from, however what I was trying to
>>>> > do seems to "intuitive" that I would like to ask for suggestions:
>>>> > "What should I do if the "number" 2636 becomes unhashable ?"
>>>
>>>> In your example, that's the array which is unhashable, the numbers
>>>> itself should be hashable. Arrays are mutable, so I don't think you
>>>> can easily make them hashable. You could transform everything into
>>>> tuple of tuple of... if you need to use set, though.
>>>
>>> Use md5's of their .data attribute. This works quite well (you might want
>>> to hash a pickled string of the dtype in addition).
>>>
>>> Gaël
>>
>> Careful... if your data is not contiguous in memory then you could be
>> adding lots of random noise to your hash key by doing this.  This
>> could cause equal ndarrays to hash to different values -- not good.
>> Make sure memory is contiguous before hashing the .data.  Flatten()
>> does this i think, as does copy(), array(), and many others.
>
> .data doesn't work for non-contiguous arrays anyways. :-)
>
> But all of this is irrelevant to the OP. First, I cannot replicate his 
> problem.
>
> In [12]: chainsA = np.arange(10, dtype=np.int64)
>
> In [13]: set(chainsA)
> Out[13]: set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
>
> Second, he seems to be interested in scalar objects, not arrays. The
> scalar objects should all be hashable and comparable out-of-box and
> ready to be used in sets and as dict keys. We will need a complete,
> self-contained example that demonstrates the problem to get any
> further with this.
>
> Third, even if he wanted to use arrays as set elements, he couldn't
> because such objects not only need to have __hash__ defined, they also
> need __eq__ to return a bool. We return boolean arrays that cannot be
> used as a truth value.
>
> Fourth, even if arrays could be compared, you couldn't replace their
> __hash__ method or tell set to use a different function in place of
> the __hash__ method.
>
> Fifth, even if you could tell set to use a different hash function,
> you wouldn't use cryptographic hashes. You would just
> hash(buffer(arr)) for contiguous arrays and hash(arr.tostring()) for
> the rest.
>
> --
> Robert Kern
>
Thanks to everyone for replying. Nice detective work, Robert - indeed
it seems to work with "real" ndarrays -- I have to do some more
homework to get my problem into a shape so that I could demonstrate it
in a "small, self contained form".
Thanks again,

Sebastian