Re: Hashability questions

2012-05-15 Thread Chris Angelico
On Tue, May 15, 2012 at 3:27 PM, Ian Kelly ian.g.ke...@gmail.com wrote:
 Why?  I can't see any purpose in implementing __eq__ this way, but I
 don't see how it's broken (assuming that __hash__ is actually
 implemented somehow and doesn't just raise TypeError).  The
 requirement is that if two objects compare equal, then they must have
 the same hash, and that clearly holds true here.

 Can you give a concrete example that demonstrates how this __eq__
 method is dangerous and broken?

Its brokenness is that hash collisions result in potentially-spurious
equalities. But I can still invent a (somewhat contrived) use for such
a setup:


class Modulo:
    base = 256

    def __init__(self, n):
        self.val = int(n)

    def __str__(self):
        return str(self.val)

    __repr__ = __str__

    def __hash__(self):
        return self.val % self.base

    def __eq__(self, other):
        # Equal whenever the hashes match, i.e. equal modulo 'base'.
        return hash(self) == hash(other)

    def __iadd__(self, other):
        # Accept another Modulo first, then anything int() can convert.
        try:
            self.val += other.val
        except Exception:
            try:
                self.val += int(other)
            except Exception:
                pass
        return self

Two of these numbers will hash and compare equal if they are equal
modulo 'base'. Useful? Probably not. But it's plausibly defined.
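
To make that concrete, here is a small sketch (not part of the original post) that puts a trimmed copy of Modulo into a set. The sample values 1, 257 and 2 are arbitrary choices for illustration; only the class itself comes from the code above.

```python
class Modulo:
    """Trimmed copy of the class above; __iadd__ is omitted for brevity."""
    base = 256

    def __init__(self, n):
        self.val = int(n)

    def __repr__(self):
        return str(self.val)

    def __hash__(self):
        return self.val % self.base

    def __eq__(self, other):
        return hash(self) == hash(other)


a, b, c = Modulo(1), Modulo(257), Modulo(2)

# 1 and 257 are congruent modulo 256, so they hash and compare equal.
same_residue = (a == b)
different_residue = (a == c)

# A set keeps one representative per residue class (the first one added).
residues = {a, b, c}
print(same_residue, different_residue, len(residues))
```

The set ends up with one entry per residue class, which is the "plausibly defined" behaviour: equality here means congruence, and the hash is exactly the residue.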

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Hashability questions

2012-05-15 Thread Christian Heimes
On 15.05.2012 07:27, Ian Kelly wrote:
 Why?  I can't see any purpose in implementing __eq__ this way, but I
 don't see how it's broken (assuming that __hash__ is actually
 implemented somehow and doesn't just raise TypeError).  The
 requirement is that if two objects compare equal, then they must have
 the same hash, and that clearly holds true here.
 
 Can you give a concrete example that demonstrates how this __eq__
 method is dangerous and broken?

Code explains more than words. I've created two examples that
demonstrate some issues.

Mutable values break dicts as you won't be able to retrieve the same
object again:

>>> class Example(object):
...     def __init__(self, value):
...         self.value = value
...     def __hash__(self):
...         return hash(self.value)
...     def __eq__(self, other):
...         if not isinstance(other, Example):
...             return NotImplemented
...         return self.value == other.value
...
>>> ob = Example("egg")
>>> d = {}
>>> d[ob] = True
>>> d["egg"] = True
>>> d["spam"] = True
>>> ob in d
True
>>> d
{'egg': True, <__main__.Example object at 0x7fab66cb7450>: True, 'spam': True}

>>> ob.value = "spam"
>>> ob in d
False
>>> d
{'egg': True, <__main__.Example object at 0x7fab66cb7450>: True, 'spam': True}


When you mess up __eq__ you'll get funny results and suddenly the
insertion order does unexpected things to you:

>>> class Example2(object):
...     def __init__(self, value):
...         self.value = value
...     def __hash__(self):
...         return hash(self.value)
...     def __eq__(self, other):
...         return hash(self) == hash(other)
...
>>> d = {}
>>> ob = Example2("egg")
>>> d[ob] = True
>>> d
{<__main__.Example2 object at 0x7fab66cb7610>: True}
>>> d["egg"] = True
>>> d
{<__main__.Example2 object at 0x7fab66cb7610>: True}
>>> d2 = {}
>>> d2["egg"] = True
>>> d2[ob] = True
>>> d2
{'egg': True}
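
The session above uses keys whose hashes match by construction. A sketch of the genuinely spurious case follows. It leans on a CPython implementation detail (there, hash(-1) and hash(-2) are both -2), so the specific values are an assumption that may not hold on other interpreters; the code checks for the collision rather than taking it for granted.

```python
class Example2(object):
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return hash(self) == hash(other)


x = Example2(-1)
y = Example2(-2)

# Assumption: CPython, where hash(-1) == hash(-2) == -2.
collide = hash(-1) == hash(-2)

# The values differ, yet the objects compare equal whenever the hashes collide.
print(x.value == y.value, x == y, collide)

d = {x: "first"}
d[y] = "second"   # on a collision this overwrites the value stored under x
print(len(d), d[x])
```

When the hashes collide, two objects holding different values are treated as the same key, and the later assignment silently replaces the earlier value.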




Re: Hashability questions

2012-05-15 Thread Bob Grommes
On Monday, May 14, 2012 8:35:36 PM UTC-5, alex23 wrote:
 It looks like this has changed between Python 2 and 3:
 
 If a class does not define an __eq__() method it should not define a
 __hash__() operation either; if it defines __eq__() but not
 __hash__(), its instances will not be usable as items in hashable
 collections.
 
 From: http://docs.python.org/dev/reference/datamodel.html#object.__hash__
 
 You should just be able to add a __hash__ to Utility and it'll be fine.

Thanks, Alex.  I should have mentioned I was using Python 3.  I guess all this 
is a bit over-thought to just crank out some code -- in practice, comparing two 
classes for equality is mostly YAGNI -- but it's my way of coming to a 
reasonably in-depth understanding of how things work ... 


Re: Hashability questions

2012-05-15 Thread Ian Kelly
On Tue, May 15, 2012 at 3:25 AM, Christian Heimes li...@cheimes.de wrote:
 Code explains more than words. I've created two examples that
 demonstrate some issues.

 Mutable values break dicts as you won't be able to retrieve the same
 object again:

Sure, you'll get no argument from me on that.  I was more interested
in the other one.

 When you mess up __eq__ you'll get funny results and suddenly the
 insertion order does unexpected things to you:

 >>> class Example2(object):
 ...     def __init__(self, value):
 ...         self.value = value
 ...     def __hash__(self):
 ...         return hash(self.value)
 ...     def __eq__(self, other):
 ...         return hash(self) == hash(other)
 ...
 >>> d = {}
 >>> ob = Example2("egg")
 >>> d[ob] = True
 >>> d
 {<__main__.Example2 object at 0x7fab66cb7610>: True}
 >>> d["egg"] = True
 >>> d
 {<__main__.Example2 object at 0x7fab66cb7610>: True}
 >>> d2 = {}
 >>> d2["egg"] = True
 >>> d2[ob] = True
 >>> d2
 {'egg': True}

That's just how sets and dicts work with distinct objects that compare
equal.  I don't see any fundamental difference between that and this:

>>> d = {}
>>> d[42] = True
>>> d
{42: True}
>>> d[42.0] = True
>>> d
{42: True}
>>> d2 = {}
>>> d2[42.0] = True
>>> d2
{42.0: True}
>>> d2[42] = True
>>> d2
{42.0: True}

I wouldn't consider the hashing of ints and floats to be broken.


Hashability questions

2012-05-14 Thread Bob Grommes
Noob alert: writing my first Python class library.

I have a straightforward class called Utility that lives in Utility.py.

I'm trying to get a handle on best practices for fleshing out a library.  As 
such, I've done the following for starters:

  def __str__(self):
    return str(type(self))

#  def __eq__(self,other):
#    return hash(self) == hash(other)

The commented-out method is what I'm questioning.  As-is, I can do the 
following from my test harness:

u = Utility()
print(str(u))
print(hash(u))
u2 = Utility()
print(hash(u2))
print(hash(u) == hash(u2))

However if I uncomment the above __eq__() implementation, I get the following 
output:

<class 'Utility.Utility'>
Traceback (most recent call last):
  File "/Users/bob/PycharmProjects/BGC/Tests.py", line 7, in <module>
    print(hash(u))
TypeError: unhashable type: 'Utility'

Process finished with exit code 1

Obviously there is some sort of default implementation of __hash__() at work 
and my implementation of __eq__() has somehow broken it.  Can anyone explain 
what's going on?


Re: Hashability questions

2012-05-14 Thread Chris Kaynor
On Sun, May 13, 2012 at 12:11 PM, Bob Grommes bob.grom...@gmail.com wrote:
 Noob alert: writing my first Python class library.

 I have a straightforward class called Utility that lives in Utility.py.

 I'm trying to get a handle on best practices for fleshing out a library.  As 
 such, I've done the following for starters:

  def __str__(self):
    return str(type(self))

 #  def __eq__(self,other):
 #    return hash(self) == hash(other)

 The commented-out method is what I'm questioning.  As-is, I can do the 
 following from my test harness:

 u = Utility()
 print(str(u))
 print(hash(u))
 u2 = Utility()
 print(hash(u2))
 print(hash(u) == hash(u2))

 However if I uncomment the above __eq__() implementation, I get the following 
 output:

 <class 'Utility.Utility'>
 Traceback (most recent call last):
   File "/Users/bob/PycharmProjects/BGC/Tests.py", line 7, in <module>
     print(hash(u))
 TypeError: unhashable type: 'Utility'

 Process finished with exit code 1

 Obviously there is some sort of default implementation of __hash__() at work 
 and my implementation of __eq__() has somehow broken it.  Can anyone explain 
 what's going on?

In Python, the default __eq__ compares objects by identity, and the
default __hash__ is derived from the id of the object. Thus, an object
by default compares equal only to itself, and it hashes the same every
time.

In Python 3, if you override __eq__ without also defining __hash__,
the default __hash__ is removed (it is set to None); you can define
__hash__ yourself to make the class hashable again. In Python 2, this
automatic removal of __hash__ did not exist, which could lead to
subtle bugs where a class would override __eq__ but leave __hash__ as
the default implementation.
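
A minimal sketch of the Python 3 behaviour described above. The class names OnlyEq and EqAndHash are made up for illustration and do not come from the thread.

```python
class OnlyEq:
    """Defines __eq__ but not __hash__."""
    def __eq__(self, other):
        return isinstance(other, OnlyEq)


class EqAndHash:
    """Defines both, so instances stay hashable."""
    def __eq__(self, other):
        return isinstance(other, EqAndHash)

    def __hash__(self):
        # All instances compare equal, so they must share one hash value.
        return hash(EqAndHash)


# Python 3 sets __hash__ to None when only __eq__ is defined.
print(OnlyEq.__hash__)

try:
    hash(OnlyEq())
except TypeError as exc:
    error = str(exc)
    print(error)

print(len({EqAndHash(), EqAndHash()}))
```

The TypeError raised for OnlyEq is the same kind of error as the one in the original question; defining __hash__ alongside __eq__ makes it go away.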

Generally, the default __eq__ and __hash__ functions provide the
correct results, and are nice and convenient to have. From there, the
case where __eq__ is overridden is the next most common, and if it is
overridden, the default __hash__ is almost never correct, and thus the
object should either not be hashable (the default in Python 3) or
__hash__ should also be overridden to produce the correct results.

The rule is that, if two objects return different results from
__hash__, they should never compare equal. The opposite rule also
holds true: if two objects compare equal, they should return the same
value from __hash__.

See http://docs.python.org/reference/datamodel.html#object.__hash__
and http://docs.python.org/reference/datamodel.html#object.__lt__ for
more information.


Re: Hashability questions

2012-05-14 Thread Chris Rebert
On Sun, May 13, 2012 at 12:11 PM, Bob Grommes bob.grom...@gmail.com wrote:
 Noob alert: writing my first Python class library.

 I have a straightforward class called Utility that lives in Utility.py.

 I'm trying to get a handle on best practices for fleshing out a library.  As 
 such, I've done the following for starters:

  def __str__(self):
    return str(type(self))

 #  def __eq__(self,other):
 #    return hash(self) == hash(other)

 The commented-out method is what I'm questioning.  As-is, I can do the 
 following from my test harness:

 u = Utility()
 print(str(u))
 print(hash(u))
 u2 = Utility()
 print(hash(u2))
 print(hash(u) == hash(u2))

 However if I uncomment the above __eq__() implementation, I get the following 
 output:

 <class 'Utility.Utility'>
 Traceback (most recent call last):
   File "/Users/bob/PycharmProjects/BGC/Tests.py", line 7, in <module>
     print(hash(u))
 TypeError: unhashable type: 'Utility'

 Process finished with exit code 1

 Obviously there is some sort of default implementation of __hash__() at work 
 and my implementation of __eq__() has somehow broken it.  Can anyone explain 
 what's going on?

See the 3rd, 5th, and 4th paragraphs of:
http://docs.python.org/dev/reference/datamodel.html#object.__hash__

Also, for future reference, it's advisable to state whether your
question concerns Python 2.x or Python 3.x.

Cheers,
Chris
--
http://chrisrebert.com


Re: Hashability questions

2012-05-14 Thread Dave Angel
On 05/14/2012 07:38 PM, Chris Kaynor wrote:
 On Sun, May 13, 2012 at 12:11 PM, Bob Grommes bob.grom...@gmail.com wrote:
 SNIP

 The rule is that, if two objects return different results from
 __hash__, they should never compare equal. The opposite rule also
 holds true: if two objects compare equal, they should return the same
 value from __hash__.


You state those as if they were different rules.  Those are
contrapositives of each other, and therefore equivalent.

The inverse would be that two objects with the same __hash__ should
compare equal, and that's clearly not true.
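
A short sketch of the distinction, using a hypothetical Point class with a deliberately constant hash (an assumption chosen so the example behaves the same on any interpreter):

```python
class Point:
    """Hypothetical class with a deliberately poor (constant) hash."""
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x

    def __hash__(self):
        return 0  # legal: equal objects trivially share this hash


p, q, r = Point(1), Point(1), Point(2)

# Required direction: equal objects have equal hashes.
assert p == q and hash(p) == hash(q)

# The other direction is not required: same hash, yet not equal.
assert hash(p) == hash(r) and p != r

# Lookups still work; the constant hash only costs speed.
lookup = {p: "one", r: "two"}
print(lookup[q], len(lookup))
```

Equal hashes only tell a dict where to look; the final decision always comes from __eq__, which is why colliding but unequal keys can coexist.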



-- 

DaveA



Re: Hashability questions

2012-05-14 Thread alex23
On May 14, 5:11 am, Bob Grommes bob.grom...@gmail.com wrote:
 Obviously there is some sort of default implementation of __hash__()
 at work and my implementation of __eq__() has somehow broken it.
 Can anyone explain what's going on?

It looks like this has changed between Python 2 and 3:

If a class does not define an __eq__() method it should not define a
__hash__() operation either; if it defines __eq__() but not
__hash__(), its instances will not be usable as items in hashable
collections.

From: http://docs.python.org/dev/reference/datamodel.html#object.__hash__

You should just be able to add a __hash__ to Utility and it'll be fine.
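
As a rough sketch of what that could look like: the real Utility class is not shown in the thread, so the constructor and the `_name` attribute below are assumptions made purely for illustration.

```python
class Utility:
    """Hypothetical stand-in; the real Utility class is not shown in the thread."""
    def __init__(self, name):
        self._name = name  # assumed attribute, treated as read-only

    def __str__(self):
        return str(type(self))

    def __eq__(self, other):
        if not isinstance(other, Utility):
            return NotImplemented
        return self._name == other._name

    def __hash__(self):
        # Hash the same data that __eq__ compares.
        return hash(self._name)


u = Utility("a")
u2 = Utility("a")
print(hash(u) == hash(u2), u == u2, len({u, u2}))
```

With both methods defined on the same data, hash(u) works again under Python 3 and equal instances collapse to a single set member.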


Re: Hashability questions

2012-05-14 Thread Christian Heimes
On 13.05.2012 21:11, Bob Grommes wrote:
 Noob alert: writing my first Python class library.
 
 I have a straightforward class called Utility that lives in Utility.py.
 
 I'm trying to get a handle on best practices for fleshing out a library.  As 
 such, I've done the following for starters:
 
   def __str__(self):
     return str(type(self))
 
 #  def __eq__(self,other):
 #    return hash(self) == hash(other)
 
 The commented-out method is what I'm questioning.  As-is, I can do the 
 following from my test harness:

By the way, that's a dangerous and broken way to implement __eq__(). You
mustn't rely on hash() in __eq__() if you want to use your object in
sets and dicts. You must implement __hash__ and __eq__ in a way that
takes all relevant attributes into account. The attributes must be read
only, otherwise you are going to break sets and dicts.

Here is a best practice example:

class Example(object):
    def __init__(self, attr1, attr2):
        self._attr1 = attr1
        self._attr2 = attr2

    @property
    def attr1(self):
        return self._attr1

    @property
    def attr2(self):
        return self._attr2

    def __eq__(self, other):
        if not isinstance(other, Example):
            return NotImplemented
        return (self._attr1 == other._attr1
                and self._attr2 == other._attr2)

    def __hash__(self):
        return hash((self._attr1, self._attr2))

    # optional
    def __ne__(self, other):
        if not isinstance(other, Example):
            return NotImplemented
        return not self == other
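
For readers on newer interpreters, a similar result can be sketched with a frozen dataclass. This is a swapped-in technique rather than what the post recommends: the dataclasses module only arrived in Python 3.7, well after this thread, and the field annotations below are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    attr1: object
    attr2: object


a = Example(1, "x")
b = Example(1, "x")

# Generated __eq__ and __hash__ both use (attr1, attr2).
print(a == b, hash(a) == hash(b), len({a, b}))

try:
    a.attr1 = 2  # frozen instances reject attribute assignment
except AttributeError as exc:  # FrozenInstanceError subclasses AttributeError
    frozen_error = type(exc).__name__
    print(frozen_error)
```

The frozen flag plays the role of the read-only properties in the hand-written version: the data that feeds __hash__ cannot change after construction through ordinary assignment.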








Re: Hashability questions

2012-05-14 Thread Ian Kelly
On Mon, May 14, 2012 at 7:50 PM, Christian Heimes li...@cheimes.de wrote:
 On 13.05.2012 21:11, Bob Grommes wrote:
 Noob alert: writing my first Python class library.

 I have a straightforward class called Utility that lives in Utility.py.

 I'm trying to get a handle on best practices for fleshing out a library.  As 
 such, I've done the following for starters:

   def __str__(self):
     return str(type(self))

 #  def __eq__(self,other):
 #    return hash(self) == hash(other)

 The commented-out method is what I'm questioning.  As-is, I can do the 
 following from my test harness:

 By the way, that's a dangerous and broken way to implement __eq__(). You
 mustn't rely on hash() in __eq__() if you want to use your object in
 sets and dicts. You must implement __hash__ and __eq__ in a way that
 takes all relevant attributes into account. The attributes must be read
 only, otherwise you are going to break sets and dicts.

Why?  I can't see any purpose in implementing __eq__ this way, but I
don't see how it's broken (assuming that __hash__ is actually
implemented somehow and doesn't just raise TypeError).  The
requirement is that if two objects compare equal, then they must have
the same hash, and that clearly holds true here.

Can you give a concrete example that demonstrates how this __eq__
method is dangerous and broken?

Cheers,
Ian