[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-29 Thread 2QdxY4RzWzUUiLuE
On 2020-07-29 at 14:26:25 +0900,
"Stephen J. Turnbull"  wrote:

> 2qdxy4rzwzuui...@potatochowder.com writes:
> 
>  > in order to foil suck attacks.
> 
> Typo of the Year candidate!  (It was a typo, right?)

Call it a Freudian slip of the fingers.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GS6TDRETAPG2WRLMNDQFZORSYC2OID2A/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-28 Thread Stephen J. Turnbull
2qdxy4rzwzuui...@potatochowder.com writes:

 > in order to foil suck attacks.

Typo of the Year candidate!  (It was a typo, right?)
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/DIMKMR5EKYUJFFGBKVCZY7VGPID2FGPU/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-28 Thread 2QdxY4RzWzUUiLuE
On 2020-07-28 at 15:58:58 -0700,
Christopher Barker wrote:

> But a dict always has a LOT fewer buckets than possible hash values,
> so clashes within a bucket are not so rare, so equality needs to be
> checked always -- which is what I was missing.

> And while having a bunch of non-equal objects produce the same hash
> wouldn't break anything, it would break the O(1) performance of
> dicts.

> Have I got that right?

Yes.

Breaking O(1) performance was actually the root of possible Denial of
Service attacks:  if an attacker knows the algorithms, that attacker
could specifically create keys (e.g., user names) whose hash values are
the same, and then searching a dict degenerates to O(N), and then your
server falls to its knees.  At some point, Python added some
randomization to the way dictionaries work in order to foil suck
attacks.
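
A minimal sketch of that degeneration, assuming a hypothetical
`CollidingKey` class (not from any real code base) whose instances all
share one hash value.  It counts `__eq__` calls so you can see how many
comparisons a single failed lookup costs.  Real attacks targeted string
hashes; CPython has randomized str/bytes hashing per process by default
since 3.3 (see PYTHONHASHSEED).

```python
class CollidingKey:
    """Hypothetical key type: every instance hashes to the same value."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # all keys collide on purpose

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.name == other.name


def comparisons_for_lookup(n):
    """Build a dict of n colliding keys, then count __eq__ calls for one miss."""
    table = {CollidingKey(f"user{i}"): i for i in range(n)}
    CollidingKey.eq_calls = 0
    _ = CollidingKey("not-present") in table
    return CollidingKey.eq_calls
```

With well-spread hashes a miss needs close to zero comparisons; here
the count grows with the number of stored keys, which is the O(N)
behaviour described above.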
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/TJCGVDGZWP4LXG44P4Z34BPMZRI3ODY3/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-28 Thread Christopher Barker
On Mon, Jul 27, 2020 at 5:42 PM Ethan Furman wrote:
> Chris Barker wrote:

> >  Is this because it's possible, if very
> > unlikely, for ANY hash algorithm to create the same hash for two
> > different inputs? So equality always has to be checked anyway?
>
> snip

> For example, if a hash algorithm decided to use short names, then a
> group of people might be sorted like this:
>
> Bob: Bob, Robert
> Chris: Christopher, Christine, Christian, Christina
> Ed: Edmund, Edward, Edwin, Edwina
>
> So if somebody draws a name from a hat:
>
>Christina
>
> You apply the hash to it:
>
>Chris
>
> Ignore the Bob and Ed buckets, then use equality checks on the Chris
> names to find the right one.
>

Sure, but I know (or assume anyway) that Python dicts and sets don't use
such a simple, naive hash algorithm, so in fact, non-equal strings are very
unlikely to have the same hash:

In [42]: hash("Christina")
Out[42]: -8424898463413304204

In [43]: hash("Christopher")
Out[43]: 4404166401429815751

In [44]: hash("Christian")
Out[44]: 1032502133450913307

But a dict always has a LOT fewer buckets than possible hash values, so
clashes within a bucket are not so rare, so equality needs to be checked
always -- which is what I was missing.

And while having a bunch of non-equal objects produce the same hash
wouldn't break anything, it would break the O(1) performance of dicts.
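
To illustrate the "fewer buckets than hash values" point, here is a
simplified sketch.  `bucket_index` is a hypothetical helper that keeps
only the low bits of the hash; real CPython dicts use open addressing
and also mix in the higher bits while probing, which is left out here.

```python
def bucket_index(obj, n_buckets=8):
    """Simplified slot choice: keep only the low bits of the hash.

    n_buckets must be a power of two. Real CPython dicts also perturb
    the probe sequence with the higher bits; that is omitted here.
    """
    return hash(obj) & (n_buckets - 1)


# Small ints hash to themselves in CPython, so the result is deterministic.
keys = [1, 9, 17, 2]
slots = {k: bucket_index(k) for k in keys}
# 1, 9 and 17 have different hashes but share slot 1 in an 8-slot table.
```

So even with perfectly distinct hashes, several keys can land in the
same slot, and only an equality check can tell them apart.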

Have I got that right?

-CHB


-- 
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YJNU4QPGWJUFYIL33PSI2NPYAC2BRLI5/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-27 Thread Ethan Furman

On 7/27/20 5:00 PM, Christopher Barker wrote:

> I guess this is the part I find confusing:
>
> when (and why) does __eq__ play a role?

__eq__ is the final authority on whether two objects are equal.  The
default __eq__ punts and uses identity.

> On Mon, Jul 27, 2020 at 12:01 PM Ethan Furman wrote:
>
>> However, not all objects with equal hashes compare equal themselves.
>
> That's the one I find confusing -- why is it not "bad" for two objects
> with the same hash (the 42 example above) to not be equal? That seems
> like it would be very dangerous. Is this because it's possible, if very
> unlikely, for ANY hash algorithm to create the same hash for two
> different inputs? So equality always has to be checked anyway?

Well, there are a finite number of integers to be used as hashes, and
potentially many more than that number of objects needing to be hashed.
So, yes, hashes can (and will) be shared, and equality must be checked also.


For example, if a hash algorithm decided to use short names, then a 
group of people might be sorted like this:


Bob: Bob, Robert
Chris: Christopher, Christine, Christian, Christina
Ed: Edmund, Edward, Edwin, Edwina

So if somebody draws a name from a hat:

  Christina

You apply the hash to it:

  Chris

Ignore the Bob and Ed buckets, then use equality checks on the Chris 
names to find the right one.
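
The name-drawing example can be sketched as a toy hash table.
Everything here (`SHORT_NAMES`, `toy_hash`, `build_buckets`, `find`) is
hypothetical and only mirrors the example; it is not how CPython stores
dict entries.

```python
# Toy "hash": map each full name to its short form (data taken from the
# example above). A plain dict is used here only as a lookup table.
SHORT_NAMES = {
    "Bob": "Bob", "Robert": "Bob",
    "Christopher": "Chris", "Christine": "Chris",
    "Christian": "Chris", "Christina": "Chris",
    "Edmund": "Ed", "Edward": "Ed", "Edwin": "Ed", "Edwina": "Ed",
}


def toy_hash(name):
    return SHORT_NAMES[name]


def build_buckets(names):
    """Group names into buckets keyed by their toy hash."""
    buckets = {}
    for name in names:
        buckets.setdefault(toy_hash(name), []).append(name)
    return buckets


def find(buckets, name):
    """Pick the bucket by hash, then use == inside that bucket only."""
    candidates = buckets.get(toy_hash(name), [])
    checked = 0
    for candidate in candidates:
        checked += 1
        if candidate == name:
            return candidate, checked
    return None, checked


buckets = build_buckets(SHORT_NAMES)
match, comparisons = find(buckets, "Christina")
```

The Bob and Ed buckets are never touched; only the names in the Chris
bucket are compared with `==`.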



>> From a practical standpoint, think of dictionaries:
>
> (that's the trick here -- you can't "get" this without knowing something
> about the implementation details of dicts.)

Depends on the person -- I always do better with a concrete application.

>> adding
>> --
>> - objects are sorted into buckets based on their hash
>> - any one bucket can have several items with equal hashes
>
> is this mostly because there are many more possible hashes than buckets?

Yes.

>> - those several items (obviously) will not compare equal
>
> So the hash is a fast way to put stuff in buckets, so you only need to
> compare with the others that end up in the same bucket?

Yes.

>> retrieving
>> --
>> - get the hash of the object
>> - find the bucket that would hold that hash
>> - find the already stored objects with the same hash
>> - use __eq__ on each one to find the match
>
> So here's my question: if there is only one object in that bucket, is
> __eq__ checked anyway?

Yes -- just because it has the same hash does not mean it's equal.

> So what happens when there is no __eq__? The object can still be hashable
> -- I guess that's because there IS an __eq__ -- it defaults to an id
> check, yes?

Yes.

The default hash, I believe, also defaults to the object id -- so, by 
default, objects are hashable and compare equal only to themselves.
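
A quick sketch of that default behaviour, using a hypothetical class
`Plain` that defines neither method.  (In CPython the default hash is
derived from the object's address rather than being literally
`id(obj)`, but the effect is the same: it is identity-based.)

```python
class Plain:
    """No __eq__ or __hash__ defined: both come from object."""


a = Plain()
b = Plain()

same = (a == a)        # identity-based equality
different = (a == b)   # two distinct instances are never equal by default

lookup = {a: "first"}
found_a = a in lookup  # the very same object is found
found_b = b in lookup  # another instance is not
```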


--
~Ethan~
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/XPUXOSK7WHXV7LRB7H3I4S42JQ2WXQU3/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-27 Thread Christopher Barker
I guess this is the part I find confusing:

when (and why) does __eq__ play a role?

On Mon, Jul 27, 2020 at 12:01 PM Ethan Furman  wrote:

> Equal objects must have equal hashes.
> Objects that compare equal must have hashes that compare equal.
>

OK got it.

> However, not all objects with equal hashes compare equal themselves.
>

That's the one I find confusing -- why is it not "bad" for two objects with
the same hash (the 42 example above) to not be equal? That seems like it
would be very dangerous. Is this because it's possible, if very unlikely,
for ANY hash algorithm to create the same hash for two different inputs? So
equality always has to be checked anyway?


>  From a practical standpoint, think of dictionaries:
>
(that's the trick here -- you can't "get" this without knowing something
about the implementation details of dicts.)


> adding
> --
> - objects are sorted into buckets based on their hash
> - any one bucket can have several items with equal hashes
>

is this mostly because there are many more possible hashes than buckets?

> - those several items (obviously) will not compare equal
>

So the hash is a fast way to put stuff in buckets, so you only need to
compare with the others that end up in the same bucket?

> retrieving
> --
> - get the hash of the object
> - find the bucket that would hold that hash
> - find the already stored objects with the same hash
> - use __eq__ on each one to find the match
>

So here's my question: if there is only one object in that bucket, is
__eq__ checked anyway?

If so, then yes, I can see why it's not dangerous (if potentially slow) to
have a bunch of unequal objects with the same hash.


> So, if an object's hash changes, then it will no longer be findable in
> any hash table (dict, set, etc.).
>

That part, I think I got.

So what happens when there is no __eq__? The object can still be hashable --
I guess that's because there IS an __eq__ -- it defaults to an id check,
yes?

-CHB


Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/JEDX5QYK2EZNXEBOP24PIOHQYA5FEVPD/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-27 Thread Ethan Furman

On 7/27/20 11:15 AM, Christopher Barker wrote:

> On Sun, Jul 26, 2020 at 8:25 PM Guido van Rossum wrote:
>
>> In fact, defining `__hash__` as returning the constant `42` is
>> better, because it is fine if two objects that *don't* compare equal
>> still have the same hash value (but not the other way around).
>
> Really? Can anyone recommend something to read so I can "get" this --
> it's counterintuitive to me. Is __eq__ always checked?!? I recently was
> faced with dealing with this issue in updating some old code, and I'm
> still a bit confused about the relationship between __hash__ and __eq__,
> and the main Python docs did not clarify it for me.


Equal objects must have equal hashes.
Objects that compare equal must have hashes that compare equal.

However, not all objects with equal hashes compare equal themselves.

From a practical standpoint, think of dictionaries:

adding
--
- objects are sorted into buckets based on their hash
- any one bucket can have several items with equal hashes
- those several items (obviously) will not compare equal

retrieving
--
- get the hash of the object
- find the bucket that would hold that hash
- find the already stored objects with the same hash
- use __eq__ on each one to find the match

So, if an object's hash changes, then it will no longer be findable in 
any hash table (dict, set, etc.).
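
A small sketch of that last point, assuming a hypothetical `MutableKey`
class whose hash depends on a mutable attribute.  The "not found"
result below relies on CPython's behaviour for small int hashes in a
small dict.

```python
class MutableKey:
    """Hypothetical key whose hash depends on a mutable attribute."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = MutableKey(1)
table = {key: "payload"}
found_before = key in table

key.value = 2            # the hash changes while the key sits in the dict
found_after = key in table
still_stored = any(k is key for k in table)  # iteration bypasses hashing
```

The entry is still physically in the dict, but a lookup now starts from
the wrong place and never reaches it.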


--
~Ethan~
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/T3NF5RRX46XVKMGRWKCR23OADNA7APFJ/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-27 Thread Christopher Barker
On Sun, Jul 26, 2020 at 8:25 PM Guido van Rossum  wrote:

> The only reason I can think of why you are so resistant to this would be
> due to poor development practices, e.g. adding tests long after the "main"
> code has already been deployed, or having a separate team write tests.
>

and even then, maybe monkey-patch an __eq__ in for your tests?

For my part, I have for sure defined __eq__ for no other reason than tests
-- but I'm still glad I did.

Though perhaps the idea (sorry, not sure who to credit) of providing a
utility for object equality in the stdlib would be nice to have, so that in
the common case it would be simple to write a "standard" __eq__.

(note on that -- make sure it handles properties "properly" -- if that's
possible)

> In fact, defining `__hash__` as returning the constant `42` is better,
> because it is fine if two objects that *don't* compare equal still have the
> same hash value (but not the other way around).
>

Really? Can anyone recommend something to read so I can "get" this -- it's
counterintuitive to me. Is __eq__ always checked?!? I recently was faced
with dealing with this issue in updating some old code, and I'm still a bit
confused about the relationship between __hash__ and __eq__, and the main
Python docs did not clarify it for me.


> Finally, dataclasses get you all this for free, and they are the future.
>

That is a great point -- I've learned that the really nice thing about
dataclasses is that they keep a separate structure of all the attributes
that matter, and some metadata about them -- type, etc. This is really
useful, and better (or at least more stable) than simply relying on
__dict__ and friends.

I'm thinking that a "dataclasstools" package that builds on dataclasses
would be really nice -- clearly something to start on PyPI, but as a
unified effort, we could get something cleaner than everyone building their
own little bit on their own.

-CHB


Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/IAY2T2VKATX3HYJHAMUB3B3SXHUY3JIF/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Steven D'Aprano
On Sun, Jul 26, 2020 at 08:22:47PM -0700, Guido van Rossum wrote:

> Regarding `__hash__`, it is a very bad idea to call `super().__hash__()`!

Today I learned. Thank you.

-- 
Steven
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LSXQMDWABS7HLRIOSVNUTDUN3EO6Y3KO/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Guido van Rossum
I am really surprised at the resistance against defining `__eq__` on the
target objects. Every time this problem has cropped up in code I was
working on (including code part of very large corporate code bases) the
obvious solution was to define `__eq__`. The only reason I can think of why
you are so resistant to this would be due to poor development practices,
e.g. adding tests long after the "main" code has already been deployed, or
having a separate team write tests.

Regarding `__hash__`, it is a very bad idea to call `super().__hash__()`!
Unless your `__eq__` also just calls `super().__eq__(other)` (and what
would be the point of that?), defining `__hash__` that way will cause
irreproducible behavior where *sometimes* an object that is equal to a dict
key will not be found in the dict even though it is already present,
because the two objects have different hash values. Defining `__hash__` as
`id(self)` is no better. In fact, defining `__hash__` as returning the
constant `42` is better, because it is fine if two objects that *don't*
compare equal still have the same hash value (but not the other way around).
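
A minimal sketch of the failure mode, using two hypothetical classes.
`Broken` pairs a value-based `__eq__` with the identity-based hash from
`object`; `ConstantHash` swaps in a constant hash instead.

```python
class Broken:
    """__eq__ compares values, but __hash__ is identity-based."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Broken) and self.x == other.x

    def __hash__(self):
        return super().__hash__()  # the pattern being warned against


class ConstantHash(Broken):
    def __hash__(self):
        return 42  # slow in big dicts, but consistent with __eq__


stored = Broken(1)
probe = Broken(1)
table = {stored: "value"}

equal = (stored == probe)   # the objects compare equal...
found = probe in table      # ...yet the equal object is not found

found_constant = ConstantHash(1) in {ConstantHash(1): "value"}
```

With the constant hash every lookup falls back to `__eq__`, so the
result is correct even though it no longer scales well.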

The right way to define `__hash__` is to construct a tuple of all the
attributes that are considered by `__eq__` and return the `hash()` of that
tuple. (In some cases you can make it faster by leaving some expensive
attribute out of the tuple -- again, that's fine, but don't consider
anything that's not used by `__eq__`.)
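
As a sketch of that recipe, here is a hypothetical `Point` class whose
`__hash__` is built from exactly the attributes that `__eq__` compares.
The `label` attribute is made up to show a field that takes part in
neither.

```python
class Point:
    """Hypothetical example class; __hash__ mirrors the fields __eq__ uses."""

    def __init__(self, x, y, label=""):
        self.x = x
        self.y = y
        self.label = label  # not part of equality, so not part of the hash

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = Point(1, 2, label="a")
q = Point(1, 2, label="b")
table = {p: "stored"}
```

Because `p` and `q` compare equal and hash alike, `table[q]` finds the
entry stored under `p`.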

Finally, dataclasses get you all this for free, and they are the future.
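
For comparison, a minimal dataclass sketch; the class name `Coordinate`
is made up for illustration.  `frozen=True` is used so that the
generated class is hashable as well as comparable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    x: int
    y: int


p = Coordinate(1, 2)
q = Coordinate(1, 2)
table = {p: "stored"}
```

Note that a plain `@dataclass` (with the default `eq=True` and
`frozen=False`) generates `__eq__` but sets `__hash__` to `None`, so
instances compare by value but cannot be used as dict keys.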


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Henry Lin
@Steven D'Aprano All good ideas ☺ I'm in agreement
that we should be building solutions which are generalizable.

Are there more concerns people would like to bring up when considering the
problem of object equality?

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/U6RQ6GA7RXYHTEORTD3VBZ4KMJMXMB5O/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Steven D'Aprano
On Sun, Jul 26, 2020 at 11:12:39PM +0200, Alex Hall wrote:

> There's another reason people might find this useful - if the objects have
> differing attributes, the assertion can show exactly which ones, instead of
> just saying that the objects are not equal. 

That's a good point.

I sat down to start an implementation, when a fundamental issue with 
this came to mind. This proposed comparison is effectively something 
close to:

    vars(actual) == vars(expected)

only recursively and with provision for objects with `__slots__` and/or
no `__dict__`. And that observation led me to the insight that as tests
go, this is a risky, unreliable test.

A built-in example:


    actual = lambda: 1  # simulate some complex object
    expected = lambda: 2  # another complex object
    vars(actual) == vars(expected)  # returns True


So this is a comparison that needs to be used with care. It is easy for 
the test to pass while the objects are nevertheless not what you expect.

Having said that, another perspective is that unittest already has a 
smart test for comparing dicts, assertDictEqual, which is automatically 
called by assertEqual.

https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertDictEqual

So it may be sufficient to have a utility function that copies 
an instance's slots and dict into a dict, and then compare dicts. Here's 
a sketch:

    d1 = vars(actual).copy()
    d1.update({key: getattr(actual, key) for key in actual.__slots__})
    # Likewise for d2 from expected
    self.assertEqual(d1, d2)

Make that handle the corner cases where objects have no instance dict or 
slots, and we're done.
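
One possible way to cover those corner cases, as a sketch only.
`attrs_as_dict` is a hypothetical helper, not an existing stdlib
function; it walks the MRO to pick up inherited slots and skips slots
that were never assigned.  Name-mangled (double-underscore) slots are
not handled.

```python
def attrs_as_dict(obj):
    """Collect instance attributes from __dict__ and any __slots__.

    Hypothetical helper, not an existing stdlib function. Unset slots
    are skipped rather than raising AttributeError.
    """
    result = dict(getattr(obj, "__dict__", {}))
    for cls in type(obj).__mro__:
        slots = cls.__dict__.get("__slots__", ())
        if isinstance(slots, str):  # a bare string names a single slot
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            if hasattr(obj, name):
                result[name] = getattr(obj, name)
    return result


class Slotted:
    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b


class Plain:
    def __init__(self, a):
        self.a = a
```

Passing two such dicts to `assertEqual` would then report which keys
differ.  The caveat from the lambda example still applies: objects with
no instance attributes at all compare as equal.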

Thinking aloud here, I see this as a kind of copy operation, and
think this would be useful outside of testing. I've written code to copy
attributes from instances on multiple occasions. So how about a new
function in the `copy` module to do so:

    copy.getattrs(obj, deep=False)

that returns a dict. Then the desired comparison could be a thin 
wrapper:

    def assertEqualAttrs(self, actual, expected, msg=None):
        self.assertEqual(getattrs(actual), getattrs(expected), msg)


I'm not keen on a specialist test function, but I'm warming to the idea 
of exposing this functionality in a more general, and hence more useful, 
form.


-- 
Steven
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/MLRFS6RO7WF2UAEOS4YMH2FXRQHJUGWU/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Henry Lin
@Steven D'Aprano

> > - Developers might not want to leak the `__eq__` function to other
> >   developers; I wouldn't want to invade the implementation of my class
> >   just for testing.
>
> That seems odd to me. You are *literally* comparing two instances for
> equality, just calling it something different from `==`. Why would you
> not be happy to expose it?


My thinking is that by default, the `==` operator checks whether two
references point to the same object. So implementing `__eq__` changes
existing behavior and could be a breaking change for developers. By the
consensus here, people do tend to implement `__eq__` anyway, so maybe this
point is minor.

I do appreciate the suggestion of adding this feature into functools
though.

Let's assume we commit to doing something like this. Thinking how this
feature can be extended, let's suppose for testing purposes, I want to
highlight which attributes of two objects are mismatching. Would we have to
implement something different to find the delta between two objects, or
could components of the functools solution be reused? (Would we want a
feature like this to exist in the standard library?)
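
To make the question concrete, here is a rough sketch of what such a
delta could look like.  `attr_delta` and `Config` are hypothetical names
of my own, built on `vars()`; objects without a `__dict__` are not
supported in this version.

```python
def attr_delta(actual, expected):
    """Return {name: (actual_value, expected_value)} for differing attributes.

    Hypothetical helper built on vars(); objects without __dict__ are not
    supported in this sketch.
    """
    missing = object()  # sentinel for "attribute absent on one side"
    left, right = vars(actual), vars(expected)
    delta = {}
    for name in sorted(left.keys() | right.keys()):
        a = left.get(name, missing)
        b = right.get(name, missing)
        if a is missing or b is missing or a != b:
            delta[name] = (
                "<missing>" if a is missing else a,
                "<missing>" if b is missing else b,
            )
    return delta


class Config:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
```

An empty result means the attributes match, so an equality check could
in principle be layered on top of the same function.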

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/7MC27A465I5ZB6PV4S64C2XTAVNLTKIF/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Steven D'Aprano
On Sun, Jul 26, 2020 at 07:47:39PM +0200, Marco Sulla wrote:
> On Sun, 26 Jul 2020 at 19:33, Henry Lin wrote:
>
> > - Any class implementing the `__eq__` operator is no longer hashable
>
> You can use:
>
>     def __hash__(self):
>         return id(self)

Don't do that. It's a horrible hash function.

The `object` superclass already knows how to do a good, reliable hash 
function. Use it.

    def __hash__(self):
        return super().__hash__()


-- 
Steven
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/7N3GMYKC46MRA4DUNS2C5R2CA4CJGMOG/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Steven D'Aprano
On Sun, Jul 26, 2020 at 12:31:17PM -0500, Henry Lin wrote:
> Hi Steven,
> 
> You're right, declaring `__eq__` for the class we want to compare would
> solve this issue. However, we have the tradeoff that
> 
>- All classes need to implement the `__eq__` method to compare two
>instances;

One argument in favour of a standard solution would be to avoid 
duplicated implementations. Perhaps we should add something, not as a 
unittest method, but in functools:

def compare(a, b):
    if a is b:
        return True
    # Simplified version.
    return vars(a) == vars(b)

The actual implementation would be more complex, of course. Then classes 
could optionally implement equality:


def __eq__(self, other):
    if isinstance(other, type(self)):
        return functools.compare(self, other)
    return NotImplemented

or if you prefer, you could call the function directly in your unit 
tests:

self.assertTrue(functools.compare(actual, other))
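
A minimal runnable sketch of the idea above, for illustration only. There is no `functools.compare` in the standard library; the module-level `compare` function and the `Point` class below are hypothetical stand-ins, and the sketch assumes both objects keep their state in an instance `__dict__` (so no `__slots__`).

```python
def compare(a, b):
    """Hypothetical helper: shallow attribute-by-attribute comparison."""
    if a is b:
        return True
    # vars() raises TypeError for objects without a __dict__,
    # so this only handles ordinary instances.
    return vars(a) == vars(b)


class Point:
    """Hypothetical example class that opts in to attribute equality."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if isinstance(other, type(self)):
            return compare(self, other)
        # Let Python try the reflected comparison, then fall back to identity.
        return NotImplemented


print(Point(1, 2) == Point(1, 2))    # same attributes
print(Point(1, 2) == Point(1, 3))    # one attribute differs
print(Point(1, 2) == "not a point")  # unrelated type
```

Because the comparison is shallow, nested objects are compared with their own `==`, so they would need to opt in as well.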



>- Any class implementing the `__eq__` operator is no longer hashable

Easy enough to add back in:

def __hash__(self):
    return super().__hash__()
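
To make the effect concrete, here is a small sketch (both class names are made up for the example). Python sets `__hash__` to `None` on a class that defines `__eq__` without defining `__hash__`; defining it explicitly brings the identity-based hash from `object` back. Note that this restored hash is not tied to the value-based `__eq__`, so two equal instances will normally hash differently.

```python
class Unhashable:
    """Defines __eq__ only, so Python sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Unhashable):
            return self.value == other.value
        return NotImplemented


class Rehashable:
    """Defines __eq__ and explicitly restores the default hash."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Rehashable):
            return self.value == other.value
        return NotImplemented

    def __hash__(self):
        # Delegates to object.__hash__, which is identity-based.
        return super().__hash__()


print(Unhashable.__hash__ is None)

try:
    hash(Unhashable(1))
except TypeError as exc:
    print("TypeError:", exc)

r = Rehashable(1)
print(hash(r) == object.__hash__(r))
```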


>- Developers might not want to leak the `__eq__` function to other
>developers; I wouldn't want to invade the implementation of my class just
>for testing.

That seems odd to me. You are *literally* comparing two instances for 
equality, just calling it something different from `==`. Why would you 
not be happy to expose it?


> In terms of the "popularity" of this potential feature, from what I
> understand (and through my own development), there are testing libraries
> built with this feature. For example, testfixtures.compare can compare
> two objects recursively, and I am using it in my development for this
> purpose.

That's a good example of what we should *not* do, and why trying to 
create a single standard solution for every imaginable scenario can only 
end up with an over-engineered, complex, complicated, confusing API:

testfixtures.compare(
    x, y,
    prefix=None,
    suffix=None,
    raises=True,
    recursive=True,
    strict=False,
    comparers=None,
    **kw)

Not shown in the function signature are additional keyword arguments:

actual, expected # alternative spelling for x, y
x_label,
y_label,
ignore_eq

That is literally thirteen optional parameters, plus arbitrary keyword 
parameters, for something that just compares two objects.

But a simple comparison function, possibly in functools, that simply 
compares attributes, might be worthwhile.



-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/56GD3DDO2PLNOB4TIIO7PBJCUPFLGA3V/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Henry Lin
+1 to Alex Hall.

In general I think there are a lot of questions regarding whether using the
__eq__ operator is sufficient. It seems from people's feedback that it will
essentially get the job done, but like Alex says, if we want to understand
which field is leading to a test breaking, we wouldn't have the ability to
easily check.

On Sun, Jul 26, 2020 at 4:13 PM Alex Hall  wrote:

> On Sun, Jul 26, 2020 at 11:01 PM Ethan Furman  wrote:
>
>> On 7/26/20 10:31 AM, Henry Lin wrote:
>>
>> > You're right, declaring `__eq__` for the class we want to compare would
>> > solve this issue. However, we have the tradeoff that
>> >
>> >   * All classes need to implement the `__eq__` method to compare two
>> > instances;
>>
>> I usually implement __eq__ sooner or later anyway -- even if just for
>> testing.
>>
>> >   * Any class implementing the `__eq__` operator is no longer hashable
>>
>> One just needs to define a __hash__ method that behaves properly.
>>
>
> This is quite a significant change in behaviour which may break
> compatibility. Equality and hashing based only on identity can be quite a
> useful property which I often rely on.
>
> There's another reason people might find this useful - if the objects have
> differing attributes, the assertion can show exactly which ones, instead of
> just saying that the objects are not equal. Even if all the involved
> classes implement a matching repr, which is yet more work, the reprs will
> likely be on a single line and the diff will be difficult to read.
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/YFGS3DVFNLZ34BPCPTRBXC7H7WLCZ5FK/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Alex Hall
On Sun, Jul 26, 2020 at 11:01 PM Ethan Furman  wrote:

> On 7/26/20 10:31 AM, Henry Lin wrote:
>
> > You're right, declaring `__eq__` for the class we want to compare would
> > solve this issue. However, we have the tradeoff that
> >
> >   * All classes need to implement the `__eq__` method to compare two
> > instances;
>
> I usually implement __eq__ sooner or later anyway -- even if just for
> testing.
>
> >   * Any class implementing the `__eq__` operator is no longer hashable
>
> One just needs to define a __hash__ method that behaves properly.
>

This is quite a significant change in behaviour which may break
compatibility. Equality and hashing based only on identity can be quite a
useful property which I often rely on.

There's another reason people might find this useful - if the objects have
differing attributes, the assertion can show exactly which ones, instead of
just saying that the objects are not equal. Even if all the involved
classes implement a matching repr, which is yet more work, the reprs will
likely be on a single line and the diff will be difficult to read.
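
One way the per-attribute report described above could look, as a rough sketch. `attribute_diff` and `Order` are hypothetical names invented for this example, not part of unittest or any library, and the comparison is shallow (it looks only at each instance's `__dict__`).

```python
def attribute_diff(a, b):
    """Return {name: (value_in_a, value_in_b)} for attributes that differ.

    Hypothetical helper; shallow comparison of instance __dict__ only.
    """
    missing = object()  # sentinel for attributes present on one side only
    attrs_a, attrs_b = vars(a), vars(b)
    diff = {}
    for name in sorted(attrs_a.keys() | attrs_b.keys()):
        left = attrs_a.get(name, missing)
        right = attrs_b.get(name, missing)
        if left is missing or right is missing or left != right:
            diff[name] = (
                "<missing>" if left is missing else left,
                "<missing>" if right is missing else right,
            )
    return diff


class Order:
    """Hypothetical class with no __eq__, so == is identity-based."""

    def __init__(self, item, quantity, price):
        self.item = item
        self.quantity = quantity
        self.price = price


actual = Order("spam", 2, 3.5)
expected = Order("spam", 3, 3.5)

# One line per differing attribute instead of a single "not equal".
for name, (got, want) in attribute_diff(actual, expected).items():
    print(f"{name}: {got!r} != {want!r}")
```

The classes under test keep their identity-based equality and hashing here, since the helper never touches `__eq__`.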
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/ZJYGN42PCO4J73AAM7ZZSVTOFHPBADWT/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Ethan Furman

On 7/26/20 10:31 AM, Henry Lin wrote:

> You're right, declaring `__eq__` for the class we want to compare would
> solve this issue. However, we have the tradeoff that
>
>   * All classes need to implement the `__eq__` method to compare two
>     instances;

I usually implement __eq__ sooner or later anyway -- even if just for
testing.

>   * Any class implementing the `__eq__` operator is no longer hashable

One just needs to define a __hash__ method that behaves properly.

>   * Developers might not want to leak the `__eq__` function to other
>     developers; I wouldn't want to invade the implementation of my class
>     just for testing.

And yet that's exactly what you are proposing with your object compare.
If two objects are, in fact, equal, why is it bad for == to say so?


--
~Ethan~
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/XFX6SXJ3QULMW3N5GZFA7CGDVXWF3KQM/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Richard Damon
On 7/26/20 4:09 PM, Marco Sulla wrote:
> You're quite right, but if you don't implement __eq__, the hash of an
> object is simply a random integer (I suppose generated from the
> address of the object).
>
> Alternatively, if you want a quick hash, you can use hash(str(obj))
> (if you implemented __str__ or __repr__).
>
And if you don't implement __eq__, I thought the default equality was
identity, i.e. the same id() (which is what the default hash is based on
too).

The idea was (I thought) that if you implement an __eq__, so that two
different objects could compare equal, then you need to come up with
some hash function for that object that matches that equality function,
or the object is considered unhashable.
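
A short sketch of a hash that matches the equality function, in the sense described above. The `Version` class and its `_key` helper are hypothetical; the common pattern is to build both `__eq__` and `__hash__` from the same tuple of fields, and it assumes those fields are not mutated while the object is used as a dict key or set member.

```python
class Version:
    """Hypothetical value class: equality and hash share one key."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # The single source of truth for both __eq__ and __hash__.
        return (self.major, self.minor)

    def __eq__(self, other):
        if isinstance(other, Version):
            return self._key() == other._key()
        return NotImplemented

    def __hash__(self):
        return hash(self._key())


a, b = Version(3, 8), Version(3, 8)
print(a == b, hash(a) == hash(b))

# Equal objects hash equally, so either one finds the entry.
releases = {a: "first"}
print(releases[b])
```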

-- 
Richard Damon
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/NWVR4JJPJYJLJDVINNYTZSR4AAPKNHXK/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Marco Sulla
You're quite right, but if you don't implement __eq__, the hash of an
object is simply a random integer (I suppose generated from the address of
the object).

Alternatively, if you want a quick hash, you can use hash(str(obj)) (if you
implemented __str__ or __repr__).
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/QDH4JUM547J6AGEEQ6XXM5RVLKAXC5JE/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Richard Damon
On 7/26/20 1:47 PM, Marco Sulla wrote:
> On Sun, 26 Jul 2020 at 19:33, Henry Lin wrote:
>
>   * Any class implementing the `__eq__` operator is no longer hashable
>
>
> You can use:
>
> def __hash__(self):
>     return id(self)
I thought that there was an assumption that if two objects are equal
(via __eq__) then their hashes (via __hash__) should be equal? Which
wouldn't hold for this definition, and thus dictionaries wouldn't behave
as expected.
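
A small sketch of the failure mode described above, using a made-up `Broken` class that combines value-based `__eq__` with an `id()`-based `__hash__`. Two instances compare equal, yet a dict or set treats them as different keys because their hashes differ.

```python
class Broken:
    """Hypothetical class: value equality with an identity hash."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Broken):
            return self.value == other.value
        return NotImplemented

    def __hash__(self):
        # Violates the rule that equal objects must have equal hashes.
        return id(self)


first, second = Broken(1), Broken(1)
table = {first: "stored"}

print(first == second)       # the objects compare equal...
print(second in table)       # ...but the lookup misses
print(len({first, second}))  # and a set keeps both
```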

-- 
Richard Damon
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/T5DLPPPMWLGYAYNZ7LXG2IZJJWD5RWF6/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Marco Sulla
On Sun, 26 Jul 2020 at 19:33, Henry Lin  wrote:

>
>- Any class implementing the `__eq__` operator is no longer hashable
>
>
You can use:

def __hash__(self):
    return id(self)
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AQESXZVDMQHY367WVYNH2N46R67CD5OF/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Henry Lin
Hi Steven,

You're right, declaring `__eq__` for the class we want to compare would
solve this issue. However, we have the tradeoff that

   - All classes need to implement the `__eq__` method to compare two
   instances;
   - Any class implementing the `__eq__` operator is no longer hashable
   - Developers might not want to leak the `__eq__` function to other
   developers; I wouldn't want to invade the implementation of my class just
   for testing.

In terms of the "popularity" of this potential feature, from what I
understand (and through my own development), there are testing libraries
built with this feature. For example, testfixtures.compare can compare
two objects recursively, and I am using it in my development for this
purpose.

On Sun, Jul 26, 2020 at 4:56 AM Steven D'Aprano  wrote:

> On Sat, Jul 25, 2020 at 10:15:16PM -0500, Henry Lin wrote:
> > Hey all,
> >
> > What are thoughts about implementing an object-compare function in the
> > unittest package? (Compare two objects recursively, attribute by
> > attribute.)
>
> Why not just ask the objects to compare themselves?
>
> assertEqual(actual, expected)
>
> will work if actual and expected define a sensible `__eq__` and are the
> same type. If they aren't the same type, why not?
>
> actual = MyObject(spam=1, eggs=2, cheese=3)
> expected = DifferentObject(spam=1, eggs=2, cheese=3)
>
>
> > This seems like a common use case in many testing scenarios,
>
> I've never come across it. Can you give an example where defining an
> `__eq__` method won't be the right solution?
>
>
>
>
> --
> Steven
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/LZYZIDMFBRIHHPWITSGZT6ITIA2P2ZUW/


[Python-ideas] Re: Thoughts about implementing object-compare in unittest package?

2020-07-26 Thread Steven D'Aprano
On Sat, Jul 25, 2020 at 10:15:16PM -0500, Henry Lin wrote:
> Hey all,
> 
> What are thoughts about implementing an object-compare function in the
> unittest package? (Compare two objects recursively, attribute by
> attribute.)

Why not just ask the objects to compare themselves?

assertEqual(actual, expected)

will work if actual and expected define a sensible `__eq__` and are the 
same type. If they aren't the same type, why not?

actual = MyObject(spam=1, eggs=2, cheese=3)
expected = DifferentObject(spam=1, eggs=2, cheese=3)
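
As a sketch of the `assertEqual` approach for two objects of the same type, assuming a hypothetical `Meal` class whose `__eq__` compares instance attributes (the class and test names are invented for this example):

```python
import unittest


class Meal:
    """Hypothetical class under test with a simple __eq__."""

    def __init__(self, spam, eggs, cheese):
        self.spam = spam
        self.eggs = eggs
        self.cheese = cheese

    def __eq__(self, other):
        if isinstance(other, Meal):
            return vars(self) == vars(other)
        return NotImplemented


class MealTest(unittest.TestCase):
    def test_equal_attributes(self):
        actual = Meal(spam=1, eggs=2, cheese=3)
        expected = Meal(spam=1, eggs=2, cheese=3)
        self.assertEqual(actual, expected)

    def test_different_attributes(self):
        self.assertNotEqual(Meal(1, 2, 3), Meal(1, 2, 4))


# Run the two tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MealTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a test like this fails, the message relies on the default repr of `Meal`, which does not say which attribute differed.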


> This seems like a common use case in many testing scenarios,

I've never come across it. Can you give an example where defining an 
`__eq__` method won't be the right solution?




-- 
Steven
___
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/TDLFBURVX4N4JJP4ELIRLKULR775VNOY/