[issue46757] dataclasses should define an empty __post_init__

2022-02-22 Thread Neil Girdhar


Neil Girdhar  added the comment:

@eric

Good thinking.  Would it make sense to add to the documentation as well that 
the __post_init__ methods aren't collected, and you should call super's 
__post_init__ if there is one using something like

if hasattr(super(), "__post_init__"):
super().__post_init__()

Noting this will make it easier to point to the docs if someone wonders when 
they see code like this.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-22 Thread Eric V. Smith


Eric V. Smith  added the comment:


New changeset 288af845a32fd2a92e3b49738faf8f2de6a7bf7c by Eric V. Smith in 
branch 'main':
bpo-46757: Add a test to verify dataclass's __post_init__ isn't being 
automatically added. (GH-31523)
https://github.com/python/cpython/commit/288af845a32fd2a92e3b49738faf8f2de6a7bf7c


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-22 Thread Eric V. Smith


Eric V. Smith  added the comment:

I'm adding a test that mimic's Raymond's example of the proposed addition being 
a breaking change. This way, if we ever decide to actually add this feature, 
we'll break this test. If we do decide to continue and make the change anyway, 
at least we'll do so with the knowledge that it's a breaking change.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-22 Thread Eric V. Smith


Change by Eric V. Smith :


--
pull_requests: +29650
pull_request: https://github.com/python/cpython/pull/31523

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-22 Thread Eric V. Smith


Eric V. Smith  added the comment:

I'm going to close this issue. As Raymond says, it's a breaking change, and the 
workaround is easy enough.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-21 Thread Neil Girdhar


Neil Girdhar  added the comment:

@Raymond yeah I've been thinking about this some more, and there's no way to 
have a "top level" method with the dataclass decorator.

I think I will make a case to have a class version of dataclasses that works 
with inheritance.  Class versions of dataclasses are used in some places like 
here: https://github.com/google/flax/blob/main/flax/struct.py#L184
They just call dataclass on the class in __init_subclass__.

If we had something like this in the standard library, then you could put your 
empty __post_init__ in that class.  You could also make __init__ call super so 
that mixins would be initialized (right now the collider pattern you showed 
also breaks if B is not a dataclass, and has a non-trivial __init__).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-21 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Note that adding an empty __post_init__ method would be a breaking change.  The 
following prints out 'B' then 'C'.  But if class A adds an empty __post_init__, 
then 'B' never gets printed.  The arrangement relies on class A being a 
passthrough to class B.

from dataclasses import dataclass

@dataclass
class A:
pass

@dataclass
class B:
def __post_init__(self):
print('B')

@dataclass
class C(A, B):
def __post_init__(self):
super().__post_init__()
print('C')

c = C()

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-20 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

-1

* I concur with Eric that this is mostly not needed.  Probably 99.95% of 
dataclass use case don't need this.  When it is needed, it is trivial to 
implement and probably should be explicit rather that implicit.

* Vedran is also correct in noting that it would sometimes mask errors.

* There is a runtime cost for all dataclasses that do not use __post_init__.  
Calling an empty method takes time.  We shouldn't make all users pay for a 
feature that almost no one needs.

* With respect to use of super(), it might help a little bit, but only because 
__post_init__() takes no arguments.  For the other methods in cooperative 
multiple inheritance, a user would need to either incorporate a straight 
forward hasattr() check or add a terminal root class as described in 
super-considered-super.  I see no reason to make __post_init__ a special case 
that would differ from other methods (for example, object() doesn't provide an 
empty __call__ or __getitem__).

* Adding a default __post_init__ will create a new problem.  Currently, it is 
possible to easily introspect and determine whether a __post_init__ has been 
provided, but if there is an empty default, it becomes a much more tricky test. 
 We had this problem with functools.total_ordering.  When that tool was first 
created, we could easily use hasattr() to test for a user defined rich 
comparison method.  But after default methods were added to object(), the test 
because much more delicate:  ``getattr(cls, op, None) is not getattr(object, 
op, None)``.  Likewise the Hashable ABC cannot just use hasattr() because 
__hash__() is always present and has to be set to None to turn it off.  A 
default __post_init__ is worse than both cases.  You can't test for it, so you 
just have to call it, not knowing in advance whether it would do anything.

* Anyone supporting multiple versions of Python will still need to write the 
hasattr() check or provide a terminal/root class.  They won't be able to rely 
on the default being present.

I recommend leaving dataclasses as is.

--
nosy: +rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-20 Thread Eric V. Smith


Change by Eric V. Smith :


--
assignee:  -> eric.smith

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-20 Thread Neil Girdhar


Neil Girdhar  added the comment:

> How would an arbitrary derived class know how to call this? It can't. There 
> has to be knowledge of the base class's requirements already. Surely knowing 
> "__post_init__ must be called with some_arg" isn't too different from "I know 
> __post_init__ doesn't exist".

This is exactly the same problem you have with all other "augmenting methods" 
that have arbitrary parameters (e.g., __init__).  When calling super on a 
non-final class you could simply forward keyword arguments.


@dataclass
class X:
def __post_init__(self, **kwargs):
super().__post_init__(**kwargs)
...

@dataclass
class Y(X):
def __post_init__(self, **kwargs):
super().__post_init__(**kwargs)
...

> I'm still unconvinced, but I'll hold off on making a decision to see if 
> there's more support. Maybe taking it to python-ideas would be worthwhile.

Sounds good, done:  https://groups.google.com/g/python-ideas/c/-gctNaSqgew

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-19 Thread Eric V. Smith


Eric V. Smith  added the comment:

The fact that it's never been needed in the years that dataclasses and attrs 
have existed tell me it's kind of a niche requirement.

This does not seem like the ugliest code I've ever seen:

if hasattr(super(), "__post_init__"):
super().__post_init__()

or the probably more efficient, but more lines of code:

try:
post_init = super().__post_init__
except AttributeError:
pass
else:
post_init()

As always with calling super functions, the whole hierarchy needs to cooperate. 
Say you had a dataclass:

@dataclass
class Base:
def __post_init__(self, some_arg):
pass

How would an arbitrary derived class know how to call this? It can't. There has 
to be knowledge of the base class's requirements already. Surely knowing 
"__post_init__ must be called with some_arg" isn't too different from "I know 
__post_init__ doesn't exist". I don't think adding ways to make the "always 
call super" pattern easier is a good idea.

I'm still unconvinced, but I'll hold off on making a decision to see if there's 
more support. Maybe taking it to python-ideas would be worthwhile.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-19 Thread Neil Girdhar


Neil Girdhar  added the comment:

> I'm not crazy about adding a method to every dataclass just for the 0.1% of 
> the times it's needed.

I'm not sure I totally understand the cost here.  You can have a single shared 
global function that you set on each dataclass.  So the only cost would be an 
entry in each dataclass type's dict.  Even if a user creates a thousand 
dataclasses, that should only be tens of killobytes in pointers.

> I think using hasattr or catching the exception is a better way to go.

If you choose this, then I think this should be documented under the 
__post_init__ saying that any time you define __post_init__, you should either 
be a final class, or else call super.  If you call super, you musteither use 
hasattr(super().__post_init__) or catch the exception.

I have to admit, I find this quite ugly from a user perspective.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-19 Thread Eric V. Smith


Eric V. Smith  added the comment:

I'm not crazy about adding a method to every dataclass just for the 0.1% of the 
times it's needed.

I think using hasattr or catching the exception is a better way to go.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-19 Thread Neil Girdhar

Neil Girdhar  added the comment:

On Sat, Feb 19, 2022 at 2:51 AM Vedran Čačić  wrote:

>
> Vedran Čačić  added the comment:
>
> That "except AttributeError" approach is a powerful bug magnet, since it
> can very easily mask real attribute errors stemming from misspelled
> attribute names in the __post_init__ call itself. What you should _really_
> do is
>
> def __post_init__(self):
> with contextlib.suppress(AttributeError):
> post_init = super().__post_init__
> post_init()
>
> But of course, nobody will ever write that.
>
> Great point!

Raymond in his "super considered super" video (
> https://youtu.be/xKgELVmrqfs?t=2068) says the right thing to do is to
> make your own root which knows exactly what classes it manages, and drops
> the supercalls from them (after possibly verifying that all kwargs have
> actually been used and so on).
>
> But in case of dataclasses, usually any class can serve as such a root,
> and the main reason people use dataclasses is to avoid boilerplate code. So
> I think it would be a nice compromise.
>

I'm not sure I understand what you're saying here.  Anyone can inherit from
a class C with the special root and some other class D, and then your
introduced root won't catch D's super calls.

>
> --
> nosy: +veky
>
> ___
> Python tracker 
> 
> ___
>

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-18 Thread Vedran Čačić

Vedran Čačić  added the comment:

That "except AttributeError" approach is a powerful bug magnet, since it can 
very easily mask real attribute errors stemming from misspelled attribute names 
in the __post_init__ call itself. What you should _really_ do is

def __post_init__(self):
with contextlib.suppress(AttributeError):
post_init = super().__post_init__
post_init()

But of course, nobody will ever write that.

Raymond in his "super considered super" video 
(https://youtu.be/xKgELVmrqfs?t=2068) says the right thing to do is to make 
your own root which knows exactly what classes it manages, and drops the 
supercalls from them (after possibly verifying that all kwargs have actually 
been used and so on).

But in case of dataclasses, usually any class can serve as such a root, and the 
main reason people use dataclasses is to avoid boilerplate code. So I think it 
would be a nice compromise.

--
nosy: +veky

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-18 Thread Nikita Sobolev


Change by Nikita Sobolev :


--
type:  -> behavior
versions: +Python 3.11

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-18 Thread Nikita Sobolev


Change by Nikita Sobolev :


--
keywords: +patch
pull_requests: +29565
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/31430

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-18 Thread Nikita Sobolev


Nikita Sobolev  added the comment:

I like this idea.

`attrs` right now behave the same way (no default `__attrs_post_init__`:

```
>>> import attrs
>>> @attrs.define
... class Some:
...   x: int
... 
>>> @attrs.define
... class Other(Some):
...def __attrs_post_init__(self):
...   super().__attrs_post_init__()
...   self.x += 1
... 
>>> Other(1)
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 3, in __init__
  File "", line 4, in __attrs_post_init__
AttributeError: 'super' object has no attribute '__attrs_post_init__'
```

I am attaching a simple patch.

Alternative solution is to not merge this patch and recommend this instead:

```
def __post_init__(self):
  try:
super().__post_init__()
  except AttributeError:
pass
```

--
nosy: +sobolevn

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-15 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +eric.smith

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46757] dataclasses should define an empty __post_init__

2022-02-15 Thread Neil Girdhar


New submission from Neil Girdhar :

When defining a dataclass, it's possible to define a post-init (__post_init__) 
method to, for example, verify contracts.

Sometimes, when you inherit from another dataclass, that dataclass has its own 
post-init method.  If you want that method to also do its checks, you need to 
explicitly call it with super.  However, if that method doesn't exist calling 
it with super will crash.

Since you don't know whether your superclasses implement post-init or not, 
you're forced to check if the superclass has one or not, and call it if it does.

Essentially, post-init implements an "augmenting pattern" like __init__, 
___enter__, __exit__, __array_finalize__, etc.  All such methods define an 
empty method at the top level so that child classes can safely call super.

Please consider adding such an empty method to dataclasses so that children who 
implement __post_init__ can safely call super.

--
components: Library (Lib)
messages: 413283
nosy: NeilGirdhar
priority: normal
severity: normal
status: open
title: dataclasses should define an empty __post_init__

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com