On Fri, 31 Jan 2014 14:52:15 -0500, Ned Batchelder wrote:

> Why can't we call __init__ the constructor and __new__ the allocator?

__new__ constructs the object, and __init__ initialises it. What's wrong
with calling them the constructor and initialiser? Is this such a
difficult concept that the average programmer can't learn it?
I've met people who have difficulty with OOP principles, at least at
first. But once you understand the idea of objects, it isn't that hard to
understand the idea that:
- first, the object has to be created, or constructed, or allocated
  if you will;
- only then can it be initialised.
Thus, two methods. __new__ constructs (creates, allocates) a new object;
__init__ initialises it after the event.
(In hindsight, it was probably a mistake for Python to define two create-
an-object methods, although I expect it was deemed necessary for
historical reasons. Most other languages make do with a single method,
Objective-C being an exception with "alloc" and "init" methods.)
Earlier in this post, you wrote:

> But that distinction [between __new__ and __init__] isn't useful in
> most programs.

Well, I don't know about that. I guess it depends on what sort of objects
you're creating. If you're creating immutable objects, then the
distinction is vital. If you're subclassing from immutable built-ins, of
which there are a few, the distinction may be important. If you're using
the object-pool design pattern, the distinction is also vital. It's not
*rare* to care about these things.
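
A minimal sketch of two of those cases. The class names Celsius and Colour are made up for illustration. A float subclass has to set its value in __new__, because the value is already fixed by the time __init__ would run. A pool has to live in __new__ too, because only __new__ gets to choose which object the call returns.

```python
class Celsius(float):
    """Hypothetical immutable value type built on float."""

    def __new__(cls, fahrenheit):
        # The float's value is fixed when the object is created, so the
        # conversion has to happen here; __init__ would be too late.
        return super().__new__(cls, (fahrenheit - 32) * 5 / 9)


class Colour:
    """Hypothetical pooled class: one shared instance per name."""

    _pool = {}

    def __new__(cls, name):
        try:
            # Hand back the existing instance if we already made one.
            return cls._pool[name]
        except KeyError:
            obj = super().__new__(cls)
            obj.name = name
            cls._pool[name] = obj
            return obj


print(Celsius(212))                    # 100.0
print(Colour("red") is Colour("red"))  # True
```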

> The thing most people mean by "constructor" is "the method that gets
> invoked right at the beginning of the object's lifetime, where you can
> add code to initialize it properly." That describes __init__.

"Most people". I presume you've done a statistically valid survey then
*wink*
It *better* describes __new__, because it is *not true* that __init__
gets invoked "right at the beginning of the object's lifetime". Before
__init__ is invoked, the object's lifetime has already begun, inside the
call to __new__. Excluding metaclass shenanigans, the object lifetime
goes:
Prior to the object existing:

- static method __new__ called on the class[1]
- __new__ creates the object[2]   <=== start of object lifetime

Within the object's lifetime:

- the rest of the __new__ method runs, which may perform arbitrarily
  complex manipulations of the object;
- __new__ exits, returning the object;
- __init__ runs.

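
The sequence above can be traced with a small sketch. The class Traced and the events list are hypothetical names used only for this illustration. The sketch records each step, and shows that the instance already exists and already carries state when __init__ starts.

```python
events = []


class Traced:
    """Hypothetical class that records when each step happens."""

    def __new__(cls, value):
        events.append("__new__ called; no instance yet")
        obj = super().__new__(cls)   # the object's lifetime starts here
        obj.value = value            # __new__ can already work on it
        events.append("__new__ returning the object")
        return obj

    def __init__(self, value):
        # The instance exists and has an attribute before we get here.
        events.append(f"__init__ sees value={self.value!r}")


t = Traced(42)
for line in events:
    print(line)
```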
So __init__ does not occur *right at the beginning*, and it is completely
legitimate to write your classes using only __new__. You must use __new__
for immutable objects, and you may use __new__ for mutable ones. __init__
may be used by convention, but it is entirely redundant.
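
For example, here is a sketch of an ordinary mutable class written with __new__ alone. Account is a made-up name, and nothing here is a recommendation over the usual __init__ convention; it only shows that the class works without defining __init__.

```python
class Account:
    """Hypothetical mutable class with no __init__ of its own."""

    def __new__(cls, owner, balance=0):
        self = super().__new__(cls)
        # All the set-up that would normally sit in __init__.
        self.owner = owner
        self.balance = balance
        return self

    def deposit(self, amount):
        self.balance += amount


acct = Account("alice", 10)
acct.deposit(5)
print(acct.balance)  # 15
```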
I do not buy the argument made by some people that Python ought to follow
whatever (possibly inaccurate or misleading) terminology other languages
use. Java and Ruby have the exact same argument passing convention as
Python, but one calls it "call by value" and the other "call by
reference", and neither usage matches the meaning of "call by
value/reference" in Pascal, C, Visual Basic, or other languages. So which
terminology should Python use? Both C++ and Haskell have "functors", but
they are completely different things. What Python calls a class method,
Java calls a static method. We could go on for days, just listing
differences in terminology.
In Python circles, using "constructor" for __new__ and "initialiser" for
__init__ is well established. In the context of Python, the terms make good
sense: __new__ creates ("constructs") the object, and __init__
_init_ialises it. Missing the opportunity to link the method name
__init__ to *initialise* would be a mistake.
We can decry the fact that computer science has not standardised on a
sensible set of names for concepts, but on the other hand since the
semantics of languages differ slightly, it would be more confusing to try
to force all languages to use the same words for slightly different
concepts.
The reality is, if you're coming to Python from another language, you're
going to have to learn a whole lot of new stuff anyway, so having to
learn a few language-specific terms is just a small incremental cost. And
if you have no idea about other languages, then it is no harder to learn
that __new__ / __init__ are the constructor/initialiser than it would be
to learn that they are the allocator/constructor or preformulator/
postformulator.
I care about using the terminology that causes the least cognitive
dissonance for users trying to understand Python, not about whether they
have to learn new terminology. In the context of Python's object model,
"constructor" and "initialiser" best describe what __new__ and __init__
do.