Re: [Python-3000] A better way to initialize PyTypeObject

Brett Cannon Wed, 29 Nov 2006 11:36:23 -0800

On 11/28/06, Talin <[EMAIL PROTECTED]> wrote:

Guido van Rossum wrote:
> On 11/28/06, Talin <[EMAIL PROTECTED]> wrote:
>> Guido van Rossum wrote:
>> > Some comments:
>> >
>> > - Fredrik's solution makes one call per registered method. (I don't
>> > know if the patch he refers to follows that model.) That seems a fair
>> > amount of code for an average type -- I'm wondering if it's too early
>> > to worry about code bloat (I don't think the speed is going to
>> > matter).
>>
>> One other thought: The special constants could themselves be nothing
>> more than the offset into the PyTypeObject struct, i.e.:
>>
>>     #define SPECMETHOD_NEW ((const char*)offsetof(PyTypeObject,tp_new))
>
> I think this would cause too many issues with backwards compatibility.
>
> I like the idea much better to use special names (e.g. starting with a
> ".").
>
>> In the PyType_Ready code, you would see if the method name had a value
>> of less than sizeof(PyTypeObject); if so, then it's a special method
>> name, and you fill in the struct at the specified offset.
>>
>> So the interpretation of the table could be very simple and fast. It has
>> a slight disadvantage from the approach of using actual string names for
>> special methods, in that it doesn't allow the VM to silently
>> promote/demote methods to 'special' status.
>
> I think the interpretation will be fast enough (or else what you said
> about premature optimization earlier wouldn't be correct. :-)
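
For reference, the offset-as-name idea quoted above can be sketched roughly
as follows. This is only an illustration under stated assumptions: MockType
stands in for PyTypeObject, and is_slot_offset and fill_slot are hypothetical
helpers, not CPython API. It also relies on integer-to-pointer casts, which
are implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*slot_fn)(void);

/* Hypothetical stand-in for PyTypeObject; only the fields needed here. */
typedef struct {
    const char *tp_name;
    slot_fn tp_new;
    slot_fn tp_alloc;
} MockType;

/* A "name" that is really a slot offset disguised as a pointer. */
#define SPECMETHOD_NEW   ((const char *)offsetof(MockType, tp_new))
#define SPECMETHOD_ALLOC ((const char *)offsetof(MockType, tp_alloc))

/* A name is a disguised offset if its pointer value is below the struct size. */
int is_slot_offset(const char *name)
{
    return (uintptr_t)name < sizeof(MockType);
}

/* Store fn at the encoded offset; returns 0 on success, -1 for a real string. */
int fill_slot(MockType *type, const char *name, slot_fn fn)
{
    assert(type != NULL);
    if (!is_slot_offset(name))
        return -1;
    memcpy((char *)type + (uintptr_t)name, &fn, sizeof fn);
    return 0;
}

/* Placeholder method used only to have a function pointer to store. */
void dummy_new(void) {}
```

A real string such as "__len__" has an address far above sizeof(MockType),
so fill_slot rejects it and the caller would fall back to name-based handling.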

OK, based on these comments and the other feedback from this thread,
here's a more concrete proposal:

== Method Table ==

Method definitions are stored in a static table, identical in format to
the existing PyMethodDef table.

For non-method initializers, the most commonly-used ones will be passed
in as parameters to the type creation function. Those that are less
commonly used can be written in as a secondary step after the type has
been created, or in some cases represented in the tp_members table.

== Method Names ==

As suggested by Guido, we use a naming convention to determine how a
method in the method table is handled. I propose that methods be divided
into three categories, which are "Normal", "Special", and "Internal"
methods, and which are interpreted slightly differently at type
initialization time.

* Internal methods are those that have no equivalent Python name, such
as tp_free/tp_alloc. Internal method names start with a dot ("."), so
tp_alloc would be represented by the string ".tp_alloc".



Haven't we had various arguments about how it's bad to give a leading dot a
special meaning?  I understand why we need some way to flag internal
methods on a type, and I support an explicit way of specifying them,
but is a dot really the best solution?  Something like INTERNAL_METH
"tp_alloc" would even work, using C's automatic string concatenation and
doing::

 #define INTERNAL_METH "."

or whatever string we wanted that was not valid in a method name.  I don't
think this would lead us down the road of tons of macros and it makes things
very visible.
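
A minimal sketch of how that macro would read in practice. INTERNAL_METH is
the name suggested above; alloc_name and is_internal_name are hypothetical
and exist only to show the concatenation and a matching prefix check.

```c
#include <assert.h>
#include <string.h>

/* Prefix marking an internal method; any string that is not valid in a
   method name would do. */
#define INTERNAL_METH "."

/* Adjacent string literals are concatenated by the compiler, so this
   initializer is the single literal ".tp_alloc". */
const char *const alloc_name = INTERNAL_METH "tp_alloc";

/* Hypothetical helper: does a table entry name carry the internal prefix? */
int is_internal_name(const char *name)
{
    assert(name != NULL);
    return strncmp(name, INTERNAL_METH, strlen(INTERNAL_METH)) == 0;
}
```

Changing the prefix later would then mean editing one #define rather than
every method table.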

Internal methods are always stored into a slot in the PyTypeObject. If
there is no corresponding slot for a given name, that is a runtime error.

* Special methods have the double-underscore (__special__) naming
convention. A special method may or may not have a slot definition in
PyTypeObject. If there is such a slot, the method pointer will be stored
into it; if there is no such slot, then the method pointer is stored
into the class dict just like a normal method.

Because the decision whether to put the method into a slot is made by
the VM, the set of available slots can be modified in future Python
releases without breaking existing code.

* Normal methods are any methods that are neither special nor internal.
They are not placed in a slot, but are simply stored in the class dict.
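
The three categories can be told apart from the name alone. A rough sketch,
assuming the leading-dot and double-underscore conventions described above;
MethKind and classify_method_name are hypothetical names, not part of any
existing API.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    METH_KIND_NORMAL,    /* stored in the class dict */
    METH_KIND_SPECIAL,   /* __name__: slot if one exists, else class dict */
    METH_KIND_INTERNAL   /* .name: must map to a slot */
} MethKind;

/* Classify a method-table entry purely by its naming convention. */
MethKind classify_method_name(const char *name)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);
    if (name[0] == '.')
        return METH_KIND_INTERNAL;
    /* Require at least one character between the underscore pairs. */
    if (len > 4 && strncmp(name, "__", 2) == 0
            && strcmp(name + len - 2, "__") == 0)
        return METH_KIND_SPECIAL;
    return METH_KIND_NORMAL;
}
```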

Brett Cannon brought up the point about __getitem__ being ambiguous,
since there are two slots, one for lists and one for mappings. This is
handled as follows:

The "mapping" version of __getitem__ is a special method, named
"__getitem__". The "list" version, however, is considered an internal
method (since it's more specialized), and has the name ".tp_getitem".



Or the other option is that in the future we just don't have the distinction
and make sure that the __getitem__ methods do the requisite type checks.
The type check is done at some point in the C code anyway so it isn't like
there is a performance reason for the different slots.  And as for providing
a C-level function that provides a __getitem__ that takes Py_ssize_t, that
can still be provided, it just isn't what is put into the struct.

The one problem this does cause is testing for the interface support at the
C level.  But that could be a C function that looks for specific defined
functions.  Plus this would help make the C code less distinct from the way
things expose themselves at the Python level (which I personally think is a
good thing).

Greg Ewing's point about "next" is handled as follows: A function named
"next" will never be treated as a special method name, since it does not
follow the naming convention of either internal or special names.
However, if you want to fill in the "tp_next" slot of the PyTypeObject,
you can use the string ".tp_next" rather than "next".
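
To make the "next" and "__getitem__" cases concrete, here is a small sketch
of name-based placement. Everything in it is an assumption for illustration:
MockType stands in for PyTypeObject using the slot names from this proposal,
slot_table and place_method are hypothetical, and the class dict is not
modelled (the function only reports that the method would go there).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*meth_fn)(void);

/* Hypothetical stand-in for PyTypeObject. */
typedef struct {
    meth_fn tp_alloc;
    meth_fn tp_next;
    meth_fn tp_getitem;   /* "list" version, reached as ".tp_getitem" */
    meth_fn mp_getitem;   /* "mapping" version, reached as "__getitem__" */
} MockType;

typedef struct { const char *name; size_t offset; } SlotEntry;

/* Names the mock VM currently maps to slots; this set could change
   between releases without touching method tables. */
const SlotEntry slot_table[] = {
    {".tp_alloc",   offsetof(MockType, tp_alloc)},
    {".tp_next",    offsetof(MockType, tp_next)},
    {".tp_getitem", offsetof(MockType, tp_getitem)},
    {"__getitem__", offsetof(MockType, mp_getitem)},
};

typedef enum { PLACED_IN_SLOT, PLACED_IN_DICT, PLACEMENT_ERROR } Placement;

/* Fill the matching slot if there is one; otherwise report dict or error. */
Placement place_method(MockType *type, const char *name, meth_fn fn)
{
    size_t i;

    assert(type != NULL && name != NULL);
    for (i = 0; i < sizeof slot_table / sizeof slot_table[0]; i++) {
        if (strcmp(name, slot_table[i].name) == 0) {
            memcpy((char *)type + slot_table[i].offset, &fn, sizeof fn);
            return PLACED_IN_SLOT;
        }
    }
    /* An internal name with no slot is an error; any other name would be
       stored in the class dict. */
    return name[0] == '.' ? PLACEMENT_ERROR : PLACED_IN_DICT;
}

/* Placeholder method used only to have a function pointer to store. */
void demo_fn(void) {}
```

With this table, "next" lands in the dict, ".tp_next" fills the slot, and a
special name the VM does not know (say "__frobnicate__") silently falls back
to the dict.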

== Type Creation ==

For backwards compatibility, the existing PyType_Ready function will
continue to work on statically-declared PyTypeObject structures. A new
function, 'PyType_Create', will be added that creates a new type from the
input parameters and the method initialization tables as described
previously. The actual type record may be allocated dynamically, as
suggested by Greg Ewing.

Structures such as tp_as_sequence which extend the PyTypeObject will be
created as needed, if there are any methods that require those extension
structures.
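
A rough sketch of what such a creation function might look like, with the
extension structure allocated only when a method needs it. The signature is
an assumption, since the proposal does not fix one; MockType,
MockSequenceMethods, MockMethodDef, mock_type_create and mock_type_free are
all hypothetical stand-ins, and only two slot names are recognised.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*meth_fn)(void);

/* Hypothetical stand-ins for PySequenceMethods, PyMethodDef, PyTypeObject. */
typedef struct { meth_fn sq_length; } MockSequenceMethods;
typedef struct { const char *ml_name; meth_fn ml_meth; } MockMethodDef;
typedef struct {
    const char *tp_name;
    meth_fn tp_alloc;
    MockSequenceMethods *tp_as_sequence;  /* NULL until a method needs it */
    size_t dict_count;                    /* stands in for the class dict */
} MockType;

/* Build a dynamically allocated type from a NULL-terminated method table.
   Returns NULL on allocation failure or on an unknown internal name. */
MockType *mock_type_create(const char *name, const MockMethodDef *methods)
{
    MockType *type;

    assert(name != NULL && methods != NULL);
    type = calloc(1, sizeof *type);
    if (type == NULL)
        return NULL;
    type->tp_name = name;
    for (; methods->ml_name != NULL; methods++) {
        if (strcmp(methods->ml_name, ".tp_alloc") == 0) {
            type->tp_alloc = methods->ml_meth;
        } else if (strcmp(methods->ml_name, "__len__") == 0) {
            /* Create the extension structure only when first needed. */
            if (type->tp_as_sequence == NULL) {
                type->tp_as_sequence = calloc(1, sizeof *type->tp_as_sequence);
                if (type->tp_as_sequence == NULL) {
                    free(type);
                    return NULL;
                }
            }
            type->tp_as_sequence->sq_length = methods->ml_meth;
        } else if (methods->ml_name[0] == '.') {
            /* Internal name with no matching slot: a runtime error. */
            free(type->tp_as_sequence);
            free(type);
            return NULL;
        } else {
            type->dict_count++;  /* would be stored in the class dict */
        }
    }
    return type;
}

/* Release a type created by mock_type_create. */
void mock_type_free(MockType *type)
{
    if (type != NULL) {
        free(type->tp_as_sequence);
        free(type);
    }
}

/* Placeholder method used only to have a function pointer to store. */
void demo_fn(void) {}
```

A type whose table contains no sequence methods never pays for the extra
structure, which is the point of creating it on demand.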

== Backwards compatibility ==

The existing PyType_Ready and C-style static initialization mechanism
will continue to work - the new method for type creation will coexist
alongside the old.

It is an open question as to whether PyType_Ready should attempt to
interpret the special method names and fill in the PyTypeObject slots.
If it does, then PyType_Create can call PyType_Ready as a subroutine
during the type creation process.

Otherwise, the only modifications to the interpreter will be the
creation of the new PyType_Create function and any required subroutines.
Existing code should be unaffected.



Overall sounds good to me!

-Brett

