On 02/09/2011 08:40 PM, Carl Banks wrote:
I explained why in my last post; there's a bunch of reasons.
Generally you can't assume someone's going to go through the type
structure to find the object's dict, nor can you expect inherited
methods to always use the derived class's type structure (some methods
might use their own type's tp_dictoffset or tp_weakreflist, which
would be wrong if called from a superclass that changes those
values).

Who do you mean by someone? The code is generated by a program. No human is required to touch it. If it needs to be updated, the program is simply run again with the updated specification file. Thus I can make those assumptions because I have total control over the code. The only thing I don't have control over is the Python code that imports the extension, but in Python, the user doesn't get to choose how they access the weaklist and instance dictionary.


Even if you are careful to avoid such usage, the Python
interpreter can't be sure.  So it has to check for layout conflicts,
and these checks would become very complex if it allowed dict and
weakreflist to appear in different locations in the layout (it'd have
to check a lot more).

What is so complex about this? It already uses "obj_instance + obj_instance->ob_type->tp_weaklistoffset". That's all the checking it needs. It only becomes a problem when trying to derive from two or more classes that already have these defined. In such a case the Python interpreter can't deduce what the values of tp_weaklistoffset and tp_dictoffset in the derived type should be, but it doesn't have to because my program tells it what they need to be.
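To make the lookup concrete, here is a minimal sketch of the "object address plus the type's offset" idea in plain C. It is a simplified model, not CPython's actual code: toy_type, toy_object, toy_weaklist_ptr and the two instance structs are hypothetical names invented for illustration, and weaklistoffset only plays the role of tp_weaklistoffset.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for PyTypeObject and PyObject (assumed, not CPython's definitions). */
typedef struct toy_type {
    ptrdiff_t weaklistoffset;  /* plays the role of tp_weaklistoffset; 0 means "no weaklist" */
} toy_type;

typedef struct toy_object {
    const toy_type *ob_type;
} toy_object;

/* Find the weak reference list head using only the object's own type. */
static void **toy_weaklist_ptr(toy_object *obj)
{
    ptrdiff_t offset = obj->ob_type->weaklistoffset;
    if (offset == 0)
        return NULL;
    return (void **)((char *)obj + offset);
}

/* Two instance layouts that keep the list head at different offsets. */
typedef struct base_instance {
    toy_object head;
    void *weaklist;
    int payload;
} base_instance;

typedef struct derived_instance {
    toy_object head;
    int payload[4];
    void *weaklist;
} derived_instance;

/* Each type records where its own instances keep the list head. */
static const toy_type base_type = { offsetof(base_instance, weaklist) };
static const toy_type derived_type = { offsetof(derived_instance, weaklist) };
```

In this model the lookup lands on the right field for both layouts, as long as every access goes through the instance's own type rather than a hard-coded offset.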


I would say you do.  Python's type system specifies that a derived
type's layout is a superset of its base types' layout.  You seem to
have found a way to derive a type without a common layout, perhaps by
exploiting a bug, and you claim to be able to keep data access
straight.  But Python types are not intended to work that way, and you
are asking for trouble if you try to do it.

I'm not really circumventing this system (except for the varying location of the dictionaries. See the explanation below for that). Python allows variable-sized objects. Tuples and strings are variable sized. This allows them to store the data directly in the object instead of having a pointer to another location in memory. And the objects I generate are basically this:

struct MyObject {
    PyObject_HEAD
    storage_mode mode;
    char opaque_data[x];  /* x bytes of private storage */
};

I use the real type instead of a char array when possible because it will have the proper alignment, but I still treat it like a private hunk of memory that only my generated code will touch. What I store in opaque_data is up to me. I can store a copy of the wrapped type, or I can store a pointer to it. "mode" specifies what is in opaque_data. A derived type would look like this:

struct MyDerivedObject {
    PyObject_HEAD
    storage_mode mode;
    char opaque_data[y];  /* y bytes of private storage */
};

Where y >= x. It's still the same layout. All that's left is some way for the original object to know what C++ type is stored in opaque_data. I could have used another variable like 'mode', but since there is a one-to-one correspondence between PyObject->ob_type and the type that is being wrapped, I can determine the type from ob_type instead.
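As a rough sketch of how "mode" can select what opaque_data holds, consider the following plain C. Everything here is assumed for illustration: wrapped_point stands in for a wrapped C++ class, the two enum values and point_get are hypothetical names, and a union is used in place of the char array to get the proper alignment. The actual generated code differs.

```c
#include <assert.h>
#include <stddef.h>

typedef enum storage_mode {
    STORED_INLINE,   /* opaque_data holds a copy of the wrapped value */
    STORED_POINTER   /* opaque_data holds a pointer to the wrapped value */
} storage_mode;

/* Hypothetical wrapped type standing in for a C++ class. */
typedef struct wrapped_point {
    double x, y;
} wrapped_point;

typedef struct point_object {
    /* PyObject_HEAD would come first in a real extension type */
    storage_mode mode;
    union {
        wrapped_point value;  /* used when mode == STORED_INLINE */
        wrapped_point *ptr;   /* used when mode == STORED_POINTER */
    } opaque_data;
} point_object;

/* Return the wrapped value regardless of how it is stored. */
static wrapped_point *point_get(point_object *self)
{
    switch (self->mode) {
    case STORED_INLINE:
        return &self->opaque_data.value;
    case STORED_POINTER:
        return self->opaque_data.ptr;
    }
    assert(!"unknown storage mode");
    return NULL;
}
```

Only the accessor knows what is inside opaque_data; code outside it sees the same header and the same "mode" field in every case.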

There is no bug being exploited. The actual implementation is a little different from this, but the principle is the same. I said before that the layout varies, but that is only true of the contents of opaque_data, which are neither Python's nor the user's concern.


I guess there's also no point in arguing that tp_dictoffset and
tp_weakreflist need to have the same value for base and derived types,
since you're rejecting the premise that layouts need to be
compatible.  Therefore, I'll only point out that the layout checking
code is based on this premise, so that's why you're running afoul of
it.

That's not what the Python documentation says. Under http://docs.python.org/c-api/typeobj.html#tp_weaklistoffset it says "This field is inherited by subtypes, but see the rules listed below. A subtype may override this offset; this means that the subtype uses a different weak reference list head than the base type. Since the list head is always found via tp_weaklistoffset, this should not be a problem." And under http://docs.python.org/c-api/typeobj.html#tp_dictoffset it says "This field is inherited by subtypes, but see the rules listed below. A subtype may override this offset; this means that the subtype instances store the dictionary at a difference offset than the base type. Since the dictionary is always found via tp_dictoffset, this should not be a problem."


You claimed in another post you weren't trying to mimic the C++ type
hierarchy in Python, but this line suggests you are.

When did I make that claim? Perhaps you misunderstood me. I said "I kind-of already did. The issue only comes up when multiply-inheriting from types that have a different combination of the weaklist and instance dictionaries. I don't have to support this particular feature."

I was saying I kind-of already did mimic the C++ hierarchy. And when I said "this particular feature", I was talking about the thing I described in the immediately preceding sentence, not the C++ type hierarchy.
