Stefan Behnel wrote:
> Greg Ewing, 29.11.2009 01:27:
>> Robert Bradshaw wrote:
>>
>>> I'm not sure. It depends on if it's just the idea of a "system default
>>> encoding" that's deprecated, or if the slot containing a encoded
>>> reference is going away.
>>
>> My reading of it is that the slot is still there and still
>> supported, but that it always contains utf8, and the right
>> way to access it is via PyString_AsUTF8String().
>
> Ok, there is no PyString_AsUTF8String, so I assume you meant
> PyUnicode_AsUTF8String, but that doesn't use defenc at all. There is an
> internal function _PyUnicode_AsDefaultEncodedString(), about which the
> header file clearly says "Exported for internal use by the interpreter only".
>
> Reading the PyUnicodeObject header:
>
> ----------
>     PyObject *defenc;   /* (Default) Encoded version as Python
>                            string, or NULL; this is used for
>                            implementing the buffer protocol */
> ----------
>
> So my reading of the docs is that defenc is purely an implementation
> detail. Assuming that we can happily depend on its value to keep the char*
> around, or that it will do no harm to just keep a dangling reference during
> the lifetime of a function (or module) sounds fragile to me. Think of this
> code:
>
>     cdef unicode u = "abcdefg"
>     cdef char* s = u
>     u = None
>     print s
>
> Now, where's the reference to the string buffer now? Would this work
> because Cython keeps the reference alive for all eternity, or would this
> fail because Cython deletes the reference immediately after taking the
> char* from it? Or would Cython see the "u = None" and discard the encoded
> string at the same time? Or would we use the buffer interface on the
> unicode string internally, although the syntax doesn't resemble the buffer
> syntax at all? And why should this only work in Py3, what would you do in Py2?

Not that I'm in favour of the proposal, but it would be possible to handle this in restricted situations, such as converting arguments at a function call or converting the arguments a function receives (which was one of Robert's worries). This is what is allowed (or was planned?) for e.g. passing Python lists to functions expecting int*.
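
To make the call-site case concrete, here is a minimal sketch in plain Python with ctypes rather than Cython. It is only a model of the idea, not what Cython generates; call_with_char_ptr and read_back are hypothetical names, and the UTF-8 encoding is an assumption. The point is that the encoded bytes object is held in a local for exactly the duration of the call, so the char* never outlives its buffer.

```python
import ctypes


def call_with_char_ptr(text, func):
    """Encode `text`, hand a char* to `func`, and keep the buffer alive meanwhile."""
    encoded = text.encode("utf-8")   # this bytes object owns the buffer
    ptr = ctypes.c_char_p(encoded)   # char* pointing into `encoded`
    # `encoded` is a local, so it stays referenced until this function
    # returns, which covers the whole call. The callee must not keep the
    # pointer beyond that.
    return func(ptr)


def read_back(ptr):
    """Hypothetical stand-in for a C function that takes a char*."""
    return ptr.value                 # copies the NUL-terminated bytes out


print(call_with_char_ptr("abcdefg", read_back))
```

Outside such a scoped conversion, as in Stefan's "u = None" example, nothing pins the encoded object, which is why the general assignment case stays problematic.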
(Then one could use the buffer protocols to get the pointer, as you say.)

Dag Sverre
