Christopher Barker wrote:
> Robert Bradshaw wrote:
>
>>Would
>>
>>def flump(utf8 s):
>>    return s
>>
>>return a bytes object?
>
> I would expect it to return a unicode object -- in Python, I'd expect
> bytes+encoding to be returned as a unicode object -- it's the only way
> not to lose the encoding information.

I've been thinking something similar myself. Perhaps there
should be a rule that the encoded-bytes types are only for
"internal" use by Cython code, and whenever one gets coerced
to a generic Python object, it gets decoded into a unicode
string.

I think that would allow us to drop the C versions of the
encoded types altogether, and write things like

   cdef extern from "somewhere.h":
       char *cflump(char *)

   def utf8 flump(utf8 s):
       return cflump(s)

Advantages of this are that all the declarations are now
symmetrical and there is no need for any encoding
declarations on the C side.

A disadvantage is that it may not be obvious that flump()
actually returns a unicode string despite being declared
as returning utf8.

If you wanted it to actually return a bytes object,
you would have to write

   def bytes flump(utf8 s):
       return cflump(s)

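To make the difference between the two declarations concrete, here is a
rough emulation in plain Python 3 of what the proposed rule might do at the
function boundary. It is only a sketch, not Cython syntax: the names
flump_utf8 and flump_bytes are made up for illustration, and cflump is a
stand-in for the C function that simply upper-cases ASCII letters.

```python
# Plain-Python emulation of the proposed coercion rule.
# All names here are hypothetical; cflump stands in for the C function.

def cflump(raw: bytes) -> bytes:
    """Stand-in for 'char *cflump(char *)': bytes in, bytes out."""
    return raw.upper()  # only ASCII letters change; other bytes pass through


def flump_utf8(s: str) -> str:
    """Emulates 'def utf8 flump(utf8 s)'."""
    internal = s.encode("utf-8")    # unicode argument becomes encoded bytes
    result = cflump(internal)       # the C side sees only raw bytes
    return result.decode("utf-8")   # declared utf8: decoded on the way out


def flump_bytes(s: str) -> bytes:
    """Emulates 'def bytes flump(utf8 s)'."""
    internal = s.encode("utf-8")    # same conversion on entry
    return cflump(internal)         # declared bytes: returned undecoded


if __name__ == "__main__":
    print(repr(flump_utf8("na\u00efve")))
    print(repr(flump_bytes("na\u00efve")))
```

The only difference between the two functions is the final step, which is
the point of the proposal: the declared return type decides whether the
caller sees a unicode string or a bytes object.
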
>>Will there be a different
>>handling for function signatures, or will it work the same everywhere? I.e.
>>will a "def func(bytes b)" function always accept unicode,

Not under my version of the proposal -- there is only
automatic conversion between unicode and a bytes type
with a declared encoding. Unicode and plain bytes are
still incompatible.

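
As a sketch of that compatibility rule, again in plain Python 3 rather than
Cython: the helper coerce_argument below is hypothetical, with encoding=None
standing for a parameter declared as plain bytes and a codec name standing
for a declared encoding such as utf8. Letting an existing bytes object pass
through unchanged is an assumption of this sketch, not something the
proposal spells out.

```python
from typing import Optional


def coerce_argument(value, encoding: Optional[str]) -> bytes:
    """Hypothetical entry check for a bytes-typed parameter.

    encoding is None for plain 'bytes', or a codec name such as
    'utf-8' for a parameter declared with an encoding.
    """
    if isinstance(value, bytes):
        # Assumption: bytes objects are accepted as they are.
        return value
    if isinstance(value, str):
        if encoding is None:
            # Unicode and plain bytes remain incompatible.
            raise TypeError("plain bytes parameter does not accept unicode")
        # A declared encoding makes the conversion automatic.
        return value.encode(encoding)
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


if __name__ == "__main__":
    print(coerce_argument("abc", "utf-8"))
    try:
        coerce_argument("abc", None)
    except TypeError as exc:
        print("rejected:", exc)
```
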
--
Greg
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev