Dag Sverre Seljebotn wrote:
>>> - Have a separate mechanism for specifying what encoding should be used
>>> for conversion to C buffers.
>>>
>> I don't see a reason to go that route, given the existing PEP.
>>
> I argued: The PEP is about **input** source code. It declares the
> encoding of your source file, which likely depends on the preferred
> programming environment of the Cython coder, and has absolutely nothing
> to do with the encoding of the runtime C library, which is likely on a
> different system with a potentially different encoding.
>
> The moment the compiled behaviour of your code depends on the encoding
> of the source file, you have big problems, and the exact reason for the
> PEP was to *avoid* the behaviour you seem to want.

I want to have distinct behaviour between byte sequences and unicode
character sequences.

If you use a byte (string) literal in your code, Cython must not alter it
(except for PEP 263 input encoding) and must support any conversion from
and to a char*. This works fine with current Cython as long as you use the
same input encoding for Cython code and C code.

If you use a unicode literal in your code, Cython must take care that it
gets correctly converted from source code bytes to a unicode character
sequence (PEP 263), which then behaves the same on all systems. Cython must
raise a compiler error if you try to convert it to a char*.

Both things work just fine with current Cython as long as your code is
UTF-8 encoded. I don't see why anything beyond PEP 263 is needed here.

Stefan
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev
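
To illustrate the byte/unicode distinction argued for above, here is a
minimal sketch in plain Python (not Cython, and not code from this thread).
The helper to_c_buffer is hypothetical: it only stands in for a char*
conversion that accepts bytes and rejects unicode text, which is an
assumption made for illustration rather than Cython's actual API.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: plain Python, with a hypothetical helper.

text = u"gr\u00fc\u00dfe"      # unicode literal: a sequence of characters
data = b"gr\xc3\xbc\xc3\x9fe"  # byte literal: raw bytes (here the UTF-8 form)


def to_c_buffer(value):
    """Hypothetical stand-in for a char* conversion: accepts bytes only."""
    if not isinstance(value, bytes):
        raise TypeError("unicode text needs an explicit .encode() first")
    return value


# The same characters map to different bytes under different encodings,
# so the byte form is not a property of the text itself.
utf8_form = text.encode("utf-8")
latin1_form = text.encode("iso-8859-1")

# Decoding each byte form with its own encoding yields identical text,
# which is why a unicode literal behaves the same on all systems.
same_text = utf8_form.decode("utf-8") == latin1_form.decode("iso-8859-1")
```

Passing data to to_c_buffer returns the bytes unchanged, while passing
text raises a TypeError, mirroring the compile-time error described above.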
