Dag Sverre Seljebotn wrote:
>>> - Have a separate mechanism for specifying what encoding should be used 
>>> for conversion to C buffers.
>>>     
>> I don't see a reason to go that route, given the existing PEP.
>>   
> I argued: The PEP is about **input** source code. It declares the 
> encoding of your source file, which likely depends on the preferred 
> programming environment of the Cython coder, and has absolutely nothing 
> to do with the encoding of the runtime C library, which is likely on a 
> different system with a potentially different encoding.
> 
> The moment the compiled behaviour of your code depends on the encoding 
> of the source file, you have big problems, and the exact reason for the 
> PEP was to *avoid* the behaviour you seem to want.

I want distinct behaviour for byte sequences and for unicode character
sequences.

If you use a byte (string) literal in your code, Cython must not alter it
(except for PEP 263 input encoding) and must support any conversion from and
to a char*. This works fine with current Cython as long as you use the same
input encoding for Cython code and C code.
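To illustrate the byte literal case, here is a minimal sketch in plain Python
(not Cython). It assumes the literal's bytes reach the char* exactly as they
are stored in the source file. The names LITERAL_TEXT and literal_bytes are
made up for this example.

```python
# Hypothetical illustration: the same literal text, saved in two different
# source encodings, carries different raw bytes.
LITERAL_TEXT = "caf\u00e9"  # the text a programmer types inside the literal


def literal_bytes(source_encoding):
    """Bytes a byte literal would hold if the source file were saved in
    source_encoding and passed through unaltered (assumption)."""
    return LITERAL_TEXT.encode(source_encoding)


latin1_bytes = literal_bytes("iso-8859-1")  # one byte for the accented char
utf8_bytes = literal_bytes("utf-8")         # two bytes for the accented char

print(latin1_bytes)
print(utf8_bytes)
```

The two byte strings differ, which is why the C side has to agree with the
encoding of the Cython source file.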

If you use a unicode literal in your code, Cython must take care that it gets
correctly converted from source code bytes to a unicode character sequence
(PEP 263), which then behaves the same on all systems. Cython must raise a
compiler error if you try to convert it to a char*. Both things work just fine
with current Cython as long as your code is UTF-8 encoded.
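The unicode literal case can be sketched the same way in plain Python (again
not Cython, and only an illustration). It assumes the compiler decodes the
source bytes with the encoding declared per PEP 263. The helper name
decode_literal is hypothetical.

```python
# Hypothetical illustration: decoding source bytes with the declared
# encoding yields the same unicode character sequence either way.
def decode_literal(source_bytes, declared_encoding):
    """Turn the raw bytes of a unicode literal into characters, using the
    encoding declared for the source file (assumption)."""
    return source_bytes.decode(declared_encoding)


from_latin1 = decode_literal(b"caf\xe9", "iso-8859-1")
from_utf8 = decode_literal(b"caf\xc3\xa9", "utf-8")

# Both files produce the same characters.
same_text = from_latin1 == from_utf8

# Getting bytes for C code requires an explicit, named encoding step.
explicit_bytes = from_utf8.encode("utf-8")
```

There is no implicit path from the character sequence to bytes here; the
caller has to pick the target encoding explicitly.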

I don't see why anything beyond PEP 263 is needed here.

Stefan
