Hi,

just answering the first part of your comments for now.

Dag Sverre Seljebotn wrote:
>> Also, I really like the fact that "test" is a plain byte string in Cython
>> that can directly be converted to a C char*, depending on its use. This
>> shouldn't change, even if Py3 dictates that this literal becomes a Unicode
>> string.
>>
> What exactly are the consequences here... if it is just about the 
> runtime object used then I suppose it can be inferred from context?

"In the face of ambiguity, refuse the temptation to guess." :)

"Inferring" the difference between str and unicode literals from context
seems like the wrong thing to do.


> (I.e., coercion to char* deals with it...) Or does it mean that string 
> literals converted to char* should be UTF-8 strings or something?

You cannot automatically convert a unicode object to a char*, since that
requires choosing an encoding. That's why I said that a byte string makes
more sense in the Cython context.
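
To illustrate the point: the same unicode text maps to different byte
sequences depending on the encoding, so a coercion to char* would have to
guess one. A minimal sketch in plain Python 3 (the sample text and variable
names are only for illustration, not taken from Cython):

```python
# The same text yields different bytes under different encodings,
# so a unicode -> char* coercion would have to pick one of them.
text = "caf\u00e9"  # illustrative sample text: 'café'

utf8_bytes = text.encode("utf-8")      # 5 bytes: b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # 4 bytes: b'caf\xe9'

# Only an explicitly encoded byte string has a well-defined char* form.
print(utf8_bytes, latin1_bytes)
```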


> What is the current behaviour for string literals anyway..probably that 
> the encoding of the Cython source gets carried through to the strings in 
> C source?

Yes, they are passed through to the C compiler as they are - although that's
not really what I'd call "well defined semantics". We can improve on this by
supporting PEP 263.

http://www.python.org/doc/2.3/whatsnew/section-encodings.html
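
For reference, PEP 263 lets a source file declare its encoding in a comment
on the first or second line. A rough sketch of how such a declaration can be
detected, using `tokenize.detect_encoding` from the Python 3 standard library.
This shows what CPython offers, not anything Cython does today, and the sample
source is made up:

```python
import io
import tokenize

# Made-up source file: a PEP 263 declaration followed by a
# Latin-1 encoded string literal.
source = b'# -*- coding: latin-1 -*-\ns = "caf\xe9"\n'

# detect_encoding reads at most two lines looking for the declaration.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# With the declared encoding known, the literal's bytes have a defined meaning.
decoded = source.decode(encoding)
print(encoding, decoded.splitlines()[1])
```

Without the declaration, the byte 0xE9 in the literal would have no defined
interpretation.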

The current string literal semantics in Cython are:

"text" is a literal byte sequence that translates directly to a Py2 str object
or a C char*.

u"text" is a unicode literal that is parsed as a UTF-8 encoded byte sequence
and converted into a Python unicode object (at runtime).
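
The two cases can be modelled roughly in Python 3 terms as follows. This is
only a sketch of the semantics described above, assuming a UTF-8 encoded
source file; the variable names are illustrative:

```python
# "text": the bytes are taken verbatim from the source file
# (a UTF-8 encoded source file is assumed here).
byte_literal = b"caf\xc3\xa9"

# u"text": the same source bytes, decoded as UTF-8 into a
# unicode object at runtime.
unicode_literal = byte_literal.decode("utf-8")

print(len(byte_literal), len(unicode_literal))  # 5 bytes vs. 4 characters
```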

Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
