Hi, just answering the first part of your comments for now.

Dag Sverre Seljebotn wrote:
>> Also, I really like the fact that "test" is a plain byte string in Cython
>> that can directly be converted to a C char*, depending on its use. This
>> shouldn't change, even if Py3 dictates that this literal becomes a Unicode
>> string.
>
> What exactly are the consequences here... if it is just about the
> runtime object used then I suppose it can be inferred from context?

"In the face of ambiguity, refuse the temptation to guess." :)

Somehow "inferring" the difference between str and unicode literals is the
wrong thing to do.

> (I.e., coercion to char* deals with it...) Or does it mean that string
> literals converted to char* should be UTF-8 strings or something?

You cannot automatically convert a unicode object to a char*; that's why I
said that a byte string makes more sense in the Cython context.

> What is the current behaviour for string literals anyway... probably that
> the encoding of the Cython source gets carried through to the strings in
> C source?

Yes, they are passed through to the C compiler as they are - although that's
not really what I'd call "well defined semantics". We can improve on this by
supporting PEP 263.

http://www.python.org/doc/2.3/whatsnew/section-encodings.html

The current string literal semantics in Cython are:

"text" is a literal byte sequence that translates directly to a Py2 str
object or a C char*.

u"text" is a unicode literal that is parsed as a UTF-8 encoded byte sequence
and converted into a Python unicode object (at runtime).

Stefan
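
P.S. To make the byte string vs. unicode distinction concrete, here is a
rough sketch in plain Python 3 (not Cython itself, and not how the compiler
implements anything). The helper name as_c_bytes and the sample word are
made up for illustration. It only shows why a byte string maps onto a char*
as it is, while a unicode object needs an explicitly named encoding first.

```python
# Illustration only: plain Python 3, not Cython. All names are hypothetical.

def as_c_bytes(value, encoding=None):
    """Return the raw bytes that a C char* could point at.

    Byte strings pass through unchanged. Unicode strings are refused
    unless the caller names an encoding explicitly (no guessing).
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        if encoding is None:
            raise TypeError("unicode string needs an explicit encoding")
        return value.encode(encoding)
    raise TypeError("expected bytes or str")


# A byte literal is already a byte sequence.
byte_literal = b"text"

# Bytes as they would sit in a UTF-8 encoded source file ...
source_bytes = "gr\u00fc\u00df".encode("utf-8")

# ... and the unicode object obtained by decoding them at runtime.
unicode_literal = source_bytes.decode("utf-8")

# The same unicode text gives different bytes under different encodings,
# which is why the conversion cannot be inferred from context.
utf8_bytes = as_c_bytes(unicode_literal, "utf-8")
latin1_bytes = as_c_bytes(unicode_literal, "latin-1")
```

The two results at the end differ in length and content even though they
come from the same unicode object, so there is no single "obvious" char*
for it.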
