Nathaniel Smith wrote:
> On Sat, Dec 12, 2009 at 1:51 AM, Stefan Behnel <[email protected]>
> wrote:
>> Nathaniel Smith, 12.12.2009 10:05:
>>> After upgrading to Cython 0.12 today (Python 2.5.2, x86-64, linux),
>>> some code of mine broke. Specifically, it's code for reading a binary
>>> format, and in the tests I had a string that made Cython fail to
>>> compile with the error:
>>>   String decoding as 'UTF-8' failed. Consider using a byte string or
>>> unicode string explicitly, or adjust the source code encoding.
>>>
>>> As an example, here's a complete file that Cython 0.12 will refuse to
>>> compile:
>>> -------------
>>> s = "\x12\x34\x9f\x65"
>>> -------------
>>>
>>> I'm not sure why it's nattering about the source code encoding when
>>> the problem is with explicitly quoted byte values
>>
>> Because you are using a 'str' literal, which needs to be decoded in
>> Python 3 to become the equivalent str (i.e. unicode) object. A check for
>> that is required for the semantics of the 'str' type in Cython, as it
>> would otherwise be impossible to switch the type in the generated C code -
>> you simply can't write out a unicode literal into C in a portable way.
>>
>> The relevant CEP is here:
>>
>> http://wiki.cython.org/enhancements/stringliterals
>
> Sure, I know. But I'm not using Python 3 (I'm using 2.5.2, as
> mentioned), and that page says "Unmarked string literals, when used in
> a Python context, would be [...] byte strings in Py2", and the table
> labeled "Proposal" seems to imply that in Py2, cython will treat "foo"
> and b"foo" as equivalent (just as CPython would). Similarly, under
> "Cons" it notes that the changes under discussion may cause backwards
> compatibility problems when moving from Py2 to Py3, but it does not
> note that they also cause (IMHO rather more serious) backwards
> incompatibility between Cython 0.11+Py2 and Cython 0.12+Py2.
>
>>> but... my question
>>> is, I can fix this by adding a "b" sigil on the front, but that's
>>> incompatible with earlier versions of Cython.
>>
>> Yes, bytes literals were fixed up fairly recently - may have been 0.11 or
>> so. Given that they were partly broken before that, I don't really see why
>> you would want to support earlier versions of Cython anyway.
>
> Oh, does that work in 0.11? All the documentation I had found (e.g. at
> the top of that page you linked) only mentions py3-style string
> handling in the context of 0.12. That solves my personal problem.
>
>>> (And was it really intentional to break Python source compatibility so
>>> badly?)
>>
>> What do you mean? And what version of Python are you referring to?
>
> Just that -- AFAICT -- I can no longer rely on Py2 syntax for
> specifying string literals in Py2 extension modules. That seems odd.

The point is that Cython doesn't know at compile time whether the target is
Py2 or Py3; the same generated C source has to work for both.

The alternative would be something like allowing the code on Py2 but
making the module always raise an exception when it is loaded in Py3.
That doesn't seem like an improvement to me, though.
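
To illustrate why the unprefixed literal is rejected, here is a rough
sketch in plain Python. It is not Cython's actual implementation, and
decodes_as_text is a made-up helper name; it only mimics the kind of check
the error message describes, i.e. whether the literal's bytes decode as
UTF-8 so that they could become a str (unicode) object on Py3.

```python
def decodes_as_text(raw, encoding="utf-8"):
    """Return True if the raw bytes of a literal decode in the given encoding.

    Hypothetical helper for illustration only; it is not part of Cython.
    """
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# The bytes from the failing example: 0x9f cannot start a UTF-8 sequence.
binary_literal = b"\x12\x34\x9f\x65"

# The same bytes without 0x9f are plain ASCII, so they decode fine.
ascii_literal = b"\x12\x34\x65"

print(decodes_as_text(binary_literal))  # False
print(decodes_as_text(ascii_literal))   # True
```

Marking the literal as b"\x12\x34\x9f\x65" sidesteps the question, since
a bytes literal is never decoded and stays a byte string on both Py2 and
Py3.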

Dag Sverre

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
