Robert Bradshaw, 11.05.2010 06:16:
> On May 10, 2010, at 5:27 PM, Paul Harrison wrote:
>> Also, I find I can no longer cast str to char* (again with Python
>> 2.6.1). The documentation seems to indicate this should only be a
>> problem in Python 3, or have I missed something?
Regardless of any Py2/Py3 portability issues, you have to take care which
type of string you use for which purpose. Python 2 and the earlier Cython
versions that targeted Py2 were very lax here, which is why a lot of code
has been written that has problems (not only in Py3, see the infamous
UnicodeEncodeError in Py2). These problems inevitably show up when migrating
to Py3, which is rather late for most current users. Cython has therefore
opted to follow Py3 in some major aspects to make it easier for users to
write correct and portable code. This is very important for Cython code,
which is supposed to run on Py2 and Py3 platforms without regenerating the
C code.
> Cython is super strict about this right now even if you're only using
> Py2 (too strict in my opinion, but that's a whole can of worms there
> that we're not going to re-visit at this time...) Cast to an object
> first, i.e. if x is a str, do
>
>     cdef char* s = <object>x
That's a hack. In any case, it won't work in Py3.
> (Short of type checking, is there a good way to encode in 3.x and
> still allow str in 2.x?)
Given that Py2 tries auto-conversion between byte strings and unicode
strings, you can do
    byte_string = some_string.encode('ASCII')
This will fail for byte strings in Py3 and work for ASCII-only unicode
strings. In Py2, it will work identically for unicode strings, but it will
implicitly auto-decode a byte string to unicode before it gets encoded.
This is horribly wasteful, but it works for ASCII strings on any Python
runtime that has an ASCII-compatible default encoding. That implies it will
not work on runtimes whose default encoding is *not* ASCII-compatible, but
that's rare enough. It's a "mostly good enough" trick in doctests, for
example.
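
To make the behaviour concrete, here is a small sketch of how the trick
plays out on a Py3 runtime. The helper name ascii_bytes() is made up for
illustration only; the outcomes in the comments describe Py3, while in Py2
the byte string case would go through the implicit auto-decoding instead
of failing.

```python
def ascii_bytes(some_string):
    # The trick discussed above: rely on .encode() to get a byte string.
    return some_string.encode('ASCII')

# ASCII-only unicode string: encodes fine on Py2 and Py3.
result = ascii_bytes(u'abc')

# Byte string on Py3: bytes objects have no .encode() method.
try:
    ascii_bytes(b'abc')
    bytes_failed = False
except AttributeError:
    bytes_failed = True

# Non-ASCII unicode string: cannot be represented in ASCII.
try:
    ascii_bytes(u'\xe4bc')
    non_ascii_failed = False
except UnicodeEncodeError:
    non_ascii_failed = True
```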
Given how wasteful the above is in Py2 and the fact that it isn't even 100%
portable, my general answer to this question is "no". It's truly a good
thing to be explicit about the type of string you are dealing with. It
avoids a lot of problems, now and later, and keeps you from writing
error-prone code.
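
If an API really has to accept both string types, being explicit pretty
much means doing the type check at the boundary. A minimal sketch, assuming
UTF-8 is the encoding you want (the helper name to_bytes() and the default
encoding are assumptions made for this example, not anything Cython
provides):

```python
def to_bytes(s, encoding='UTF-8'):
    # Hypothetical helper: byte strings pass through unchanged,
    # unicode strings are encoded with an explicitly named encoding.
    if isinstance(s, bytes):          # str in Py2, bytes in Py3
        return s
    if isinstance(s, type(u'')):      # unicode in Py2, str in Py3
        return s.encode(encoding)
    raise TypeError("expected a byte string or unicode string, got %s"
                    % type(s).__name__)

raw = to_bytes(b'abc')        # already bytes, returned as is
encoded = to_bytes(u'\xe4')   # explicit UTF-8 encoding
```

The resulting byte string is then a well-defined input for a char*
conversion, instead of depending on whatever the runtime's default
encoding happens to be.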
Stefan