Stefan Behnel wrote:
> Robert Bradshaw, 06.09.2010 19:01:
>   
>> On Mon, Sep 6, 2010 at 9:36 AM, Dag Sverre Seljebotn wrote:
>>> I don't understand this suggestion. What happens in each of these cases,
>>> for different settings of "from __future__ import unicode_literals"?
>>>
>>> cdef char* x1 = 'abc\u0001'
>>>       
>
> As I said in my other mail, I don't think anyone would use the above in 
> real code. The alternative below is just too obvious and simple.
>
>
>   
>>> cdef char* x2 = 'abc\x01'
>>>       
>> from __future__ import unicode_literals (or -3)
>>
>>      len(x1) == 4
>>      len(x2) == 4
>>
>> Otherwise
>>
>>      len(x1) == 9
>>      len(x2) == 4
>>     
>
> Hmm, now *that* looks unexpected to me. The way I see it, a C string is the 
> C equivalent of a Python byte string and should always and predictably 
> behave like a Python byte string, regardless of the way Python object 
> literals are handled.
>   
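For reference, the numbers Robert quotes are just plain Python 2 escape 
semantics (a quick check at the interpreter, no Cython involved):

# plain Python 2, no future import:
len('abc\u0001')   # == 9: '\u' is not an escape in a byte string,
                   # so the six characters \u0001 stay literal
len('abc\x01')     # == 4: '\x01' is a one-byte escape

# with "from __future__ import unicode_literals" at the module top:
len('abc\u0001')   # == 4: a unicode literal now, '\u0001' is one character
len('abc\x01')     # == 4: unchanged
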
While the "cdef char*" case isn't that horrible,

f('abc\x01')

is. Imagine adding a type to the signature of f and suddenly getting 
different data passed in.
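
To make that hazard concrete, a minimal sketch (hypothetical f and g, 
compiled as Cython):

from libc.string cimport strlen

def f(s):
    return len(s)       # s arrives as a Python object

def g(char* s):
    return strlen(s)    # same literal at the call site, but now a
                        # C string parameter

If the meaning of the literal in f('abc\x01') versus g('abc\x01') depends 
on the declared parameter type, the two calls can receive different data 
from the very same source text.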

I really, really don't like having the value of a literal depend on the 
type of the variable it gets assigned to (I know, I know about ints and 
so on, but let's try to keep the number of such instances down).

My vote is for identifying a set of completely safe string literals (no 
\x or \u escapes, ASCII-only) that mean the same thing regardless of any 
setting, and allowing those. For anything else, demand a b'' prefix 
before it can be assigned to a char*. Putting in a b'' isn't THAT hard.
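
Under that rule, a hypothetical sketch of what would and wouldn't be 
accepted:

cdef char* ok = 'abc'             # ASCII-only, no escapes: identical
                                  # meaning under every literals setting
cdef char* explicit = b'abc\x01'  # escapes via an explicit bytes literal
# cdef char* bad = 'abc\x01'      # would become a compile-time error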

Dag Sverre