Re: [Cython] Another string encoding idea

Robert Bradshaw Wed, 02 Dec 2009 16:50:00 -0800

On Dec 1, 2009, at 11:01 AM, Christopher Barker wrote:

> Robert Bradshaw wrote:
>> On Nov 30, 2009, at 10:14 AM, Christopher Barker wrote:
>>> If text, then the natural python3 data type is a
>>> unicode string. If data, then bytes -- we should really follow
>>> that as best we can.
>>
>> Exactly.
>>
>> unicode = char* + length + encoding
>> bytes = char* + length
>>
>>> It needs to be easy, and perhaps automatic, to write code that
>>> crosses the Python-C border in these cases.
>>>
>>> I've lost track of what has been proposed here, but it seems to me
>>> that we need a Cython type:
>>>
>>> ANSI_string  (not that that's what it should be called)
>>>
>>> It seems this would handle the very common case of libraries
>>> expecting simple ascii strings for flags, etc.
>>
>> That is another idea. A new type would handle conversion to char*,
>> but not from char*. Bytes objects would still be returned by default
>> unless one did something extra there (which is fine for some uses,
>> but for others str is more natural).
>
> This doesn't quite fit my vision -- I was thinking that the
> "ANSI_string" type would look like a text string in Python --
> therefore a Unicode object, certainly for py3. Py2 is a mess in this
> regard no matter how you slice it, but I would think a string or
> Unicode object would make more sense than bytes -- the idea is that
> this would be used explicitly for "text", not data -- so the user
> would not want to get bytes back.


Yes, this is the case that I'm thinking of as well. I wasn't seeing
how a new type would fix the

     cdef char* s = ...
     return s

case.
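
To make that concrete, here is a rough sketch in plain Python (not Cython), with a bytes literal standing in for whatever the C side hands back. The helper name from_c_string and the choice of UTF-8 are assumptions made up for illustration, not anything Cython provides. It only shows that bytes carry no encoding, so someone has to supply one before the result is text:

```python
# Stand-in for a value returned from C as char* + length (hypothetical data).
RAW = b"caf\xc3\xa9"


def from_c_string(raw, encoding):
    """Hypothetical helper: turn bytes into text using an explicit encoding.

    bytes   = char* + length
    unicode = char* + length + encoding
    """
    return raw.decode(encoding)


# Without an encoding, all we have is bytes, and len() counts bytes.
byte_length = len(RAW)

# With the encoding supplied, we get text, and len() counts characters.
text = from_c_string(RAW, "utf-8")
char_length = len(text)

# A wrong guess at the encoding silently yields a different string.
mojibake = from_c_string(RAW, "latin-1")

# The easy direction: an ASCII-only flag converts to bytes without surprises.
flag_bytes = "rb".encode("ascii")
```

The str-to-char* direction is the part a new type could cover; the char*-to-str direction still needs the encoding to come from somewhere.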

- Robert


