On Dec 12, 2009, at 4:05 PM, Greg Ewing wrote:

> The reason for the intermediate bytes object is that it neatly
> solves the memory management issue that arises if you try to
> go directly from str to char *, and it does it without having
> to make a special case of function arguments.

I agree, that is nice.

>> My whole goal was to not have to be explicit at each point, but to be
>> able to specify the encoding (or at least to use a default encoding)
>> for an entire file
>
> Yes, I realise it doesn't fully address your use case.
> It's more aimed at people who think a blanket declaration
> would be too implicit and error-prone.

With the exception of function argument declarations, I think the
people who don't want blanket declarations are already fairly well
served with encode() and decode().

     cdef bytes[encoding='utf8'] ss = s

or even

     cdef utf8 ss = s

is not (to me at least) clearer than

     cdef bytes ss = s.encode('utf8')

which requires no new syntax or types.
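
As a rough illustration of why the intermediate bytes object matters, here
is a sketch in plain Python 3 using ctypes rather than Cython. The helper
name to_c_string is made up for this example; the point is only that the
encoded bytes object has to stay alive for as long as the char * is in use.

```python
import ctypes


def to_c_string(text, encoding="utf8"):
    """Hypothetical helper: encode explicitly and return both the bytes
    object and a char * view of it, so the caller keeps the buffer alive."""
    encoded = text.encode(encoding)    # explicit, like s.encode('utf8')
    c_ptr = ctypes.c_char_p(encoded)   # char * pointing into `encoded`
    return encoded, c_ptr


encoded, c_ptr = to_c_string("caf\u00e9")
# Reading the pointer back yields the same bytes (up to the first NUL).
round_trip = c_ptr.value.decode("utf8")
```

Returning the bytes object alongside the pointer is one way to make the
ownership explicit; other designs are possible.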

> However, it seems to be difficult to implement fully
> automatic conversions directly between str and char *
> except for a very few encodings -- ascii and utf8 --
> and even the latter would appear to hinge on a
> deprecated feature held over from Py2.

I think ascii and utf8 alone would cover a broad range of use cases,
especially for those who want more global declarations. The defenc
slot is a real concern though.
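
To make the difference between the two encodings concrete, here is a small
sketch in plain Python 3 (not Cython). The helper try_encode is hypothetical
and only stands in for whatever a default-encoding conversion would do:
ascii rejects any non-ASCII text, while utf8 can encode it.

```python
def try_encode(text, encoding):
    """Hypothetical helper: return the encoded bytes, or None if the
    encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


plain = "hello"
accented = "na\u00efve"

ascii_plain = try_encode(plain, "ascii")        # pure ASCII text encodes fine
utf8_plain = try_encode(plain, "utf8")          # same bytes as ascii here
ascii_accented = try_encode(accented, "ascii")  # fails, so None
utf8_accented = try_encode(accented, "utf8")    # multi-byte sequence for the accent
```

So a file-wide ascii default would need some defined failure behaviour for
non-ASCII input, whereas a utf8 default would not hit this case for
ordinary text.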

> The advantages of my proposal are that it would work
> for any encoding and wouldn't be restricted to function
> arguments.

I think this is a valuable point.

- Robert

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
