On Dec 12, 2009, at 4:05 PM, Greg Ewing wrote:
> The reason for the intermediate bytes object is that it neatly
> solves the memory management issue that arises if you try to
> go directly from str to char *, and it does it without having
> to make a special case of function arguments.
I agree, that is nice.
>> My whole goal was to not have to be explicit at each point, but to be
>> able to specify the encoding (or at least to use a default encoding)
>> for an entire file
>
> Yes, I realise it doesn't fully address your use case.
> It's more aimed at people who think a blanket declaration
> would be too implicit and error-prone.
With the exception of function argument declarations, I think the
people who don't want blanket declarations are already fairly well
served by encode() and decode().
    cdef bytes[encoding='utf8'] ss = s
or even
    cdef utf8 ss = s
is not (to me at least) clearer than
    cdef bytes ss = s.encode('utf8')
which requires no new syntax or types.
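To make the "default encoding for an entire file" idea concrete, here is a
rough plain-Python sketch of doing it with explicit encode() calls only. The
names FILE_ENCODING and to_bytes are hypothetical, made up for illustration;
neither is part of Cython or of any proposal in this thread.

```python
# Hypothetical per-module default encoding; not a Cython feature.
FILE_ENCODING = "utf8"


def to_bytes(s, encoding=None):
    """Encode a str to bytes, falling back to the module-wide default."""
    if isinstance(s, bytes):
        # Already encoded: pass it through unchanged.
        return s
    return s.encode(encoding or FILE_ENCODING)


# Uses the module default (utf8).
ss = to_bytes("caf\u00e9")

# An explicit encoding overrides the default at a single call site.
latin = to_bytes("caf\u00e9", "latin-1")
```

This still leaves the conversion visible at each call site, which is exactly
the part I would like to avoid having to write, but it shows that the
explicit camp needs nothing new from the language.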
> However, it seems to be difficult to implement fully
> automatic conversions directly between str and char *
> except for a very few encodings -- ascii and utf8 --
> and even the latter would appear to hinge on a
> deprecated feature held over from Py2.
I think ascii and utf8 alone would cover a broad range of use cases,
especially for those who want more global declarations. The defenc
slot is a real concern though.
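As a small plain-Python illustration of how the two encodings differ in
practice (a sketch only; the helper name try_encode is made up here): ascii
rejects anything outside the 7-bit range, while utf8 can encode any ordinary
str and agrees with ascii on pure-ASCII text.

```python
def try_encode(s, encoding):
    """Return the encoded bytes, or None if s cannot be encoded."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError:
        return None


plain = "hello"
accented = "caf\u00e9"

# Pure-ASCII text encodes identically under both encodings.
same = try_encode(plain, "ascii") == try_encode(plain, "utf8")

# Non-ASCII text fails under ascii but succeeds under utf8.
ascii_result = try_encode(accented, "ascii")
utf8_result = try_encode(accented, "utf8")
```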
> The advantages of my proposal are that it would work
> for any encoding and wouldn't be restricted to function
> arguments.
I think this is a valuable point.
- Robert