Hi Greg,

Greg Ewing, 12.12.2009 03:02:
> I've had an idea that might help with making the
> encoding and decoding of unicode strings more
> automatic.
> 
> Suppose we have a way of expressing a type parameterised
> with an encoding, maybe something like
> 
>    encoding[name]
> 
> We could have a few predefined ones, such as
> 
>    ctypedef encoding['ascii'] ascii
>    ctypedef encoding['utf8'] utf8
>    ctypedef encoding['latin1'] latin1
> 
> These are Python object types. Internally they're
> represented as bytes objects, but the compiler knows
> statically that they have an encoding associated with
> them, and the appropriate encoding and decoding
> operations are performed when coercing from and to
> strings.
> 
> Being bytes, they can also be cast to char * without
> any problem. So we can write things like
> 
>    cdef extern from "foo.h":
>      void cflump(char *)
> 
>    def flump(utf8 s):
>      cflump(s)
> 
> Now we can pass a unicode string to flump() and it will
> first be encoded to bytes as utf8, and then passed to
> cflump() as a char *.
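
For reference, here is a rough plain-Python sketch of what that coercion would amount to if written by hand. The helper name coerce_to_encoded_bytes is hypothetical, the C call is replaced by a stand-in that just returns the bytes, and passing bytes objects through unchanged is my assumption, since the proposal does not say what should happen to them:

```python
def coerce_to_encoded_bytes(value, encoding):
    """Hypothetical helper mimicking coercion to the proposed encoding[name] type."""
    if isinstance(value, str):
        # Unicode text is encoded with the statically known encoding.
        return value.encode(encoding)
    if isinstance(value, bytes):
        # Assumption: bytes are taken as already encoded and passed through.
        return value
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def flump(s):
    # Equivalent of declaring the argument as 'utf8 s' in the proposal.
    data = coerce_to_encoded_bytes(s, "utf8")
    # Stand-in for cflump(data); the real call would receive a char *.
    return data
```

With the proposed syntax, the compiler would generate the equivalent of the helper call from the argument declaration alone.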

Thanks for bringing my recent proposals back into the discussion. I
actually prefer something closer to the existing buffer syntax, but I'm
certainly +1 on such a feature. Note that Cython has cpdef functions
already, so adding a return type to def functions isn't far off.

However, the above describes a new feature, not a solution to Robert's
goal of making string recoding fully automatic for existing code.

Stefan

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev