Hi Greg,

Greg Ewing, 12.12.2009 03:02:
> I've had an idea that might help with making the
> encoding and decoding of unicode strings more
> automatic.
>
> Suppose we have a way of expressing a type parameterised
> with an encoding, maybe something like
>
>    encoding[name]
>
> We could have a few predefined ones, such as
>
>    ctypedef encoding['ascii'] ascii
>    ctypedef encoding['utf8'] utf8
>    ctypedef encoding['latin1'] latin1
>
> These are Python object types. Internally they're
> represented as bytes objects, but the compiler knows
> statically that they have an encoding associated with
> them, and the appropriate encoding and decoding
> operations are performed when coercing from and to
> strings.
>
> Being bytes, they can also be cast to char * without
> any problem. So we can write things like
>
>    cdef extern from "foo.h":
>        void cflump(char *)
>
>    def flump(utf8 s):
>        cflump(s)
>
> Now we can pass a unicode string to flump() and it will
> first be encoded to bytes as utf8, and then passed to
> cflump() as a char *.
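
To make the quoted behaviour concrete, here is a rough sketch in plain Python 3 of what the coercion might amount to at runtime. This is only an emulation under my own assumptions: `EncodedBytes`, `encoding()`, `coerce()` and `to_unicode()` are hypothetical names, not Cython API, `cflump()` is a Python stand-in for the C function, and letting bytes input pass through unchanged is my guess rather than part of the proposal.

```python
# Hypothetical emulation of the proposed encoding[name] type.
# None of these names exist in Cython; they are for illustration only.

class EncodedBytes(bytes):
    """A bytes object whose class records the encoding of its content."""

    codec = None  # filled in by the encoding() factory below

    @classmethod
    def coerce(cls, value):
        # Unicode strings are encoded on the way in.
        if isinstance(value, str):
            return cls(value.encode(cls.codec))
        # Assumption: bytes are taken as already encoded and pass through.
        if isinstance(value, bytes):
            return cls(value)
        raise TypeError("expected str or bytes, got %s" % type(value).__name__)

    def to_unicode(self):
        # Decoding on the way back out uses the same codec.
        return self.decode(self.codec)


def encoding(name):
    """Build a bytes subclass bound to the given codec name."""
    return type("encoding_" + name, (EncodedBytes,), {"codec": name})


# The predefined types from the proposal.
ascii_ = encoding("ascii")  # trailing underscore avoids shadowing builtin ascii()
utf8 = encoding("utf8")
latin1 = encoding("latin1")


def cflump(data):
    # Stand-in for the C function taking char *: report the byte length.
    return len(data)


def flump(s):
    # What "def flump(utf8 s)" would do implicitly on entry.
    s = utf8.coerce(s)
    return cflump(s)
```

With this, `flump()` accepts a unicode string and hands UTF-8 bytes to `cflump()`, while the same string coerced through `latin1` yields different bytes. In the real feature the compiler would know the encoding statically, so no runtime class lookup would be needed.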

Thanks for bringing my recent proposals back into the discussion. I actually
prefer something closer to the existing buffer syntax, but I'm certainly +1
on such a feature.

Note that Cython has cpdef functions already, so adding a return type to def
functions isn't far off.

However, the above describes a new feature - not a solution to Robert's
intention of making string recoding fully automatic for existing code.

Stefan
