Stefan Behnel, 13.12.2009 08:35:
> So I think the right solution is to support automatic conversion *only* at
> the Python call boundary, i.e. for Python function parameters and return
> values.
> 
> Now, parameters are easy as long as we stick with the bytes type, for which
> "bytes[encoding='utf-8']" would be an obvious syntax in Cython. Function
> return values can be made to work in the same way, by simply allowing their
> declaration also for 'def' functions. And ctypedefs would make this quite
> writeable, as Greg suggested.
> 
> Again, this won't rescue code that was already written, but I think it
> would solve the problem for future code, and existing (unicode unaware)
> code could be fixed up relatively easily by replacing char* in Python
> function signatures with "bytes[encoding=...]" or the ctypedef-ed equivalent.

Thinking about this some more, I actually believe that the main usage
pattern would be to declare a function like this:

    def str[encoding='ASCII'] func(bytes[encoding='ASCII'] s):
        ...

Most my-data-is-not-unicode users would want to make sure that they
always get an easy-to-use bytes object on the way in, and that the return
value is an easy-to-use Python value, i.e. one that follows the normal
platform str type: bytes on Py2 and unicode on Py3. So there is an
intrinsic asymmetry between the input and output types here.
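
To make the asymmetry concrete, here is a rough pure-Python sketch of what
such a declaration might boil down to at the call boundary. The helper names
to_bytes() and to_native_str() are made up for illustration; they are not
part of Cython, and the body of func() is just a placeholder.

```python
import sys

def to_bytes(value, encoding):
    # Hypothetical helper: input side, always hand the function body bytes.
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

def to_native_str(value, encoding):
    # Hypothetical helper: output side, follow the platform str type
    # (bytes on Py2, unicode on Py3).
    if isinstance(value, str):
        return value
    if sys.version_info[0] >= 3:
        return value.decode(encoding)   # Py3: bytes -> str
    return value.encode(encoding)       # Py2: unicode -> str

def func(s):
    s = to_bytes(s, 'ASCII')            # incoming: bytes, whatever was passed
    result = s.upper()                  # placeholder for the real work
    return to_native_str(result, 'ASCII')
```

With this, func(b"abc") and func(u"abc") both see bytes internally and both
return the native str type of the running Python version.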

Stefan

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
