> I find it easier to read
>
>     def dostuff(str text):
>         cdef char* s = text.encode("UTF-8")
>         # do UTF-8 handling stuff
>         return s.decode("UTF-8")
>
> than anything you could do with internal magic.
>   
Will this work, though? (I'm ignorant in this area of Cython.) That is,
when does the temporary returned from text.encode go out of scope and
get collected? If that happens right after the assignment, s is left
pointing at freed memory. So one would need two more lines (or magic
support for keeping temporaries alive to the end of the function scope
when assigning them to char*).
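
A sketch of what I mean by the extra lines (untested; the name
`encoded` is just something I made up for illustration). The idea is
to bind the encoded bytes object to a variable so that it lives at
least as long as the char* that points into it:

```cython
def dostuff(str text):
    # Hold a reference to the encoded bytes object for the whole function,
    # so the buffer behind the char* below is not collected early.
    cdef bytes encoded = text.encode("UTF-8")
    cdef char* s = encoded   # borrows the buffer owned by `encoded`
    # do UTF-8 handling stuff
    return s.decode("UTF-8")
```

The pointer is only valid while `encoded` is alive, which is exactly
the thing the one-line version leaves implicit.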

I forgot one important case in my list, though: passing string constants
to C libraries. Without automatic conversion, one cannot both keep the
language consistent and allow

    cdef char* s = "asdf"

Robert's proposal has the advantage that it allows this notation in a 
more consistent way.

Personally I'm now (forget about earlier opinions :-) ) ready to take 
the "b" at this point, rather than breaking consistency or doing 
undeclared magic. It's a nice reminder that using char* for strings is 
not trivial, and probably avoids more bugs.
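
For concreteness, taking the "b" would look roughly like this (a
sketch only; `constant_example` is a made-up name, and I'm assuming
the bytes literal is handed to C as-is, with no encoding step
involved):

```cython
def constant_example():
    # Explicit bytes literal: the reader can see that this is raw bytes,
    # not text, so no encoding has to be guessed.
    cdef char* s = b"asdf"
    # Returning a char* from a def function converts it back to a
    # Python bytes object.
    return s
```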

Also, I think it is rather rare nowadays for char* to be used directly
for strings: C++ has std::string, Linux GUI apps use toolkit-specific Qt
or GTK/GObject strings, and so on. Console applications sometimes use
char* but are conscious of encoding matters at a low level.

Did you suggest using whatever encoding the source file is in for the
case above? Or have you backtracked from that now?

Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
