Hi,

Dag Sverre Seljebotn wrote:
> Still I think I disagree about this though:
> 
> ==
> Also, I really like the fact that "test" is a plain byte string in Cython that
> can directly be converted to a C char*, depending on its use. This shouldn't
> change, even if Py3 dictates that this literal becomes a Unicode string.

including PEP 263 input conversion, obviously.

> ==
>
> Because in my mind this change in Python 3 changes what I consider a 
> real deficiency in Python 2, which is that the source input encoding 
> matters.

Well, it does matter in both Py2 and Py3. See PEP 263.
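
To illustrate what PEP 263 means here, a minimal sketch in Python 3, assuming only the standard `compile()`/`exec()` builtins. The helper name `literal_under` is made up for this example. It feeds the same two bytes inside a string literal to the compiler under two different coding declarations:

```python
def literal_under(encoding: str, payload: bytes) -> str:
    """Compile a tiny module whose string literal contains `payload` verbatim.

    `literal_under` is a hypothetical helper for this illustration only.
    The first line of the generated source is a PEP 263 coding declaration.
    """
    source = (
        b"# -*- coding: " + encoding.encode("ascii") + b" -*-\n"
        + b's = "' + payload + b'"\n'
    )
    namespace = {}
    exec(compile(source, "<pep263-demo>", "exec"), namespace)
    return namespace["s"]


raw = b"\xc3\xa5"  # the two bytes that UTF-8 uses for U+00E5

# Same bytes on disk, different declared source encoding, different literal.
print(ascii(literal_under("utf-8", raw)))    # '\xe5'
print(ascii(literal_under("latin-1", raw)))  # '\xc3\xa5'
```

So the declared source encoding decides what the literal means, before any question of converting it to char* even comes up.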


> Most recent C libraries will happily pass through char* buffers in the 
> current runtime encoding as strings, and if one is crazy enough to write 
> Python code like:
> 
> # note: Python 3 code against libc in Cython
> handle = libc.stdlib.fopen("Fødselsår.txt", "r")

This is an entirely independent matter as it depends on the *file system
encoding*, not the locale. I hope you do not want Cython to do this kind of
magic for you.
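
As a rough sketch of how many separate settings are involved, the following Python 3 snippet asks the runtime for both values, using only standard library calls. The function names `describe_encodings` and `filename_bytes` are invented for this example, and what the two encodings report, or whether they agree, depends entirely on the platform and its configuration:

```python
import locale
import os
import sys


def describe_encodings() -> dict:
    """Report the encodings the running interpreter would use (hypothetical helper)."""
    return {
        # Used by Python when turning str file names into bytes for the OS.
        "filesystem": sys.getfilesystemencoding(),
        # Derived from the user's locale settings; not consulted for file names by fopen().
        "locale": locale.getpreferredencoding(False),
    }


def filename_bytes(name: str) -> bytes:
    """Encode a file name the way Python itself would before calling into the OS."""
    return os.fsencode(name)


print(describe_encodings())
print(filename_bytes("plain.txt"))
```

Neither value says anything about how the names already stored on a given disk were encoded, which is the part that a C call like fopen() actually has to match.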


> ...then having automatic, runtime platform default dependent conversion 
> to char* will make this work on different systems.

I would prefer the phrasing: "break" on different systems, in different ways.


> As for using a C library on different encodings, consider the following 
> example on my UTF-8 machine:
> 
> $ touch åå
> $ ./checkfile åå
> C3 A5 C3 A5  -> fopen: 6295568
> 
> Contents of checkfile.c:
> 
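> #include <stdio.h>
>
> int main(int argc, char* argv[]) {
>     char* ch;
>     /* dump the raw bytes of the first argument in hex */
>     for (ch = argv[1]; *ch != 0; ++ch) {
>         printf("%hhX ", *ch);
>     }
>     /* a non-zero value means fopen() found the file */
>     printf(" -> fopen: %ld\n", (long)fopen(argv[1], "r"));
>     return 0;
> }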

So you have a UTF-8 filesystem and a UTF-8 console, just as I do. Have you
tried this on a latin1 filesystem, or with a latin1 console?
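
For comparison, a small Python 3 sketch of the byte dump that checkfile prints, computed for both encodings. `hex_dump` is a made-up helper, and the file name is written with escapes so the example does not depend on its own source encoding:

```python
def hex_dump(text: str, encoding: str) -> str:
    """Return the encoded bytes of `text` as space-separated upper-case hex."""
    return " ".join(f"{byte:02X}" for byte in text.encode(encoding))


name = "\u00e5\u00e5"  # the two-character file name from the example above

print(hex_dump(name, "utf-8"))    # C3 A5 C3 A5
print(hex_dump(name, "latin-1"))  # E5 E5
```

A latin1 console would hand `E5 E5` to fopen(), which does not name the file that was created as `C3 A5 C3 A5`.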

Stefan