Hi,
Dag Sverre Seljebotn wrote:
> Still I think I disagree about this though:
>
> ==
> Also, I really like the fact that "test" is a plain byte string in Cython that
> can directly be converted to a C char*, depending on its use. This shouldn't
> change, even if Py3 dictates that this literal becomes a Unicode string.
including PEP 263 input conversion, obviously.
> ==
>
> Because in my mind this change in Python 3 changes what I consider a
> real deficiency in Python 2, which is that the source input encoding
> matters.
Well, it does matter in both Py2 and Py3. See PEP 263.
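To illustrate, here is a rough sketch in plain Python 3 (not Cython). The two-line module text and the helper name literal_from are made up for this example. It shows that the PEP 263 declaration decides how the bytes of a source file turn into the characters of a string literal:

```python
# Hypothetical two-line module containing one non-ASCII string literal.
source_text = "# -*- coding: {enc} -*-\nname = 'å'\n"


def literal_from(file_bytes):
    """Compile raw source bytes and return the value bound to `name`."""
    namespace = {}
    exec(compile(file_bytes, "<example>", "exec"), namespace)
    return namespace["name"]


# Declaration and actual byte encoding agree.
utf8_file = source_text.format(enc="utf-8").encode("utf-8")
latin1_file = source_text.format(enc="latin-1").encode("latin-1")
# UTF-8 bytes on disk, but the declaration claims latin-1.
mislabelled_file = source_text.format(enc="latin-1").encode("utf-8")

print(literal_from(utf8_file) == "å")
print(literal_from(latin1_file) == "å")
print(ascii(literal_from(mislabelled_file)))
```

The first two files yield the same one-character literal although their bytes differ. The mislabelled file yields two characters instead, so the declared source encoding clearly matters.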
> Most recent C libraries will happily pass through char* buffers in the
> current runtime encoding as strings, and if one is crazy enough to write
> Python code like:
>
> # note: Python 3 code against libc in Cython
> handle = libc.stdlib.fopen("Fødselsår.txt", "r")
This is an entirely independent matter as it depends on the *file system
encoding*, not the locale. I hope you do not want Cython to do this kind of
magic for you.
> ...then having automatic conversion to char*, dependent on the runtime
> platform default, will make this work on different systems.
I would prefer the phrasing: "break" on different systems, in different ways.
> As for using a C library on different encodings, consider the following
> example on my UTF-8 machine:
>
> $ touch åå
> $ ./checkfile åå
> C3 A5 C3 A5 -> fopen: 6295568
>
> Contents of checkfile.c:
>
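> #include <stdio.h>
>
> int main(int argc, char* argv[]) {
>     char* ch;
>     if (argc < 2) return 1;  /* need a file name argument */
>     /* dump the raw bytes of the argument as hex */
>     for (ch = argv[1]; *ch != 0; ++ch) {
>         printf("%hhX ", (unsigned char)*ch);
>     }
>     /* print the FILE* returned by fopen (0 means failure) */
>     printf(" -> fopen: %ld\n", (long)fopen(argv[1], "r"));
>     return 0;
> }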
So you have a UTF-8 filesystem and a UTF-8 console, just as I do. Have you
tried this on a latin1 filesystem, or from a latin1 console?
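As a minimal sketch of why that matters (plain Python 3, with a hypothetical helper hex_bytes), the same two-character name becomes a different char* buffer depending on the encoding used for the conversion:

```python
name = "åå"


def hex_bytes(text, encoding):
    """Return the encoded bytes of `text` as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


print(hex_bytes(name, "utf-8"))    # C3 A5 C3 A5
print(hex_bytes(name, "latin-1"))  # E5 E5

# The latin-1 bytes are not even valid UTF-8, so a consumer that
# expects UTF-8 cannot decode them back into the original name.
try:
    name.encode("latin-1").decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
print(latin1_is_valid_utf8)        # False
```

The UTF-8 line matches the bytes in your checkfile output. A file whose name was stored as E5 E5 would not be found by an fopen() call that is handed C3 A5 C3 A5, and vice versa.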
Stefan