Re: Yet another unicode WTF

2009-06-05 Thread Ned Deily
Ned Deily wrote:
> In python 3.x, of course, the encoding happens automatically but you
> still have to tell python, via the "encoding" argument to open, what the
> encoding of the file's content is (or accept python's default which may
> not be very useful):
>
> >>> open('foo1',
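A minimal Python 3 sketch of the point quoted above, since the original interpreter session is cut off. The temporary path and the choice of UTF-8 are assumptions made for illustration; they are not taken from the post.

```python
import os
import tempfile

# Illustrative file location; "foo1" only echoes the name in the quoted session.
path = os.path.join(tempfile.mkdtemp(), "foo1")

# Writing: the str is encoded to bytes using the encoding named here.
with open(path, "w", encoding="utf-8") as f:
    f.write("\u03bb")

# Reading: the same encoding has to be named again to decode correctly.
with open(path, encoding="utf-8") as f:
    text = f.read()

# Reading in binary mode shows the bytes that actually reached the disk.
with open(path, "rb") as f:
    raw = f.read()
```

Leaving out `encoding` makes Python fall back to a locale-dependent default, which is the "may not be very useful" case mentioned in the quote.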

Re: Yet another unicode WTF

2009-06-05 Thread Paul Boddie
On 5 Jun, 11:51, Ben Finney wrote:
>
> Actually strings in Python 2.4 or later have the ‘encode’ method, with
> no need for importing extra modules:
>
> =
> $ python -c 'import sys; sys.stdout.write(u"\u03bb\n".encode("utf-8"))'
> λ
>
> $ python -c 'import sys; sys.stdout.write(u"\u03bb\n".enc

Re: Yet another unicode WTF

2009-06-05 Thread Ben Finney
Paul Boddie writes:
> The only way to think about this (in Python 2.x, at least) is to
> consider stream and file objects as things which only understand plain
> byte strings. Consequently, use of the codecs module is required if
> receiving/sending Unicode objects from/to streams and files.

Act
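The codecs approach described in the quote can be sketched as follows. This is written for Python 3, with an in-memory `io.BytesIO` standing in for a byte-oriented stream such as a redirected stdout; that stand-in and the choice of UTF-8 are assumptions for illustration.

```python
import codecs
import io

# A stream that only understands bytes, standing in for a file or pipe.
byte_stream = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named encoding;
# instantiating it wraps the byte stream so that text can be written to it.
writer = codecs.getwriter("utf-8")(byte_stream)
writer.write("\u03bb\n")

# The wrapper has encoded the text before handing it to the byte stream.
encoded = byte_stream.getvalue()
```

The same wrapper can be placed around any object with a `write` method that accepts bytes, so the encoding decision is made once instead of at every write.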

Re: Yet another unicode WTF

2009-06-05 Thread Paul Boddie
On 5 Jun, 03:18, Ron Garret wrote:
>
> According to what I thought I knew about unix (and I had fancied myself
> a bit of an expert until just now) this is impossible. Python is
> obviously picking up a different default encoding when its output is
> being piped to a file, but I always thought on

Re: Yet another unicode WTF

2009-06-05 Thread Ned Deily
In article <8763fbmk5a@benfinney.id.au>, Ben Finney wrote:
> Ned Deily writes:
> > $ python2.6 -c 'import sys; print sys.stdout.encoding, \
> >  sys.stdout.isatty()'
> > UTF-8 True
> > $ python2.6 -c 'import sys; print sys.stdout.encoding, \
> >  sys.stdout.isatty()' > foo ; cat foo
> > None

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
"Gabriel Genellina" writes: > Python knows the terminal encoding (or at least can make a good > guess), but a file may use *any* encoding you want, completely > unrelated to your terminal settings. It may, yes, and the programmer is free to specify any encoding. > So when stdout is redirected,

Re: Yet another unicode WTF

2009-06-04 Thread Lawrence D'Oliveiro
Gabriel Genellina wrote:
> Python knows the terminal encoding (or at least can make a good guess),
> but a file may use *any* encoding you want, completely unrelated to your
> terminal settings.

It should still respect your localization settings, though.

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
Ned Deily writes:
> $ python2.6 -c 'import sys; print sys.stdout.encoding, \
>  sys.stdout.isatty()'
> UTF-8 True
> $ python2.6 -c 'import sys; print sys.stdout.encoding, \
>  sys.stdout.isatty()' > foo ; cat foo
> None False

So shouldn't the second case also detect UTF-8? The filesystem knows i
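The check in the quoted session can be turned into a small helper that copes with a stream reporting no encoding. This is a sketch only: `describe_stream`, `RedirectedStream`, and the UTF-8 fallback are hypothetical names and choices introduced here, not something proposed in the thread.

```python
import sys


def describe_stream(stream, fallback="utf-8"):
    """Return (encoding to use, whether the stream is a terminal)."""
    # A redirected stdout in the quoted Python 2.6 session reports None here.
    encoding = getattr(stream, "encoding", None)
    is_tty = stream.isatty() if hasattr(stream, "isatty") else False
    # Fall back to an explicit choice rather than an implicit default.
    return (encoding or fallback, is_tty)


class RedirectedStream:
    """Hypothetical stand-in for a stdout that has been redirected to a file."""

    encoding = None

    def isatty(self):
        return False


redirected = describe_stream(RedirectedStream())
current = describe_stream(sys.stdout)
```

Whether UTF-8 is the right fallback depends on who reads the file afterwards; the helper only makes that decision visible.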

Re: Yet another unicode WTF

2009-06-04 Thread Ned Deily
Ron Garret wrote:
> Python 2.6.2 on OS X 10.5.7:
>
> [...@mickey:~]$ echo $LANG
> en_US.UTF-8
> [...@mickey:~]$ cat frob.py
> #!/usr/bin/env python
> print u'\u03BB'
>
> [...@mickey:~]$ ./frob.py
> λ
> [...@mickey:~]$ ./frob.py > foo
> Traceback (most recent call last):
>   File

Re: Yet another unicode WTF

2009-06-04 Thread Benjamin Kaplan
On Thu, Jun 4, 2009 at 10:06 PM, Ben Finney wrote:
> Ron Garret writes:
>
> > Python 2.6.2 on OS X 10.5.7:
> >
> > [...@mickey:~]$ echo $LANG
> > en_US.UTF-8
> > [...@mickey:~]$ cat frob.py
> > #!/usr/bin/env python
> > print u'\u03BB'
> >
> > [...@mickey:~]$ ./frob.py
> > λ
> > [...@mickey:~]

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
Ron Garret writes:
> Python 2.6.2 on OS X 10.5.7:
>
> [...@mickey:~]$ echo $LANG
> en_US.UTF-8
> [...@mickey:~]$ cat frob.py
> #!/usr/bin/env python
> print u'\u03BB'
>
> [...@mickey:~]$ ./frob.py
> λ
> [...@mickey:~]$ ./frob.py > foo
> Traceback (most recent call last):
>   File "./frob.py",

Re: Yet another unicode WTF

2009-06-04 Thread Ron Garret
Lawrence D'Oliveiro wrote:
> Ron Garret wrote:
>
> > Python 2.6.2 on OS X 10.5.7:
>
> Same result, Python 2.6.1-3 on Debian Unstable. My $LANG is en_NZ.UTF-8.
>
> > ... I always thought one of the fundamental
> > invariants of unix processes was that there's no wa

Re: Yet another unicode WTF

2009-06-04 Thread Gabriel Genellina
On Thu, 04 Jun 2009 22:18:24 -0300, Ron Garret wrote:

> Python 2.6.2 on OS X 10.5.7:
>
> [...@mickey:~]$ echo $LANG
> en_US.UTF-8
> [...@mickey:~]$ cat frob.py
> #!/usr/bin/env python
> print u'\u03BB'
>
> [...@mickey:~]$ ./frob.py
> λ
> [...@mickey:~]$ ./frob.py > foo
> Traceback (most recent call last):
>   Fil

Re: Yet another unicode WTF

2009-06-04 Thread Ben Finney
Ron Garret writes:
> According to what I thought I knew about unix (and I had fancied myself
> a bit of an expert until just now) this is impossible. Python is
> obviously picking up a different default encoding when its output is
> being piped to a file, but I always thought one of the funda

Re: Yet another unicode WTF

2009-06-04 Thread Lawrence D'Oliveiro
Ron Garret wrote:
> Python 2.6.2 on OS X 10.5.7:

Same result, Python 2.6.1-3 on Debian Unstable. My $LANG is en_NZ.UTF-8.

> ... I always thought one of the fundamental
> invariants of unix processes was that there's no way for a process to
> know what's on the other end of its stdo

Yet another unicode WTF

2009-06-04 Thread Ron Garret
Python 2.6.2 on OS X 10.5.7:

[...@mickey:~]$ echo $LANG
en_US.UTF-8
[...@mickey:~]$ cat frob.py
#!/usr/bin/env python
print u'\u03BB'

[...@mickey:~]$ ./frob.py
λ
[...@mickey:~]$ ./frob.py > foo
Traceback (most recent call last):
  File "./frob.py", line 2, in <module>
    print u'\u03BB'
UnicodeEncodeE
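The traceback above is cut off, so the following is only a sketch of the likely mechanism, written for Python 3. It assumes the error is a UnicodeEncodeError raised because the character was encoded with the ASCII codec once stdout was no longer a terminal; that assumption comes from the replies in this thread, not from the truncated traceback itself.

```python
text = "\u03bb"  # GREEK SMALL LETTER LAMDA, the character printed by frob.py

# ASCII has no representation for the character, so encoding fails.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc
else:
    error = None

# Naming a capable encoding explicitly succeeds and yields two bytes.
utf8_bytes = text.encode("utf-8")
```

This separates the two cases in the session: the terminal run used an encoding that can represent the character, while the redirected run (under the assumption above) did not.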