Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-05 Thread Dan Stromberg
Ultimately I switched to reading the filenames from file descriptor 0 using os.read(); this gave back bytes in 3.x, strings of single-byte characters in 2.x - which are similar enough for my purposes, and eliminated the filesystem encoding(s) question nicely. I rewrote readline0 (http://stromberg.
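
A minimal sketch of the os.read() approach described above, not the actual readline0 code. The function name iter_filenames, the separator default, and the chunk size are assumptions made for illustration. It reads raw bytes from a file descriptor and never decodes them, so the filesystem encoding does not come into play.

```python
import os

def iter_filenames(fd=0, separator=b"\n", chunk_size=65536):
    """Yield filenames as raw bytes from file descriptor fd (hypothetical helper)."""
    pending = b""
    while True:
        chunk = os.read(fd, chunk_size)  # returns bytes in Python 3
        if not chunk:                    # empty read means end of input
            break
        pending += chunk
        # Everything before the last separator is complete; keep the tail.
        *complete, pending = pending.split(separator)
        for name in complete:
            yield name
    if pending:                          # final name without a trailing separator
        yield pending
```

Passing separator=b"\0" would handle NUL-terminated input such as the output of find -print0.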

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-02 Thread Nobody
On Thu, 02 Dec 2010 12:17:53 +0100, Peter Otten wrote: >> This was actually a critical flaw in Python 3.0, as it meant that >> filenames which weren't valid in the locale's encoding simply couldn't be >> passed via argv or environ. 3.1 fixed this using the "surrogateescape" >> encoding, so now it'

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-02 Thread Peter Otten
Nobody wrote: > This was actually a critical flaw in Python 3.0, as it meant that > filenames which weren't valid in the locale's encoding simply couldn't be > passed via argv or environ. 3.1 fixed this using the "surrogateescape" > encoding, so now it's only an annoyance (i.e. you can recover the
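
To illustrate the "surrogateescape" error handler mentioned above, here is a small sketch. The helper name roundtrip is an assumption, and UTF-8 stands in for whatever the locale's encoding happens to be. Bytes that cannot be decoded are mapped to lone surrogates, and encoding with the same handler recovers the original bytes.

```python
def roundtrip(raw, encoding="utf-8"):
    """Decode raw bytes with surrogateescape, then encode them back (illustrative helper)."""
    # Undecodable byte 0xNN becomes the lone surrogate U+DC00 + 0xNN.
    text = raw.decode(encoding, errors="surrogateescape")
    # Encoding with the same handler turns the surrogates back into bytes.
    return text, text.encode(encoding, errors="surrogateescape")

text, recovered = roundtrip(b"\xe4\xf6\xfc.txt")
# text == '\udce4\udcf6\udcfc.txt'
# recovered == b'\xe4\xf6\xfc.txt'
```

Encoding such a string without the handler raises UnicodeEncodeError, which is why a name obtained this way cannot be written to a strict text stream without further handling.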

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-01 Thread Nobody
On Wed, 01 Dec 2010 10:34:24 +0100, Peter Otten wrote: >> Python 3.x's decision to treat filenames (and environment variables) as >> text even on Unix is, in short, a bug. One which, IMNSHO, will mean that >> Python 2.x is still around when Python 4 is released. > > For filenames in Python 3 the

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-01 Thread Peter Otten
Nobody wrote: > Python 3.x's decision to treat filenames (and environment variables) as > text even on Unix is, in short, a bug. One which, IMNSHO, will mean that > Python 2.x is still around when Python 4 is released. For filenames in Python 3 the user has the choice between "text" (str) and by
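
As a rough illustration of the two filename types discussed in this exchange: os.listdir() returns names of the same type as its argument. The helper name list_both_ways is hypothetical, and os.fsencode() assumes Python 3.2 or later, which is newer than the 3.1 under discussion.

```python
import os

def list_both_ways(directory):
    """Return directory entries as str and as bytes (illustrative helper)."""
    as_text = sorted(os.listdir(directory))                # str in, str out
    as_bytes = sorted(os.listdir(os.fsencode(directory)))  # bytes in, bytes out
    return as_text, as_bytes
```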

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-01 Thread Antoine Pitrou
On Tue, 30 Nov 2010 22:22:01 -0500 Albert Hopkins wrote: > And I can freely copy > these "invalid" files across different (Unix) systems, because the OS > doesn't care about encoding. And so can Python, thanks to PEP 383. > > That's where encodings which can be used globally come in. > > By the

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-01 Thread Antoine Pitrou
On Tue, 30 Nov 2010 16:57:57 -0800 Dan Stromberg wrote: > >> --- On Tue, 11/30/10, Dan Stromberg wrote: > >> > In Python 3, I'm finding that I have encoding issues with > >> > characters > >> > with their high bit set.  Things are fine with strictly > >> > ASCII > >> > filenames.  With high-bit-s

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-12-01 Thread Nobody
On Wed, 01 Dec 2010 02:14:09 +, MRAB wrote: > If the filenames are to be shown to a user then there needs to be a > mapping between bytes and glyphs. That's an encoding. If different > users use different encodings then exchange of textual data becomes > difficult. OTOH, the exchange of binar

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Loewis
> This sounds like a strong prospect for how to get things working (I > didn't realize open would accept a bytes argument for the filename), > but I'm also interested in whether reading filenames from stdin and > subsequently opening them is supposed to "just work" given a suitable > encoding - lik

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Loewis
> The world does not revolve around Python. Unix filenames have been > encoding-agnostic long before Python was around. If Python3 does not > support this then it's a regression on Python's part. Fortunately, Python 3 does support that. Regards, Martin

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Löwis
> It'd be great if all programs used the same encoding on a given OS, > but at least on Linux, I believe historically filenames have been > created with different encodings. IOW, if I pick one encoding and go > with it, filenames written in some other encoding are likely to cause > problems. So I

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Albert Hopkins
On Wed, 2010-12-01 at 02:14 +, MRAB wrote: > If the filenames are to be shown to a user then there needs to be a > mapping between bytes and glyphs. That's an encoding. If different > users use different encodings then exchange of textual data becomes > difficult. That's presentation, that's s

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread MRAB
On 01/12/2010 01:28, Nobody wrote: On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote: I think this is wrong. In Unix there is no concept of filename encoding. Filenames can have any arbitrary set of bytes (except '/' and '\0'). But the filesystem itself neither knows nor cares about enc

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Nobody
On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote: >> I think this is wrong. In Unix there is no concept of filename >> encoding. Filenames can have any arbitrary set of bytes (except '/' and >> '\0'). But the filesystem itself neither knows nor cares about >> encoding. > > I think you mi

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Nobody
On Mon, 29 Nov 2010 21:26:23 -0800, Dan Stromberg wrote: > Does anyone know what I need to do to read filenames from stdin with > Python 3.1 and subsequently open them, when some of those filenames > include characters with their high bit set? Use "bytes" rather than "str". Everywhere. This means
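
The "bytes everywhere" advice might look roughly like the following sketch. The function name total_size and the choice to sum file sizes are assumptions made only to give the example something to do. The names are read from a binary stream and passed to open() as bytes, so no decoding step is involved.

```python
import sys

def total_size(stream):
    """Sum the sizes of files named, one per line, on a binary stream (illustrative)."""
    total = 0
    for line in stream:              # iterating a binary stream yields bytes lines
        name = line.rstrip(b"\n")    # strip only the line terminator
        if not name:
            continue                 # skip blank lines
        with open(name, "rb") as f:  # open() accepts a bytes filename
            total += len(f.read())
    return total

# Typical use (not executed here): total_size(sys.stdin.buffer)
```

sys.stdin.buffer is the binary layer underneath the text-mode sys.stdin, so reading from it avoids stdin's encoding entirely.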

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 9:53 AM, Peter Otten <__pete...@web.de> wrote: > $ ls > $ python3 > Python 3.1.1+ (r311:74480, Nov  2 2009, 15:45:00) > [GCC 4.4.1] on linux2 > Type "help", "copyright", "credits" or "license" for more information. with open(b"\xe4\xf6\xfc.txt", "w") as f: > ...     f.w

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 7:19 AM, Antoine Pitrou wrote: > On Mon, 29 Nov 2010 21:52:07 -0800 (PST) > Yingjie Lan wrote: >> --- On Tue, 11/30/10, Dan Stromberg wrote: >> > In Python 3, I'm finding that I have encoding issues with >> > characters >> > with their high bit set.  Things are fine with

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Dan Stromberg
On Tue, Nov 30, 2010 at 11:47 AM, Martin v. Loewis wrote: >> Does anyone know what I need to do to read filenames from stdin with >> Python 3.1 and subsequently open them, when some of those filenames >> include characters with their high bit set? > > If your files on disk use file names encoded i

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Martin v. Loewis
> Does anyone know what I need to do to read filenames from stdin with > Python 3.1 and subsequently open them, when some of those filenames > include characters with their high bit set? If your files on disk use file names encoded in iso-8859-1, don't set your locale to a UTF-8 locale (as you app

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Peter Otten
Albert Hopkins wrote: > On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote: > Dan Stromberg wrote: >> >> > I've got a couple of programs that read filenames from stdin, and > then >> > open those files and do things with them. These programs sort of do >> > the *ix xargs thing, without requiri

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Antoine Pitrou
On Mon, 29 Nov 2010 21:52:07 -0800 (PST) Yingjie Lan wrote: > --- On Tue, 11/30/10, Dan Stromberg wrote: > > In Python 3, I'm finding that I have encoding issues with > > characters > > with their high bit set.  Things are fine with strictly > > ASCII > > filenames.  With high-bit-set characters,

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Albert Hopkins
On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote: Dan Stromberg wrote: > > > I've got a couple of programs that read filenames from stdin, and then > > open those files and do things with them. These programs sort of do > > the *ix xargs thing, without requiring xargs. > > > > In Python 2, t

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Peter Otten
Dan Stromberg wrote: > I've got a couple of programs that read filenames from stdin, and then > open those files and do things with them. These programs sort of do > the *ix xargs thing, without requiring xargs. > > In Python 2, these work well. Irrespective of how filenames are > encoded, thin

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-30 Thread Marc Christiansen
Dan Stromberg wrote: > I've got a couple of programs that read filenames from stdin, and then > open those files and do things with them. These programs sort of do > the *ix xargs thing, without requiring xargs. > > In Python 2, these work well. Irrespective of how filenames are > encoded, thin

Re: Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-29 Thread Yingjie Lan
--- On Tue, 11/30/10, Dan Stromberg wrote: > In Python 3, I'm finding that I have encoding issues with > characters > with their high bit set.  Things are fine with strictly > ASCII > filenames.  With high-bit-set characters, even if I > change stdin's > encoding with: Co-ask. I have also had pro

Python 3 encoding question: Read a filename from stdin, subsequently open that filename

2010-11-29 Thread Dan Stromberg
I've got a couple of programs that read filenames from stdin, and then open those files and do things with them. These programs sort of do the *ix xargs thing, without requiring xargs. In Python 2, these work well. Irrespective of how filenames are encoded, things are opened OK, because it's all