Ultimately I switched to reading the filenames from file descriptor 0
using os.read(); this gave back bytes in 3.x, strings of single-byte
characters in 2.x - which are similar enough for my purposes, and
eliminated the filesystem encoding(s) question nicely.
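
A minimal sketch of that approach, not the poster's actual code: it assumes Python 3, newline-separated names, and a hypothetical helper called read_filenames. The names never leave the bytes domain, so no encoding has to be chosen.

```python
import os


def read_filenames(fd=0, sep=b"\n", bufsize=65536):
    """Yield filenames as bytes read from a file descriptor.

    Hypothetical helper: fd 0 is stdin, and sep is assumed to be a
    newline (use b"\0" for NUL-separated input such as `find -print0`).
    """
    pending = b""
    while True:
        chunk = os.read(fd, bufsize)  # returns bytes; b"" means EOF
        if not chunk:
            break
        pending += chunk
        # Everything before the last separator is complete; the tail
        # may be a partial name that continues in the next chunk.
        *complete, pending = pending.split(sep)
        for name in complete:
            if name:
                yield name
    if pending:
        # Final name without a trailing separator.
        yield pending
```

Each yielded name can be passed straight to open(), which accepts bytes paths, for example: for name in read_filenames(): open(name, "rb").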
I rewrote readline0
(http://stromberg.
On Thu, 02 Dec 2010 12:17:53 +0100, Peter Otten wrote:
>> This was actually a critical flaw in Python 3.0, as it meant that
>> filenames which weren't valid in the locale's encoding simply couldn't be
>> passed via argv or environ. 3.1 fixed this using the "surrogateescape"
>> encoding, so now it'
Nobody wrote:
> This was actually a critical flaw in Python 3.0, as it meant that
> filenames which weren't valid in the locale's encoding simply couldn't be
> passed via argv or environ. 3.1 fixed this using the "surrogateescape"
> encoding, so now it's only an annoyance (i.e. you can recover the
On Wed, 01 Dec 2010 10:34:24 +0100, Peter Otten wrote:
>> Python 3.x's decision to treat filenames (and environment variables) as
>> text even on Unix is, in short, a bug. One which, IMNSHO, will mean that
>> Python 2.x is still around when Python 4 is released.
>
> For filenames in Python 3 the
Nobody wrote:
> Python 3.x's decision to treat filenames (and environment variables) as
> text even on Unix is, in short, a bug. One which, IMNSHO, will mean that
> Python 2.x is still around when Python 4 is released.
For filenames in Python 3 the user has the choice between "text" (str) and
by
On Tue, 30 Nov 2010 22:22:01 -0500
Albert Hopkins wrote:
> And I can freely copy
> these "invalid" files across different (Unix) systems, because the OS
> doesn't care about encoding.
And so can Python, thanks to PEP 383.
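
As an illustration of what PEP 383 provides (a sketch, assuming UTF-8 as the filesystem encoding and a made-up Latin-1 filename): bytes that cannot be decoded are mapped to lone surrogates, and encoding with the same error handler gives the original bytes back.

```python
raw = b"\xe4\xf6\xfc.txt"  # Latin-1 bytes; not a valid UTF-8 sequence

# With "surrogateescape", each undecodable byte 0xNN becomes the lone
# surrogate U+DCNN instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", "surrogateescape")

assert restored == raw
```

The str value is not printable as-is, but it round-trips losslessly, which is what lets such a name travel through argv, environ or os.listdir() and still open the right file.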
> > That's where encodings which can be used globally come in.
> > By the
On Tue, 30 Nov 2010 16:57:57 -0800
Dan Stromberg wrote:
> >> --- On Tue, 11/30/10, Dan Stromberg wrote:
> >> > In Python 3, I'm finding that I have encoding issues with
> >> > characters
> >> > with their high bit set. Things are fine with strictly
> >> > ASCII
> >> > filenames. With high-bit-s
On Wed, 01 Dec 2010 02:14:09 +, MRAB wrote:
> If the filenames are to be shown to a user then there needs to be a
> mapping between bytes and glyphs. That's an encoding. If different
> users use different encodings then exchange of textual data becomes
> difficult.
OTOH, the exchange of binar
> This sounds like a strong prospect for how to get things working (I
> didn't realize open would accept a bytes argument for the filename),
> but I'm also interested in whether reading filenames from stdin and
> subsequently opening them is supposed to "just work" given a suitable
> encoding - lik
> The world does not revolve around Python. Unix filenames have been
> encoding-agnostic long before Python was around. If Python3 does not
> support this then it's a regression on Python's part.
Fortunately, Python 3 does support that.
Regards,
Martin
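
A small sketch of the encoding-agnostic usage referred to above. Assumptions: a POSIX filesystem that accepts arbitrary bytes in names (most Linux filesystems do), Python 3.2 or later for os.fsencode, and a hypothetical function name roundtrip_bytes_filename. Passing bytes paths to open() and os.listdir() keeps everything as bytes.

```python
import os
import tempfile


def roundtrip_bytes_filename(name=b"\xe4\xf6\xfc.txt"):
    """Create a file with a non-UTF-8 name, list it, and read it back.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        directory = os.fsencode(tmp)          # bytes path to the temp dir
        path = os.path.join(directory, name)  # bytes + bytes is fine
        with open(path, "wb") as f:
            f.write(b"payload")
        # A bytes argument makes os.listdir() return bytes names.
        listed = os.listdir(directory)
        with open(os.path.join(directory, listed[0]), "rb") as f:
            return listed, f.read()
```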
> It'd be great if all programs used the same encoding on a given OS,
> but at least on Linux, I believe historically filenames have been
> created with different encodings. IOW, if I pick one encoding and go
> with it, filenames written in some other encoding are likely to cause
> problems. So I
On Wed, 2010-12-01 at 02:14 +, MRAB wrote:
> If the filenames are to be shown to a user then there needs to be a
> mapping between bytes and glyphs. That's an encoding. If different
> users use different encodings then exchange of textual data becomes
> difficult.
That's presentation, that's s
On 01/12/2010 01:28, Nobody wrote:
On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote:
I think this is wrong. In Unix there is no concept of filename
encoding. Filenames can have any arbitrary set of bytes (except '/' and
'\0'). But the filesystem itself neither knows nor cares about
enc
On Tue, 30 Nov 2010 18:53:14 +0100, Peter Otten wrote:
>> I think this is wrong. In Unix there is no concept of filename
>> encoding. Filenames can have any arbitrary set of bytes (except '/' and
>> '\0'). But the filesystem itself neither knows nor cares about
>> encoding.
>
> I think you mi
On Mon, 29 Nov 2010 21:26:23 -0800, Dan Stromberg wrote:
> Does anyone know what I need to do to read filenames from stdin with
> Python 3.1 and subsequently open them, when some of those filenames
> include characters with their high bit set?
Use "bytes" rather than "str". Everywhere. This means
On Tue, Nov 30, 2010 at 9:53 AM, Peter Otten <__pete...@web.de> wrote:
> $ ls
> $ python3
> Python 3.1.1+ (r311:74480, Nov 2 2009, 15:45:00)
> [GCC 4.4.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> with open(b"\xe4\xf6\xfc.txt", "w") as f:
> ... f.w
On Tue, Nov 30, 2010 at 7:19 AM, Antoine Pitrou wrote:
> On Mon, 29 Nov 2010 21:52:07 -0800 (PST)
> Yingjie Lan wrote:
>> --- On Tue, 11/30/10, Dan Stromberg wrote:
>> > In Python 3, I'm finding that I have encoding issues with
>> > characters
>> > with their high bit set. Things are fine with
On Tue, Nov 30, 2010 at 11:47 AM, Martin v. Loewis wrote:
>> Does anyone know what I need to do to read filenames from stdin with
>> Python 3.1 and subsequently open them, when some of those filenames
>> include characters with their high bit set?
>
> If your files on disk use file names encoded i
> Does anyone know what I need to do to read filenames from stdin with
> Python 3.1 and subsequently open them, when some of those filenames
> include characters with their high bit set?
If your files on disk use file names encoded in iso-8859-1, don't set
your locale to a UTF-8 locale (as you app
Albert Hopkins wrote:
> On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote:
> Dan Stromberg wrote:
>>
>> > I've got a couple of programs that read filenames from stdin, and then
>> > open those files and do things with them. These programs sort of do
>> > the *ix xargs thing, without requiri
On Mon, 29 Nov 2010 21:52:07 -0800 (PST)
Yingjie Lan wrote:
> --- On Tue, 11/30/10, Dan Stromberg wrote:
> > In Python 3, I'm finding that I have encoding issues with
> > characters
> > with their high bit set. Things are fine with strictly
> > ASCII
> > filenames. With high-bit-set characters,
On Tue, 2010-11-30 at 11:52 +0100, Peter Otten wrote:
Dan Stromberg wrote:
>
> > I've got a couple of programs that read filenames from stdin, and then
> > open those files and do things with them. These programs sort of do
> > the *ix xargs thing, without requiring xargs.
> >
> > In Python 2, t
Dan Stromberg wrote:
> I've got a couple of programs that read filenames from stdin, and then
> open those files and do things with them. These programs sort of do
> the *ix xargs thing, without requiring xargs.
>
> In Python 2, these work well. Irrespective of how filenames are
> encoded, thin
--- On Tue, 11/30/10, Dan Stromberg wrote:
> In Python 3, I'm finding that I have encoding issues with
> characters
> with their high bit set. Things are fine with strictly
> ASCII
> filenames. With high-bit-set characters, even if I
> change stdin's
> encoding with:
Co-ask. I have also had pro
I've got a couple of programs that read filenames from stdin, and then
open those files and do things with them. These programs sort of do
the *ix xargs thing, without requiring xargs.
In Python 2, these work well. Irrespective of how filenames are
encoded, things are opened OK, because it's all