Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Walter Dörwald
Paul Prescod wrote: > I went based on the current setdefaultencoding. But it seems that we will > accumulate 3 or 4 related functions so I'm persuaded that there should be a > module. > > encodingdetection.setdefaultfileencoding > encodingdetection.registerencodingdetector > encodingdetection.gues

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Oleg Broytmann
On Sun, Sep 10, 2006 at 12:02:44PM -0700, Paul Prescod wrote: > * Eastern Unix/Linux users using UTF-8 apps like gedit or apps "saving as" > UTF-8 Finally I've got the definitive answer for "is Russia Europe or Asia?" It is an Eastern country! At last! ;) > Maybe the guessing algorithm should

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Marcin 'Qrczak' Kowalczyk
"Paul Prescod" <[EMAIL PROTECTED]> writes: > Guido's goal was that quick and dirty text processing should "just > work" for newbies and encoding-disinterested expert programmers. What does 'guess' mean for creating files? Consider a program which reads one file and writes data extracted from i

Re: [Python-3000] iostack, second revision

2006-09-11 Thread Barry Warsaw
On Sep 10, 2006, at 1:53 PM, Anders J. Munch wrote: > I say drop seek_cur and seek_end altogether, and keep only absolute > seek. I was just looking through some of our elf/dwarf parsing code and we use seek_cur quite a bit. Not that it couldn't b

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Prescod
On 9/11/06, Oleg Broytmann <[EMAIL PROTECTED]> wrote: On Sun, Sep 10, 2006 at 12:02:44PM -0700, Paul Prescod wrote: > * Eastern Unix/Linux users using UTF-8 apps like gedit or apps "saving as" > UTF-8 Finally I've got the definitive answer for "is Russia Europe or Asia?" It is an Eastern country! A

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Prescod
On 9/11/06, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote: "Paul Prescod" <[EMAIL PROTECTED]> writes: > Guido's goal was that quick and dirty text processing should "just > work" for newbies and encoding-disinterested expert programmers. What does 'guess' mean for creating files? I wasn't sure

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Oleg Broytmann
On Mon, Sep 11, 2006 at 06:58:42AM -0700, Paul Prescod wrote: > For these purposes, Russia is European, isn't it? If the test is "a BOM in UTF-8 text files on Unices" - then no. :) > Russian text can be subsumed by UTF-8 with relatively minor expansion, right? Sorry, what do you mean? That

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Michael Chermside
Paul Prescod writes: [... Pre-PEP proposal ...] Quick thoughts: * I like it. Good work. * I agree with Guido: "open" is the right spelling for this. * I agree with Paul: mandatory specification is the way to go. 10,000 different blog entries, tutorials, and cookbook recipes can re

Re: [Python-3000] educational aspects of Python 3000

2006-09-11 Thread Michael Chermside
Toby Donaldson writes: > Any suggestions for how educators interested in the > educational/learning aspects of Python 3000 could more fruitfully > participate? You're doing pretty well so far! Seriously... just speak up: Pythonistas (including, in particular, Guido) value the fact that Python is a

Re: [Python-3000] educational aspects of Python 3000

2006-09-11 Thread Brett Cannon
On 9/11/06, Michael Chermside <[EMAIL PROTECTED]> wrote: Toby Donaldson writes: > Any suggestions for how educators interested in the > educational/learning aspects of Python 3000 could more fruitfully > participate? You're doing pretty well so far! Seriously... just speak up: Pythonistas (including, i

Re: [Python-3000] educational aspects of Python 3000

2006-09-11 Thread Antoine Pitrou
On Monday, 11 September 2006 at 11:22 -0700, Michael Chermside wrote: > The idea of a standard edu library though is a GREAT one. That would > provide a standard place for things like raw_input() (with a better > name) as well as lots of other "helper functions" useful to beginners > and/or studen

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Moore
On 9/11/06, Michael Chermside <[EMAIL PROTECTED]> wrote: > Paul Prescod writes: > [... Pre-PEP proposal ...] > > Quick thoughts: My quick thoughts on this whole subject: * Yes, it should be "open". Anything else feels like gratuitous breakage. * There should be a default encoding, and it sho

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Jim Jewett
On 9/10/06, Paul Prescod <[EMAIL PROTECTED]> wrote: > encodingdetection.setdefaultfileencoding > encodingdetection.registerencodingdetector > encodingdetection.guessfileencoding(filename) > encodingdetection.guessfileencoding(bytestream) This demonstrates two of the problems with requiring an explic

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Prescod
I think that the basis of your concern is a misunderstanding of the proposal (at least as documented in the PEP). On 9/11/06, Jim Jewett <[EMAIL PROTECTED]> wrote: > On 9/10/06, Paul Prescod <[EMAIL PROTECTED]> wrote: > > > encodingdetection.setdefaultfileencoding > > encodingdetection.registeren

Re: [Python-3000] educational aspects of Python 3000

2006-09-11 Thread Guido van Rossum
On 9/11/06, Antoine Pitrou <[EMAIL PROTECTED]> wrote: > On Monday, 11 September 2006 at 11:22 -0700, Michael Chermside wrote: > > The idea of a standard edu library though is a GREAT one. That would > > provide a standard place for things like raw_input() (with a better > > name) as well as lots of

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Prescod
On 9/11/06, Paul Moore <[EMAIL PROTECTED]> wrote: > On 9/11/06, Michael Chermside <[EMAIL PROTECTED]> wrote: > > Paul Prescod writes: > > [... Pre-PEP proposal ...] > > > > Quick thoughts: > > My quick thoughts on this whole subject: > > * Yes, it should be "open". Anything else feels like gra

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Marcin 'Qrczak' Kowalczyk
"Paul Moore" <[EMAIL PROTECTED]> writes: > Of course, I'm in the useful position of having an OS default > character set which contains ASCII as a subset. I don't know what > issues someone with Greek/Russian/Japanese or whatever as an OS > default would have (one thought - if your default charact

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread David Hopwood
Paul Prescod wrote: > On 9/10/06, David Hopwood <[EMAIL PROTECTED]> wrote: > >> ... if you think that guessing based on content is a good idea -- I >> don't. In any case, such guessing necessarily depends on the expected file >> format, so it should be done by the application itself, or by a libra

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread David Hopwood
Paul Prescod wrote: > The PEP doesn't deal with streams. It is about files. An important part of the Unix design philosophy (partially adopted by Windows) is to make streams and files behave as similarly as possible. It is quite feasible to make *some* detection algorithms work for streams, and th

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Marcin 'Qrczak' Kowalczyk
"Paul Prescod" <[EMAIL PROTECTED]> writes: >> The bizarre Windows behaviour of using different >> encodings for console and GUI programs doesn't >> bother me either. Really. I promise." > > So according to this philosophy, Windows and Mac users will probably > never be able to open UTF-8 documents

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Prescod
On 9/11/06, David Hopwood <[EMAIL PROTECTED]> wrote: > > I disagree. If a non-trivial file can be decoded as a UTF-* encoding > > it probably is that encoding. > > That is quite false for UTF-16, at least. It is also false for short UTF-8 > files. True UTF-16 (as opposed to UTF-16BE/UTF-16LE) fi
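The heuristic being argued over here ("if it decodes as UTF-*, it probably is") can be sketched roughly as follows. This is only an illustration of the idea, not the algorithm from the Pre-PEP; the function name `guess_encoding` and the `latin-1` fallback are assumptions made for the example, and Hopwood's objection about short files applies to it in full.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "latin-1") -> str:
    """Guess an encoding from raw bytes (hypothetical helper, not a real API).

    Order of checks: explicit BOMs first, then a strict UTF-8 decode,
    then a caller-supplied fallback. UTF-32 is deliberately ignored to
    keep the sketch short.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # this codec reads the BOM to pick the byte order
    try:
        data.decode("utf-8")  # strict: raises on any invalid sequence
    except UnicodeDecodeError:
        return fallback
    # Pure-ASCII input also lands here, which is harmless but tells us little.
    return "utf-8"
```

Note that a short file containing only ASCII passes the UTF-8 test trivially, which is one reason a guess like this is weak evidence for small inputs.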

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Paul Prescod
On 9/11/06, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote: > "Paul Prescod" <[EMAIL PROTECTED]> writes: > > >> The bizarre Windows behaviour of using different > >> encodings for console and GUI programs doesn't > >> bother me either. Really. I promise." > > > > So according to this philosoph

Re: [Python-3000] educational aspects of Python 3000

2006-09-11 Thread Talin
Guido van Rossum wrote: > from scripting import raw_input, autotextfile > > I'm not so keen on 'scripting' as the name either, but I'm sure we can > come up with something. Perhaps easyio, simpleio or basicio? (Not to > be confused with vbio. :-) > > I'm also not completely against revising t

Re: [Python-3000] The future of exceptions

2006-09-11 Thread Greg Ewing
Marcin 'Qrczak' Kowalczyk wrote: > It's lazily instantiated today (see PyErr_NormalizeException). Only in C code, though, not Python. And if the separate type/value specification when raising goes away, it might not be possible any more even in C. > 'WithExit' constructs a unique exception objec

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-11 Thread Greg Ewing
Guido van Rossum wrote: > if possible I'd like the > guessing function to have access to what was in the file before it was > emptied by the "create" function, or what's at the start before > appending to the end, Which further suggests that the encoding-guesser needs to be fairly intimately built

Re: [Python-3000] sys.stdin and sys.stdout with textfile

2006-09-11 Thread Greg Ewing
Guido van Rossum wrote: > All sorts of things are different when reading stdin vs. opening a > filename. e.g. stdin may be a pipe. Which suggests that if anything is going to try to guess the encoding, it would be better for it to start reading from the actual stream you're going to use and buffe
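One way to read this suggestion is that the guesser should look ahead on the buffered stream it will later read from, so it also works for pipes that cannot be reopened or rewound. A minimal sketch using `io.BufferedReader.peek` from today's standard library follows; the helper name `sniff_bom` is an assumption for illustration, and it only recognises a UTF-8 BOM.

```python
import codecs
import io
from typing import Optional


def sniff_bom(stream: io.BufferedReader) -> Optional[str]:
    """Look for a UTF-8 BOM without consuming any bytes (hypothetical helper).

    peek() may return more or fewer bytes than requested, so the result
    is sliced to the BOM length before comparing.
    """
    head = stream.peek(len(codecs.BOM_UTF8))[: len(codecs.BOM_UTF8)]
    if head == codecs.BOM_UTF8:
        return "utf-8-sig"
    return None


# Usage: the same stream is sniffed and then read from the start.
payload = codecs.BOM_UTF8 + "text".encode("utf-8")
reader = io.BufferedReader(io.BytesIO(payload))
encoding = sniff_bom(reader) or "utf-8"
text = reader.read().decode(encoding)
```

Because nothing is consumed by the look-ahead, no seek is needed afterwards, which is the property that matters for stdin or a pipe.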

Re: [Python-3000] educational aspects of Python 3000

2006-09-11 Thread Greg Ewing
Michael Chermside wrote: > The idea of a standard edu library though is a GREAT one. That would > provide a standard place for things like raw_input() (with a better > name) as well as lots of other "helper functions" useful to beginners > and/or students -- and all it would cost is a single line

Re: [Python-3000] iostack, second revision

2006-09-11 Thread Greg Ewing
Anders J. Munch wrote: > any file that supports seeking to the end will also support > reporting the file size. Thus > f.seek(f.length) > should suffice, Although the micro-optimisation circuit in my brain complains that it will take 2 system calls when it could be done with 1... -- Greg Ewin
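The two ways of reaching end-of-file that are being compared can be sketched with the current `io` module. The `f.length` attribute in the quoted text belongs to the iostack proposal and is not an existing API, so the sketch below passes the length in explicitly; the function names are likewise made up for the example.

```python
import io
import os


def seek_to_end_absolute(f, length: int) -> int:
    """Absolute seek: the caller must already know the length.

    On a real file, finding the length typically costs a separate
    query (for example os.fstat) before the seek itself.
    """
    f.seek(length)
    return f.tell()


def seek_to_end_relative(f) -> int:
    """Seek relative to the end: one call, no prior length query."""
    f.seek(0, os.SEEK_END)
    return f.tell()


data = b"hello world"
pos_absolute = seek_to_end_absolute(io.BytesIO(data), len(data))
pos_relative = seek_to_end_relative(io.BytesIO(data))
```

Both calls end at the same offset; the difference Greg Ewing points at is only in how many underlying operations are needed to get there.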

Re: [Python-3000] text editors

2006-09-11 Thread Antoine Pitrou
On Monday, 11 September 2006 at 18:16 -0700, Paul Prescod wrote: > On Unix, VIM is also set up to auto-detect UTF-8 (using the BOM or > a full decoding attempt). According to Google, XEmacs also has some > kind of UTF-8/BOM detector but I don't know the details. GNU Emacs: > According to "Emacs wi