Paul Prescod wrote:
> I went based on the current setdefaultencoding. But it seems that we will
> accumulate 3 or 4 related functions so I'm persuaded that there should be a
> module.
>
> encodingdetection.setdefaultfileencoding
> encodingdetection.registerencodingdetector
> encodingdetection.gues
On Sun, Sep 10, 2006 at 12:02:44PM -0700, Paul Prescod wrote:
> * Eastern Unix/Linux users using UTF-8 apps like gedit or apps "saving as"
> UTF-8
Finally I've got the definitive answer for "is Russia Europe or Asia?"
It is an Eastern country! At last! ;)
> Maybe the guessing algorithm should
"Paul Prescod" <[EMAIL PROTECTED]> writes:
> Guido's goal was that quick and dirty text processing should "just
> work" for newbies and encoding-disinterested expert programmers.
What does 'guess' mean for creating files?
Consider a program which reads one file and writes data extracted
from i
On Sep 10, 2006, at 1:53 PM, Anders J. Munch wrote:
> I say drop seek_cur and seek_end altogether, and keep only absolute
> seek.
I was just looking through some of our elf/dwarf parsing code and we
use seek_cur quite a bit. Not that it couldn't b
On 9/11/06, Oleg Broytmann <[EMAIL PROTECTED]> wrote:
On Sun, Sep 10, 2006 at 12:02:44PM -0700, Paul Prescod wrote:
> * Eastern Unix/Linux users using UTF-8 apps like gedit or apps "saving as"
> UTF-8
Finally I've got the definitive answer for "is Russia Europe or Asia?"
It is an Eastern country! A
On 9/11/06, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote:
"Paul Prescod" <[EMAIL PROTECTED]> writes:
> Guido's goal was that quick and dirty text processing should "just
> work" for newbies and encoding-disinterested expert programmers.
What does 'guess' mean for creating files?
I wasn't sure
On Mon, Sep 11, 2006 at 06:58:42AM -0700, Paul Prescod wrote:
> For these purposes, Russia is European, isn't it?
If the test is "a BOM in UTF-8 text files on Unices" - then no. :)
> Russian text can be subsumed by UTF-8 with relatively minor expansion, right?
Sorry, what do you mean? That
Paul Prescod writes:
[... Pre-PEP proposal ...]
Quick thoughts:
* I like it. Good work.
* I agree with Guido: "open" is the right spelling for this.
* I agree with Paul: mandatory specification is the way to go.
10,000 different blog entries, tutorials, and cookbook recipes can
re
Toby Donaldson writes:
> Any suggestions for how educators interested in the
> educational/learning aspects of Python 3000 could more fruitfully
> participate?
You're doing pretty well so far! Seriously... just speak up: Pythonistas
(including, in particular, Guido) value the fact that Python is a
On 9/11/06, Michael Chermside <[EMAIL PROTECTED]> wrote:
Toby Donaldson writes:
> Any suggestions for how educators interested in the
> educational/learning aspects of Python 3000 could more fruitfully
> participate?
You're doing pretty well so far! Seriously... just speak up: Pythonistas
(including, i
Le lundi 11 septembre 2006 à 11:22 -0700, Michael Chermside a écrit :
> The idea of a standard edu library though is a GREAT one. That would
> provide a standard place for things like raw_input() (with a better
> name) as well as lots of other "helper functions" useful to beginners
> and/or studen
On 9/11/06, Michael Chermside <[EMAIL PROTECTED]> wrote:
> Paul Prescod writes:
> [... Pre-PEP proposal ...]
>
> Quick thoughts:
My quick thoughts on this whole subject:
* Yes, it should be "open". Anything else feels like gratuitous breakage.
* There should be a default encoding, and it sho
On 9/10/06, Paul Prescod <[EMAIL PROTECTED]> wrote:
> encodingdetection.setdefaultfileencoding
> encodingdetection.registerencodingdetector
> encodingdetection.guessfileencoding(filename)
> encodingdetection.guessfileencoding(bytestream)
This demonstrates two of the problems with requiring an explic
I think that the basis of your concern is a misunderstanding of the
proposal (at least as documented in the PEP).
On 9/11/06, Jim Jewett <[EMAIL PROTECTED]> wrote:
> On 9/10/06, Paul Prescod <[EMAIL PROTECTED]> wrote:
>
> > encodingdetection.setdefaultfileencoding
> > encodingdetection.registeren
On 9/11/06, Antoine Pitrou <[EMAIL PROTECTED]> wrote:
> Le lundi 11 septembre 2006 à 11:22 -0700, Michael Chermside a écrit :
> > The idea of a standard edu library though is a GREAT one. That would
> > provide a standard place for things like raw_input() (with a better
> > name) as well as lots of
On 9/11/06, Paul Moore <[EMAIL PROTECTED]> wrote:
> On 9/11/06, Michael Chermside <[EMAIL PROTECTED]> wrote:
> > Paul Prescod writes:
> > [... Pre-PEP proposal ...]
> >
> > Quick thoughts:
>
> My quick thoughts on this whole subject:
>
> * Yes, it should be "open". Anything else feels like gra
"Paul Moore" <[EMAIL PROTECTED]> writes:
> Of course, I'm in the useful position of having an OS default
> character set which contains ASCII as a subset. I don't know what
> issues someone with Greek/Russian/Japanese or whatever as an OS
> default would have (one thought - if your default charact
Paul Prescod wrote:
> On 9/10/06, David Hopwood <[EMAIL PROTECTED]> wrote:
>
>> ... if you think that guessing based on content is a good idea -- I
>> don't. In any case, such guessing necessarily depends on the expected file
>> format, so it should be done by the application itself, or by a libra
Paul Prescod wrote:
> The PEP doesn't deal with streams. It is about files.
An important part of the Unix design philosophy (partially adopted by Windows)
is to make streams and files behave as similarly as possible. It is quite
feasible to make *some* detection algorithms work for streams, and th
"Paul Prescod" <[EMAIL PROTECTED]> writes:
>> The bizarre Windows behaviour of using different
>> encodings for console and GUI programs doesn't
>> bother me either. Really. I promise."
>
> So according to this philosophy, Windows and Mac users will probably
> never be able to open UTF-8 documents
On 9/11/06, David Hopwood <[EMAIL PROTECTED]> wrote:
> > I disagree. If a non-trivial file can be decoded as a UTF-* encoding
> > it probably is that encoding.
>
> That is quite false for UTF-16, at least. It is also false for short UTF-8
> files.
True UTF-16 (as opposed to UTF-16 BE/UTF 16 LE) fi
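
The objection quoted above (that a successful decode is weak evidence for UTF-16, and for short UTF-8 files) can be illustrated with a small sketch. The sample byte strings are made up for illustration and are not taken from the thread.

```python
# Bytes that were written as plain ASCII text (illustrative sample).
ascii_bytes = b"hello world!"

# The same bytes also decode without error as UTF-16-LE, so
# "it decoded" says little about whether the file really is UTF-16.
as_utf16 = ascii_bytes.decode("utf-16-le")

# A two-byte Latin-1 file ("A with tilde" + "copyright sign") that is
# also valid UTF-8, where it reads as a single "e with acute accent".
latin1_bytes = "\u00c3\u00a9".encode("latin-1")
as_utf8 = latin1_bytes.decode("utf-8")

print(len(as_utf16), ascii(as_utf8))
```

Longer non-ASCII text is much less likely to pass a strict UTF-8 decode by accident, which is the case the "non-trivial file" claim relies on.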
On 9/11/06, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote:
> "Paul Prescod" <[EMAIL PROTECTED]> writes:
>
> >> The bizarre Windows behaviour of using different
> >> encodings for console and GUI programs doesn't
> >> bother me either. Really. I promise."
> >
> > So according to this philosoph
Guido van Rossum wrote:
> from scripting import raw_input, autotextfile
>
> I'm not so keen on 'scripting' as the name either, but I'm sure we can
> come up with something. Perhaps easyio, simpleio or basicio? (Not to
> be confused with vbio. :-)
>
> I'm also not completely against revising t
Marcin 'Qrczak' Kowalczyk wrote:
> It's lazily instantiated today (see PyErr_NormalizeException).
Only in C code, though, not Python. And if the
separate type/value specification when raising
goes away, it might not be possible any more
even in C.
> 'WithExit' constructs a unique exception objec
Guido van Rossum wrote:
> if possible I'd like the
> guessing function to have access to what was in the file before it was
> emptied by the "create" function, or what's at the start before
> appending to the end,
Which further suggests that the encoding-guesser
needs to be fairly intimately built
Guido van Rossum wrote:
> All sorts of things are different when reading stdin vs. opening a
> filename. e.g. stdin may be a pipe.
Which suggests that if anything is going to try
to guess the encoding, it would be better for it
to start reading from the actual stream you're
going to use and buffe
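
A minimal sketch of the "read from the actual stream and buffer" idea, assuming the `io` module of present-day Python 3 (which did not exist when this thread was written). The helper name `sniff_utf8_bom` is hypothetical, and `BytesIO` only stands in for a pipe.

```python
import codecs
import io


def sniff_utf8_bom(stream):
    """Return True if the buffered stream starts with a UTF-8 BOM.

    peek() returns buffered bytes without advancing the read position,
    so it also works for streams that cannot seek, such as pipes.
    """
    head = stream.peek(len(codecs.BOM_UTF8))
    return head.startswith(codecs.BOM_UTF8)


raw = io.BytesIO(codecs.BOM_UTF8 + b"data")  # stand-in for a pipe
stream = io.BufferedReader(raw)

has_bom = sniff_utf8_bom(stream)
rest = stream.read()  # the sniffed bytes are still there to be read
```

Because the detector only looks into the reader's own buffer, nothing has to be re-opened or rewound afterwards.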
Michael Chermside wrote:
> The idea of a standard edu library though is a GREAT one. That would
> provide a standard place for things like raw_input() (with a better
> name) as well as lots of other "helper functions" useful to beginners
> and/or students -- and all it would cost is a single line
Anders J. Munch wrote:
> any file that supports seeking to the end will also support
> reporting the file size. Thus
> f.seek(f.length)
> should suffice,
Although the micro-optimisation circuit in my
brain complains that it will take 2 system
calls when it could be done with 1...
--
Greg Ewin
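
The trade-off can be sketched with the file API of present-day Python. This is only an illustration: `f.length` in the quoted message is a proposed attribute, so `os.fstat` stands in for it here, and the buffered file layer may issue additional calls of its own.

```python
import os
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()

    # Two steps: ask for the size, then do an absolute seek.
    size = os.fstat(f.fileno()).st_size
    pos_absolute = f.seek(size)

    # One step: seek relative to the end of the file.
    f.seek(0)
    pos_from_end = f.seek(0, os.SEEK_END)

print(pos_absolute, pos_from_end)
```

Both approaches end at the same offset; the difference is only in how many requests are needed to get there.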
Le lundi 11 septembre 2006 à 18:16 -0700, Paul Prescod a écrit :
> On Unix, VIM is also set up to auto-detect UTF-8 (using the BOM or
> a full decoding attempt). According to Google, XEmacs also has some
> kind of UTF-8/BOM detector but I don't know the details. GNU Emacs:
> According to "Emacs wi
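
The "BOM or full decoding attempt" strategy mentioned above can be sketched as follows. This is not how VIM implements it; the function name `looks_like_utf8` is hypothetical and the check is deliberately simple.

```python
import codecs


def looks_like_utf8(data):
    """Guess UTF-8 by BOM first, then by attempting a full decode."""
    if data.startswith(codecs.BOM_UTF8):
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_utf8(codecs.BOM_UTF8 + b"x"))
print(looks_like_utf8(b"caf\xe9"))  # Latin-1 bytes, not valid UTF-8
```

Pure ASCII input also passes this check, so a positive answer only means that the data is consistent with UTF-8.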
29 matches