Re: [Python-3000] The future of exceptions

2006-09-10 Thread Marcin 'Qrczak' Kowalczyk
Greg Ewing <[EMAIL PROTECTED]> writes: > Flow control exceptions typically don't need most of the exception > machinery -- they don't carry data of their own, so you don't need > to instantiate a class every time, It's lazily instantiated today (see PyErr_NormalizeException). > Or maybe there sh

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Antoine Pitrou
On Saturday 09 September 2006 at 20:29 -0700, Paul Prescod wrote: > The type could be a true encoding or one of a small set of additional > symbolic values. The two main symbolic values are: Actually your proposal has three ;) > For example, a Japanese school teacher using Windows might default >

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Oleg Broytmann
On Sat, Sep 09, 2006 at 08:29:05PM -0700, Paul Prescod wrote: > "the protocol header says that this data is latin-1"). "Protocol metadata" if you allow me to suggest the word. Oleg. -- Oleg Broytmann http://phd.pp.ru/ [EMAIL PROTECTED] Programmers don't d

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Antoine Pitrou
> The Site Decoding Hook > > > The "sys" module could have a function called > "setdefaultfileencoding". The encoding specified could be a true > encoding name or one of the encoding detection scheme names (e.g. > "guess" or "XML"). Isn't it more intuitive to gather fu

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Nick Coghlan
Antoine Pitrou wrote: > So, here is an alternative proposal : > Make it so that textfile() doesn't recognize system-wide defaults (as in > your proposal), but also provide autotextfile() which would recognize > those defaults (with a by_content=False optional argument to enable > content-based gues

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Nick Coghlan
Paul Prescod wrote: > The function to open a text file will tentatively be called textfile(), > though the function name is not an integral part of this PEP. The > function takes three arguments, the filename, the mode ("r", "w", "r+", > etc.) and the type. > > The type could be a true encoding

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Antoine Pitrou
On Sunday 10 September 2006 at 21:58 +1000, Nick Coghlan wrote: > Antoine Pitrou wrote: > > So, here is an alternative proposal: > > Make it so that textfile() doesn't recognize system-wide defaults (as in > > your proposal), but also provide autotextfile() which would recognize > > those defa

[Python-3000] encoding='guess' ?

2006-09-10 Thread Antoine Pitrou
Hi, Let me add that 'guess' should probably be forbidden as an encoding parameter (instead, a separate function argument should be used as in my proposal). Here is a schematic example to show why : def append_text(filename, encoding): src = textfile(filename, "r", encoding) my_text = sr

Re: [Python-3000] encoding='guess' ?

2006-09-10 Thread Nick Coghlan
Antoine Pitrou wrote: > Hi, > > Let me add that 'guess' should probably be forbidden as an encoding > parameter (instead, a separate function argument should be used as in my > proposal). > > Here is a schematic example to show why : > > def append_text(filename, encoding): > src = textfile(

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread David Hopwood
Antoine Pitrou wrote: > On Sunday 10 September 2006 at 21:58 +1000, Nick Coghlan wrote: >>Antoine Pitrou wrote: >> >>>So, here is an alternative proposal: >>>Make it so that textfile() doesn't recognize system-wide defaults (as in >>>your proposal), but also provide autotextfile() which would

Re: [Python-3000] encoding='guess' ?

2006-09-10 Thread Antoine Pitrou
On Sunday 10 September 2006 at 23:44 +1000, Nick Coghlan wrote: > Interesting. This goes back more towards the model of "no default encoding, > but provide the right tools to make it easy for a program to choose one in the > absence of any metadata". In the "clean" API yes. But it would

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Antoine Pitrou
On Sunday 10 September 2006 at 14:52 +0100, David Hopwood wrote: > > On the other hand "autotextfile('myfile.txt', by_content=True)" would > > enable content-based guessing, thus be equivalent to Paul's > > "encoding='guess'". > > As I pointed out earlier, any file open function that guesses t

[Python-3000] sys.stdin and sys.stdout with textfile

2006-09-10 Thread Antoine Pitrou
Hi, Another aspect of the textfile discussion. sys.stdin and sys.stdout are for now, concretely, byte streams (AFAIK, at least under Unix). Yet it must be possible to read/write text to and from them. So two questions: - Is there a builtin text.stdin / text.stdout counterpart to sys.stdin / sys

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Marcin 'Qrczak' Kowalczyk
"Paul Prescod" <[EMAIL PROTECTED]> writes: > The type could be a true encoding or one of a small set of additional > symbolic values. The two main symbolic values are: Here is a counter-proposal. There is a variable sys.default_encoding. It's used by file opening functions when the encoding is n

Re: [Python-3000] sys.stdin and sys.stdout with textfile

2006-09-10 Thread Guido van Rossum
On 9/10/06, Antoine Pitrou <[EMAIL PROTECTED]> wrote: > Another aspect of the textfile discussion. > sys.stdin and sys.stdout are for now, concretely, byte streams (AFAIK, > at least under Unix). No, they are conceptually text streams, because that's what they are on Windows, which the only remain
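
A minimal sketch of the byte-stream versus text-stream distinction discussed in this exchange. It uses `io.TextIOWrapper` from the standard library of later Python 3 releases, which postdates the thread; the in-memory `io.BytesIO` objects are stand-ins for byte-oriented stdin/stdout, and the choice of UTF-8 is an assumption made for illustration only.

```python
import io

# Stand-in for a byte-oriented input stream holding UTF-8 bytes.
raw_in = io.BytesIO("café\n".encode("utf-8"))

# Wrapping the byte stream yields a text stream: reads return str, not bytes.
text_in = io.TextIOWrapper(raw_in, encoding="utf-8")
line = text_in.readline()

# The same wrapper works for output: str goes in, encoded bytes come out.
# newline="\n" disables platform-specific newline translation on write.
raw_out = io.BytesIO()
text_out = io.TextIOWrapper(raw_out, encoding="utf-8", newline="\n")
text_out.write(line)
text_out.flush()  # push buffered text down to the underlying byte stream
written = raw_out.getvalue()
```

The encoding has to come from somewhere (an explicit argument here), which is the question the surrounding textfile() threads are arguing about.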

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Guido van Rossum
On 9/10/06, Nick Coghlan <[EMAIL PROTECTED]> wrote: > The 'additional symbolic values' should be implemented as true encodings > (i.e., it should be possible to look up 'site', 'guess' and 'locale' in the > codecs registry, and replace them there as well). That's hard to do since guessing, at leas

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Guido van Rossum
On 9/10/06, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote: > Here is a counter-proposal. > > There is a variable sys.default_encoding. It's used by file opening > functions when the encoding is not specified explicitly, among others. > Its initial value is set in site.py with a site-specific

Re: [Python-3000] iostack, second revision

2006-09-10 Thread Anders J. Munch
Nick Coghlan wrote: > Jim Jewett wrote: >> Why not just borrow the standard symbolic names of cur and end? >> >> seek(pos=0) >> seek_cur(pos=0) >> seek_end(pos=0) I say drop seek_cur and seek_end altogether, and keep only absolute seek. The C library caters for archaic file syst
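
A rough sketch of what "absolute seek only" could look like for byte-oriented files: relative and end-relative positioning are expressed through `tell()` and a known size. The helper names `seek_relative` and `seek_from_end` are hypothetical and are not part of the iostack proposal; the message above is truncated, so this illustrates the general idea rather than its full argument.

```python
import io

def seek_relative(f, offset):
    """Move by `offset` bytes from the current position using an absolute seek."""
    return f.seek(f.tell() + offset)

def seek_from_end(f, size, offset=0):
    """Move to `offset` bytes relative to the end, given the known size in bytes."""
    return f.seek(size + offset)

data = b"0123456789"
f = io.BytesIO(data)

f.seek(4)                 # absolute position 4
seek_relative(f, 3)       # now at position 7
after_relative = f.read(1)

seek_from_end(f, len(data), -2)   # now at position 8
after_end = f.read(1)
```

This only holds where positions are plain byte offsets; for a real file the size would have to be obtained separately (for example from the file system).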

Re: [Python-3000] The future of exceptions

2006-09-10 Thread Josiah Carlson
Greg Ewing <[EMAIL PROTECTED]> wrote: > Or maybe there should be a different mechanism altogether > for non-local gotos. I'd like to see some kind of "longjmp" > object that could be invoked to cause a jump back to > a specific place. That would help alleviate the problem > that exceptions used fo

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
Suggestion accepted. On 9/10/06, Oleg Broytmann <[EMAIL PROTECTED]> wrote: On Sat, Sep 09, 2006 at 08:29:05PM -0700, Paul Prescod wrote: > "the protocol header says that this data is latin-1"). "Protocol metadata" if you allow me to suggest the word. Oleg. -- Oleg Broytmann http://phd

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
I went based on the current setdefaultencoding. But it seems that we will accumulate 3 or 4 related functions, so I'm persuaded that there should be a module. encodingdetection.setdefaultfileencoding encodingdetection.registerencodingdetector encodingdetection.guessfileencoding(filename) encodingdetect

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Josiah Carlson
David Hopwood <[EMAIL PROTECTED]> wrote: > Here is a very simple, reasonably (although not completely) safe, and much > more predictable guessing algorithm, based on a generalization of > : > >Let A, B, C, and D be the first 4 bytes of the stream, o

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
On 9/10/06, Nick Coghlan <[EMAIL PROTECTED]> wrote: Paul Prescod wrote: > The function to open a text file will tentatively be called textfile(), > though the function name is not an integral part of this PEP. The > function takes three arguments, the filename, the mode ("r", "w", "r+", > etc.) and the

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
I don't mind your name of autotextfile but I think that your by_content argument defeats the goal of having a very simple API for quick and dirty stuff. If content detection is a good idea (usually right) then we should do it. If it isn't, we shouldn't. I don't see a need for an argument to turn it

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
On 9/10/06, David Hopwood <[EMAIL PROTECTED]> wrote: Here is a very simple, reasonably (although not completely) safe, and much more predictable guessing algorithm, based on a generalization of: Your algorithm is more predictable but will confuse BOM-less

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Antoine Pitrou
On Sunday 10 September 2006 at 12:02 -0700, Paul Prescod wrote: > Your algorithm is more predictable but will confuse BOM-less UTF-8 > with the system encoding frequently. I don't think it is desirable to acknowledge only some kinds of UTF-8. It will confuse the hell out of programmers, and u

[Python-3000] educational aspects of Python 3000

2006-09-10 Thread Toby Donaldson
Hello, There's been an explosion of discussion on the EDU-SIG list recently about the removal of raw_input and input from Python 3000. For teaching purposes, many educators report that they like raw_input (and input). The basic argument is that, for beginners, code like name = raw_input('Mo

Re: [Python-3000] content-based detection

2006-09-10 Thread Antoine Pitrou
On Sunday 10 September 2006 at 11:30 -0700, Paul Prescod wrote: > I don't mind your name of autotextfile but I think that your > by_content argument defeats the goal of having a very simple API for > quick and dirty stuff. If content detection is a good idea (usually > right) then we should do

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread David Hopwood
Josiah Carlson wrote: > David Hopwood <[EMAIL PROTECTED]> wrote: > >>Here is a very simple, reasonably (although not completely) safe, and much >>more predictable guessing algorithm, based on a generalization of >>: >> >> Let A, B, C, and D be the firs

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread David Hopwood
Paul Prescod wrote: > Maybe the guessing algorithm should read the WHOLE FILE. That wouldn't work for streams (e.g. stdin). The algorithm I gave does work for streams, provided that they have a push-back buffer of at least 4 bytes. -- David Hopwood <[EMAIL PROTECTED]> _
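
A minimal sketch of byte-order-mark sniffing on a four-byte prefix, to illustrate why a four-byte look-ahead is enough for this kind of check. This is not the algorithm referred to above (its description is truncated in this listing); the function name `sniff_bom` and the returned encoding labels are assumptions for illustration, built on the BOM constants in the standard `codecs` module.

```python
import codecs

# UTF-32 must be tested before UTF-16: the UTF-32-LE BOM (FF FE 00 00)
# begins with the UTF-16-LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(prefix):
    """Return (encoding, bom_length) for the first bytes of a stream.

    Returns (None, 0) when no BOM is present, in which case the caller
    still needs some other source for the encoding.
    """
    for bom, name in _BOMS:
        if prefix.startswith(bom):
            return name, len(bom)
    return None, 0

utf8_result = sniff_bom(codecs.BOM_UTF8 + b"hi")
utf16_result = sniff_bom(b"\xff\xfeh\x00")
plain_result = sniff_bom(b"hi")
```

Note that FF FE 00 00 is inherently ambiguous (a UTF-32-LE BOM, or a UTF-16-LE BOM followed by a NUL character); the sketch resolves it in favour of UTF-32, which is a design choice rather than a certainty.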

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Josiah Carlson
David Hopwood <[EMAIL PROTECTED]> wrote: > Josiah Carlson wrote: [snip] > > Using the xml guessing mechanism is fine, as long as you get it right. > > A first pass with BOM detection and a second pass to "guess" based on > > content in the case that a BOM isn't detected seems to make sense. > >

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
On 9/10/06, Antoine Pitrou <[EMAIL PROTECTED]> wrote: > > ... > > Modern I/O is astonishingly fast anyhow. On my computer it takes five > > seconds to decode a quarter gigabyte of UTF-8 text through Python. > > Maybe we shouldn't be that presumptuous. Modern I/O is fast but memory > is not infinite

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
The PEP doesn't deal with streams. It is about files. On 9/10/06, David Hopwood <[EMAIL PROTECTED]> wrote: > Paul Prescod wrote: > > Maybe the guessing algorithm should read the WHOLE FILE. > > That wouldn't work for streams (e.g. stdin). The algorithm I gave > does work for streams, provided that

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
On 9/10/06, David Hopwood <[EMAIL PROTECTED]> wrote: > Josiah Carlson wrote: > ... if you think that guessing based on content is a good idea -- I don't. > In any case, such guessing necessarily depends on the expected file format, > so it should be done by the application itself, or by a library t

Re: [Python-3000] Pre-PEP: Easy Text File Decoding

2006-09-10 Thread Paul Prescod
On 9/10/06, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote: >... > Other than that, guessing the encoding from the contents of the text > stream, especially statistical guessing based on well-formed UTF-8 > non-ASCII characters, shouldn't be encouraged, because its effect is > not predictabl

Re: [Python-3000] Help on text editors

2006-09-10 Thread Jeff Wilcox
> Great: but what is the default Textedit encoding on a Japanized version of the Mac? > Paul Prescod I'm fairly sure that the settings on the computer I looked at this on are default, but I borrowed the machine so I can't guarantee it. In textpad with OS X set to Japanese there were three choices