Re: [Python-3000] Help on text editors

2006-10-03 Thread Martin v. Löwis
David Hopwood wrote: >> If you have access to "German Windows XP", "Japanese Windows XP", > > Since Win2K there is actually no such thing, from a technical point of view -- > just Win2K or WinXP with a German or Japanese "language group" installed, > and a corresponding locale selected as the in

Re: [Python-3000] Help on text editors

2006-10-03 Thread Martin v. Löwis
Antoine Pitrou wrote: > Ok, I hexdump'ed a few .mo files (the gettext-compatible files which > contain translation strings) and the result is a bit funny: > Gnome/KDE .mo files use utf-8, while .mo files for various command-line > tools (e.g. aspell) use iso-8859-15. This is a gettext feature: g

Re: [Python-3000] string C API

2006-10-03 Thread Martin v. Löwis
Jim Jewett wrote: >>> Interning may get awkward if multiple encodings are allowed within a >>> program, regardless of whether they're allowed for single strings. It >>> might make sense to intern only strings that are in the same encoding >>> as the source code. (Or whose values are limited to

Re: [Python-3000] sys.stdin and sys.stdout with textfile

2006-10-03 Thread Martin v. Löwis
Greg Ewing wrote: >> All sorts of things are different when reading stdin vs. opening a >> filename. e.g. stdin may be a pipe. > > Which suggests that if anything is going to try > to guess the encoding, it would be better for it > to start reading from the actual stream you're > going to use an

Re: [Python-3000] string C API

2006-10-03 Thread Jim Jewett
On 10/3/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Jim Jewett wrote: > > The problem isn't the hash; it is the equality. Which encoding do you > > keep interned? When I wrote this, I had been assuming that UCS4(string) and UCS2(string) would be completely unrelated objects. With more

Re: [Python-3000] string C API

2006-10-03 Thread Martin v. Löwis
Jim Jewett wrote: > In python 3, a string object might look like > > #define PyObject_str_HEAD \ >PyObject_VAR_HEAD \ >long ob_shash; \ >PyObject *cache; > > with a typical concrete implementation looking like > > typedef struct { >PyObject_str_HEAD >PyObject *encodin

Re: [Python-3000] string C API

2006-10-03 Thread Jim Jewett
On 10/3/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Jim Jewett wrote: > > In python 3, a string object might look like > > #define PyObject_str_HEAD \ > >PyObject_VAR_HEAD \ > >long ob_shash; \ > >PyObject *cache; > > with a typical concrete implementation looking like

Re: [Python-3000] string C API

2006-10-03 Thread Martin v. Löwis
Jim Jewett wrote: > By knowing that there is only one possible representation for a given > string, he skips the equivalency cache. On the other hand, he also > loses the equivalency cache. What is an equivalency cache, and why would one like to have one? > When python 2.x chooses the unicode

Re: [Python-3000] string C API

2006-10-03 Thread Jim Jewett
On 10/3/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Jim Jewett wrote: > > By knowing that there is only one possible representation for a given > > string, he skips the equivalency cache. On the other hand, he also > > loses the equivalency cache. > What is an equivalency cache, and why

Re: [Python-3000] Help on text editors

2006-10-03 Thread David Hopwood
Martin v. Löwis wrote: > David Hopwood wrote: > >>>If you have access to "German Windows XP", "Japanese Windows XP", >> >>Since Win2K there is actually no such thing, from a technical point of view -- >>just Win2K or WinXP with a German or Japanese "language group" installed, >>and a correspondi

Re: [Python-3000] string C API

2006-10-03 Thread Josiah Carlson
"Jim Jewett" <[EMAIL PROTECTED]> wrote: > On 10/3/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > Jim Jewett schrieb: > > > By knowing that there is only one possible representation for a given > > > string, he skips the equivalency cache. On the other hand, he also > > > loses the equivalen