-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Philippe Verdy
Sent: 14 December 2004 22:47
To: Marcin 'Qrczak' Kowalczyk
Cc: [EMAIL PROTECTED]
Subject: Re: Roundtripping in Unicode

From: "Marcin 'Qrczak' Kowalczyk" <[EMAIL PROTECTED]>
"Arcane Jill" <[EMAIL PROTECTED]> writes:
If so, Marcin, what exactly is the error, and whose fault is it?

It's an error to use locales with different encodings on the same system.

I confess I don't know much about Unix, but still, I'm not sure your assertion (Marcin) makes sense. Unix is a multi-user system. If you log on as User A, then User B's settings are hidden from you, unless User B has explicitly decided to share them. There may even be users of whose existence you are not aware. Unix makes it possible for /you/ to change /your/ locale - but by your reasoning, this is an error unless all other users do so simultaneously. Your reasoning implies that no Unix user should ever change their locale without an absolute guarantee that all other users will do the same at the same moment ... and I don't know if you can ever get such a guarantee. Or maybe you're saying that the error lies with Unix itself. Maybe that's fair comment, but I gather Unix was invented before Unicode, so it can hardly be blamed for breaking Unicode's conceptual model.


But it goes beyond that. Copy a file onto a floppy disc, then physically take that floppy disc to a different Unix machine, log on as "guest" and insert the disc ... Will the filename look the same? It would seem that "the same system" is effectively every Unix machine on the planet, since files may be interchanged between them.
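
To make the floppy disc case concrete, here is a minimal sketch in Python. The filename is made up for illustration, and the two encodings are assumptions standing in for "the locale on the first machine" and "the locale on the second". It only shows that the bytes on the disc stay the same while their displayed form depends on who decodes them.

```python
# Hypothetical filename, created on a machine whose locale uses Latin-1.
name = "café.txt"
stored = name.encode("latin-1")  # the bytes that actually land on the disc


def display_name(raw: bytes, encoding: str) -> str:
    """Decode raw filename bytes as a tool running in that encoding would.

    Bytes that are invalid in the encoding are shown as U+FFFD.
    """
    return raw.decode(encoding, errors="replace")


# Same bytes, read back under two different locale encodings.
as_latin1 = display_name(stored, "latin-1")
as_utf8 = display_name(stored, "utf-8")

print(repr(stored), repr(as_latin1), repr(as_utf8))
```

The Latin-1 reader gets the original name back; the UTF-8 reader gets a replacement character where the accented letter was, because the lone byte 0xE9 is not valid UTF-8.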

The obvious solution is for all Unix machines everywhere to be using the same locale - and it had better be UTF-8. But an instantaneous global switch-over is never going to happen, so we see this gradual switch-over ... and it is during this transition phase that Lars's problem manifests.

Philippe adds...
More simply, I think that it's an error to have the encoding part of any
locale...

which again attaches blame to Unix itself. All very "not my problem", but I think Lars has found that it actually /is/ his problem. (Not that I support his solution).


The system should not depend on them, and for critical things like
filesystem volumes, the encoding should be forced by the filesystem itself,
and applications should mandatorily follow the filesystem rules.

Of course, you are not /really/ suggesting that the Unix kernel be rewritten. But it's hard for me to see how else this could be achieved.


Now think about the web itself: it's really a filesystem, with billions of
users, or trillions of applications, simultaneously using hundreds or thousands
of incompatible encodings... Many resources on the web seem to have valid
URLs for some users but not for others, until URLs are made independent of
any user locale, and then not considered as encoded plain-text but only as
strings of bytes.

Oh yeah - and that too. Well spotted. Jill
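
A small sketch of the URL point, again in Python and again with a made-up path. It assumes two clients that percent-encode the same visible text using different encodings. The calls are the standard library's `urllib.parse.quote` and `unquote_to_bytes`.

```python
from urllib.parse import quote, unquote_to_bytes

path = "/café"  # hypothetical resource name, as a user would type it

# Two clients percent-encode the same text under different encodings.
url_latin1 = quote(path, encoding="latin-1")
url_utf8 = quote(path, encoding="utf-8")

# A server that treats URLs as strings of bytes sees two different resources.
bytes_latin1 = unquote_to_bytes(url_latin1)
bytes_utf8 = unquote_to_bytes(url_utf8)

print(url_latin1, url_utf8)
print(bytes_latin1 == bytes_utf8)
```

The same visible text yields two different URLs, so which one is "valid" depends on which bytes the server happens to hold.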



