Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>> Let's not 'fix' it (not carve it on a stone), but offer a few
>> well-thought-out options. For instance, Perl may offer (not that these
>> are particularly well-thought-out) 'just treat this as a sequence of
>> octets', 'locale', and 'unicode'. 'locale' on Unix means multibyte
>> encoding returned by nl_langinfo(CODESET) or equivalent. On Windows,
>> it's whatever 'A' APIs accept or is returned by ACP_??(). 'unicode'
>> is utf8 on Unix-like OS, BeOS and 'utf-16(le)' on Windows.
>
>Something like that could work, yes.

Agreed.

>
>> creating files with UTF-8 names while still using en_GB.ISO-8859-1
>> locale. Why does Perl have to be held responsible for your intentional
>> act that is bound to break things?
>
>Whoa! It's the other way round here. Nick is using a locale that suits
>him for other reasons (e.g. getting time and data formats in proper
>British ways), but why should he be constrained not to use for his
>filenames whatever he wants?

I was at least partly being a devil's (UTF-8) advocate anyway, and to
that end Jungshik Shin's intervention saying "use a UTF-8 locale" is
positive.

When I want non-ASCII it is for one of the following:
  For phonetics for the speech synthesis stuff
  To represent the Euro currency symbol
  To typeset mother-in-law's Welsh poetry
  For cross-references for Japanese customers of the day job

There is no "locale" for phonetics. There is one for Euro issues, of
course, but setting my locale to "cy_GB" so I can name files by poem is
going to render dates and the like opaque to me, the user; likewise for
Japanese.

So for _my_ use UTF-8 is what I want - but I _don't_ want some
locale-derived multi-byte guess. Unicode suits me.

>
>> Well, actually, if your WinXP file system has only characters covered
>> by Windows-1252,

Well, AFAIK there isn't a Windows code page that covers Welsh accented
characters (and certainly not if you mix in phonetics).
The shared drives I mount at work have users who are native speakers of
not only English, Italian, Norwegian and Swedish, but also two kinds of
Chinese and various Indian languages - and we have Japanese customers.
So even in a small English startup cp1252 does not give them all the
freedom to give files natural names.

>
>And how would Nick know that, or he could he guarantee that, if the
>Windows share is in multiuser use?
>
>PLEASE, PEOPLE: stop thinking of this in terms of an environment
>controlled solely by one user.

Exactly - a file system should be able to cope even if files are named
in English, Welsh, Chinese, ...

So IMHO perl's -d etc. should be helping the move to Unicode, not
pandering to multi-byte compromises. I have no objection to some way to
name files in Shift-JIS if that has been done, but I hope "unicode"
becomes the common practice.
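
To make the three options quoted at the top ('octets', 'locale', 'unicode') a little more concrete, here is a minimal sketch in Python. It is not Perl's behaviour or any proposed Perl API. The names encode_filename, fits_codeset and their parameters are hypothetical, and the locale codeset is taken from Python's locale.getpreferredencoding() as a stand-in for nl_langinfo(CODESET) or its Windows equivalent.

```python
import locale


def encode_filename(name, policy, codeset=None, windows=False):
    """Return the bytes a file name would be stored as under one policy.

    Hypothetical helper for illustration only; not an API of Perl or Python.
    """
    if policy == "octets":
        # 'Just treat this as a sequence of octets': no conversion at all.
        if not isinstance(name, bytes):
            raise TypeError("the 'octets' policy expects bytes")
        return name
    if policy == "locale":
        # Assumption: use the locale's preferred encoding unless the caller
        # names a codeset explicitly (handy for trying out other locales).
        return name.encode(codeset or locale.getpreferredencoding(False))
    if policy == "unicode":
        # UTF-8 on Unix-like systems, UTF-16LE on Windows, per the quoted text.
        return name.encode("utf-16-le" if windows else "utf-8")
    raise ValueError(f"unknown policy: {policy!r}")


def fits_codeset(name, codeset):
    """True if every character of name can be encoded in codeset."""
    try:
        name.encode(codeset)
    except UnicodeEncodeError:
        return False
    return True


welsh_name = "t\u0177.txt"  # a Welsh name containing y-circumflex (U+0177)
print(encode_filename(welsh_name, "unicode"))
print(fits_codeset(welsh_name, "cp1252"))
```

With this sample name the 'unicode' policy always yields a usable byte string, while the cp1252 check fails, which is the situation described above for Welsh names on a share limited to Windows-1252.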