I think there is a fundamental conflict between "keep language out of it" and "filenames should be case-insensitive." I think x'C1' and x'81' (EBCDIC "A" and "a") are "the same character" only in the context of English and other Romance/Germanic language conventions, where most people would read Apricot and apricot as the same fruit.
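
To make that concrete, here is a rough Python sketch. It assumes EBCDIC code page 037 (I only gave the hex values, so the code page is an assumption), and same_ignoring_case is a hypothetical helper of my own, not anything a real file system uses. It applies Unicode full case folding, which is one possible rule among several.

```python
# Assumption: the bytes are EBCDIC code page 037.
upper_a = bytes([0xC1]).decode("cp037")
lower_a = bytes([0x81]).decode("cp037")


def same_ignoring_case(a: str, b: str) -> bool:
    """Hypothetical filename comparison using Unicode full case folding."""
    return a.casefold() == b.casefold()


print(upper_a, lower_a)                          # A a
print(upper_a == lower_a)                        # False: two different bit patterns
print(same_ignoring_case(upper_a, lower_a))      # True: equal only by convention
print(same_ignoring_case("Apricot", "apricot"))  # True

# The convention stops being obvious outside English.
print(same_ignoring_case("STRASSE", "straße"))   # True: German ß folds to "ss"
print(same_ignoring_case("I", "ı"))              # False: a Turkish reader pairs I with dotless ı
```

The last two lines are the point: whichever folding rule gets picked, it encodes some language's idea of "the same letter."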
What do you do about the equivalence in Unicode of é as U+00E9 but also as U+0065 + U+0301? I *personally* like case-insensitive filenames, but I also think they are a bottomless morass if you are defining a system that will have global and/or Unicode usage. The list of "these two different bit patterns mean the same thing to a human" cases is endless.

Charles

On Sat, 25 Oct 2025 18:13:54 +0200, Thomas Berg <[email protected]> wrote:

>From my pov file names shouldn't be seen or treated as "language". It may
>use text/words from a language as a convenience and to make the life easier
>for users and developers but the point of it is identification, not
>adherence to language as such.
>If we don't see it in this way we are on a path towards endless problems.
>
>And here in the thread we can see the problems that will arise.
>
>As it is, due to historical reasons, we have implemented english as the
>source of the used character set, with obvious limitations.
>
>And KISS. (Keep It Simple Stupid.)
>Use at most upper case and lower case letters and some useful special
>characters in file names and code. That is A-Z, a-z, 0-9, maybe so called
>"national characters" and some specials.
>Any other needs, e g other languages or more descriptive needs or adherence
>to "correct language" have to be kept in some "meta data" files/file parts.
>
>And I don't know why the original authors in the unix community saw it as
>useful to make a distinction between file names based on the usage of case
>but it will inevitable cause problems due to confusion (as everything that
>looks like part of your language will cause you (=your brain) to treat it
>as such).
>
>If the file name is "Hereisanexample" you will often be confused if there
>is another file with the name "HereisanExample". Take that times 10000 and
>you have a lot of time wasted.
>As I see it, the best solution is have all these file names point to the
>same file:
>Hereisanexample
>HereisanExample
>hereisanexample
>etc
>
>And if you need to use the "key span" of using all characters for a file
>name, like password like formats, hashes, and system uses - have a
>convention for it in the file system like (as in a known usage) having a
>special character as the first like ".something". But I would prefer a char
>that is more seldom used in a natural language.
>
>About the other needs as I mentioned above and "meta data" it needs to be
>somewhat universal like char set id's.
>Anyway in practice we will never get chars like éĕưüşîâ play well in
>neither programming or file systems. Especially when there are users with
>different languages and systems.
>
>
>Thomas Berg
>
>
>"I wash off the hatred of my enemies and the greed and wrath of powerful
>people."
>
>“I clearly saw the skeleton underneath all this show of personality. What
>is left of a man and all his pride but bones?”
>
>On Sat, 25 Oct 2025 17:[email protected] <[email protected]>
>wrote:
>
>> When choosing case insensitivity designers must carefully
>> consider what its scope should be.
>>
>> This is a key point. File names are often mentioned in text (books, email,
>> newsgroups, etc). Sometimes the file name is copied (maybe cut & paste)
>> from code examples, and sometimes it is simply typed by the author. Should
>> text processors "recognize" that the text is a file name and automatically
>> convert it to upper case? Or convert it to lower case to look better in the
>> middle of a paragraph?
>>
>> There is another part to text cases: terminal keyboards, and not all of
>> these are "standard" English. And, of course, some languages are
>> "right-to-left" instead of the "left-to-right" that most of us are
>> accustomed to use. Should text processors somehow recognize when a file
>> name is being discussed and provide special handling?
>> Sometimes a particular case is important for recognition (DeLorenko vs
>> DELORENKO or delorenko could make a customer unhappy!) As mentioned
>> already, automatic case changes are not clearly defined in some languages.
>>
>> "Text processor" can mean anything from ISPF edit to an expensive
>> "professional" author's tool. (I use both; many of us use a wide range of
>> these tools.)
>>
>> While it was less true in "7 bit ASCII days" we should remember that the
>> computer world is a world-wide concept today. I can grasp how "upper case"
>> happened in keypunch days (no lower case) and "7 bit" days but, IMHO, it is
>> unfortunate that z/OS has stuck with some upper case restrictions. (Of
>> course, changing this now might cause nightmares in some production
>> operations!)
>>
>> Bill Ogden
>>
>> ----------------------------------------------------------------------
>> For IBM-MAIN subscribe / signoff / archive access instructions,
>> send email to [email protected] with the message: INFO IBM-MAIN
