Edward H. Trager wrote:
> UTF-8's home directory). So both users could probably guess the filename
> they were looking at.
Which, BTW, is true for most of Europe but is not true for some other combinations of locales.
>
> d??claration_des_droits.utf8
>
> The terminal, being set to interpret the legacy locale, does not know
> how to interpret the two bytes that are used for the UTF-8 "é".
This is well known but is only the start of what the thread was discussing.
Your example only shows a difference in interpretation. You are still able to copy and paste the filename, use it in scripts and open it in any program.
Now switch your locale to Latin 1 and create a file with that name in Latin 1. Switch back to UTF-8 and try doing various things with this file. I assume the following happens:
1 - Instead of being misinterpreted, the letters are lost, which in extreme cases leads to empty filenames.
2 - You cannot open the file by copying its name from the terminal.
3 - You can probably still specify it in scripts (which need to be edited in Latin 1), but if someone started validating scripts in a UTF-8 locale, you would lose that ability.
4 - Most C programs should be able to process the file. But I would not bet on some more 'advanced' languages. The more they comply with Unicode, the less likely it is they will open the file.
5 - Windows is likely to have problems accessing that file.
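To make points 1, 2 and 4 concrete, here is a minimal Python sketch that works on the raw bytes only and never touches a real terminal or filesystem. The filename is my assumption, reconstructed from the "d??claration" example quoted above; what an actual terminal or program does with the invalid byte will vary.

```python
# Assumed original name, reconstructed from the "d??claration" example.
name = "d\u00e9claration_des_droits.utf8"

utf8_bytes = name.encode("utf-8")      # "é" becomes two bytes: 0xC3 0xA9
latin1_bytes = name.encode("latin-1")  # "é" becomes one byte: 0xE9

# UTF-8 name read as Latin 1: misinterpreted (two wrong letters),
# but every byte survives, so copy and paste still works.
misread = utf8_bytes.decode("latin-1")
survives = misread.encode("latin-1") == utf8_bytes

# Latin 1 name read as UTF-8: 0xE9 followed by "c" is not valid UTF-8,
# so a strict decoder rejects the name outright.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD. Encoding what is displayed back
# to bytes does not give the original name, so pasting it cannot open the file.
shown = latin1_bytes.decode("utf-8", errors="replace")
pasted = shown.encode("utf-8")

# Treating the name as opaque bytes (as most C programs do) is lossless.
# Python's "surrogateescape" error handler emulates that behaviour.
escaped = latin1_bytes.decode("utf-8", errors="surrogateescape")
roundtrip = escaped.encode("utf-8", errors="surrogateescape")

print(survives, strict_ok, pasted == latin1_bytes, roundtrip == latin1_bytes)
```

The asymmetry is the point: any byte sequence is valid Latin 1, so the first direction only looks wrong, while the second direction produces bytes that a Unicode-compliant decoder has to reject or replace.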
And, yes, the solution is still to convert all filenames to UTF-8. That is, if all users on a particular system agree that this is what should be done with their files. But that does not prevent such files from being generated, whatever the reason or cause.
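A conversion along those lines could be sketched as follows. This is only an illustration under assumptions of mine: the legacy encoding is taken to be Latin 1, and the helper names `to_utf8_name` and `convert_directory` are made up for this example. It also does not check for name collisions, so it defaults to a dry run.

```python
import os

def to_utf8_name(raw: bytes, legacy: str = "latin-1") -> bytes:
    """Return a UTF-8 version of a filename given as raw bytes.

    Names that already decode as UTF-8 are returned unchanged. This is a
    heuristic: a legacy name that happens to be valid UTF-8 is not detected.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the legacy encoding.
        return raw.decode(legacy).encode("utf-8")

def convert_directory(path: bytes, legacy: str = "latin-1", dry_run: bool = True):
    """List (old, new) byte names in one directory; rename unless dry_run."""
    changes = []
    # Passing bytes to os.listdir returns the names as undecoded bytes.
    for raw in os.listdir(path):
        new = to_utf8_name(raw, legacy)
        if new != raw:
            changes.append((raw, new))
            if not dry_run:
                os.rename(os.path.join(path, raw), os.path.join(path, new))
    return changes
```

Calling `convert_directory(b".")` only reports what would be renamed. Working with byte paths throughout is deliberate, since the names being fixed are exactly the ones that cannot be decoded in a UTF-8 locale.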
Lars