On Wed, Jun 24, 2020 at 10:20:48AM +0200, Hans Åberg wrote:
>
>
> The problem is that I haven't changed my environment variable.
>
> > LC_ALL=UTF-8
> …
> > LC_ALL=fr_FR.UTF-8
>
> I pointed that out: there is a double bug, locale-dependent generation of
> the parser file, and reliance on software that can't handle LC_CTYPE=UTF-8.
On (at least) Linux using glibc, LC_CTYPE requires a valid locale name.
And UTF-8 on its own is only a codeset, not a valid locale name (compare
fr_FR.UTF-8 above).
A quick search on Google suggests that LC_CTYPE controls, among other
things, which characters count as letters and how lowercase/uppercase
conversion works.
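
To see whether a given name is accepted for LC_CTYPE on a particular
system, something like the following can be used. It is only a minimal
sketch, assuming Python 3; the helper name ctype_accepts is made up for
illustration. I would expect a bare UTF-8 to be rejected with glibc, but
the answer depends on the C library and on which locales are installed,
so no output is shown.

```python
import locale


def ctype_accepts(name):
    """Return True if setlocale(LC_CTYPE, name) succeeds on this system."""
    # Calling setlocale with no value only queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        # Raised when the C library does not recognise the locale name.
        return False
    finally:
        # Put the original setting back so the probe has no side effects.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    for name in ("C", "UTF-8", "fr_FR.UTF-8"):
        print(name, ctype_accepts(name))
```
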
Taking an easy case, with languages written in Latin alphabets: what
is the uppercase of 'i'? In Turkey it is İ (with a dot), because
in Turkish dotted i and dotless ı are different letters.
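
The dotted/dotless case can be illustrated without touching the locale
at all. This is a sketch, assuming Python 3, whose str.upper() does not
consult LC_CTYPE; upper_tr is a hypothetical helper written for this
example, not a library function, and it only handles the two letters
discussed here.

```python
# Turkish uppercases dotted 'i' to U+0130 (capital I with dot above).
# Dotless U+0131 already uppercases to plain 'I' under the default rules.
TURKISH_UPPER = {ord("i"): "\u0130"}


def upper_tr(text):
    """Uppercase text using the Turkish rule for dotted i (illustration only)."""
    return text.translate(TURKISH_UPPER).upper()


if __name__ == "__main__":
    print("i".upper())      # default, locale-independent rule: I
    print(upper_tr("i"))    # Turkish rule: capital I with dot above
    print(upper_tr("\u0131"))  # dotless i becomes plain I
```
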
ĸen
--
He died at the console, of hunger and thirst.
Next day he was buried, face-down, nine-edge first.
- the perfect programmer