Henry Spencer writes:
> If you're not using the raw input, why does this matter?
Many programs use the raw input: for example the kernel (which
doesn't know about encodings at the filename level), 90% of the GNU
fileutils, 50% of the GNU textutils, etc. Others know that the input
is UTF-8 and perform a conversion in order to have a different
internal representation.
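
To make the distinction concrete, here is a minimal Python sketch of
the two kinds of program. The function names are made up for
illustration and are not taken from any of the packages mentioned
above; the sample string is also just an example.

```python
# Hypothetical helpers, for illustration only.

def count_bytes(raw):
    """Raw-input style: treat the data as opaque bytes, never decode."""
    return len(raw)


def count_chars(raw):
    """Converting style: decode UTF-8 into an internal representation
    (here a Python str of code points) before working on it."""
    return len(raw.decode("utf-8"))


# "Grüße" encoded as UTF-8: ü and ß take two bytes each.
sample = b"Gr\xc3\xbc\xc3\x9fe"
print(count_bytes(sample))  # 7
print(count_chars(sample))  # 5
```

The first function keeps working whatever the encoding is; only the
second one cares whether the input is well-formed UTF-8.
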
The point of UTF-8's ASCII compatibility is that software changes are
kept to a minimum. It would be stupid if the kernel had to verify
every filename passed to it via a system call or read from disk to see
whether it's well-formed UTF-8. For the kernel, a filename continues
to be a sequence of bytes with a NUL byte at the end.
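
As a rough sketch of why this works (in Python rather than kernel C,
and with hypothetical function names that do not correspond to any
real kernel code): a byte-oriented path splitter only needs to look
for '/' and NUL, and neither byte can occur inside a multi-byte UTF-8
sequence, because all bytes of such a sequence are 0x80 or above.

```python
# Hypothetical helpers, for illustration only; not actual kernel logic.

def split_path(path):
    """Split a path on '/' bytes, stopping at the first NUL byte.
    The bytes are never decoded or checked for well-formedness."""
    end = path.find(b"\x00")
    if end != -1:
        path = path[:end]
    return [component for component in path.split(b"/") if component]


def multibyte_contains_reserved(text):
    """Return True if any multi-byte UTF-8 sequence in text contains
    a NUL (0x00) or '/' (0x2F) byte. For valid UTF-8 this never happens."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b in (0x00, 0x2F) for b in encoded):
            return True
    return False


name = "/home/bruno/Gr\u00fc\u00dfe.txt".encode("utf-8") + b"\x00"
print(split_path(name))
print(multibyte_contains_reserved("Gr\u00fc\u00dfe \u65e5\u672c\u8a9e"))
```

The splitter returns the last component as the same bytes that were
passed in, so a UTF-8 filename goes through unchanged without the
splitter knowing anything about UTF-8.
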
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/