On Mon, Feb 11, 2002, Ilya Konstantinov wrote about "Re: Linux filenames with definite 
encoding (Was: FTP server with intl support)":
> I wasn't suggesting readdir should have another argument to specify the
> desired encoding, but rather that a standard encoding should be chosen.
> e.g. the ext2 standard should be revised to allow specifying that a
> given filename is in Unicode encoding.
> 
> Eventually, glibc should offer u_readdir() and readdir().
> u_readdir() would return the filename in UTF-8 encoding (by asking the
> kernel for the Unicode filenames via the new syscall)
> readdir() would also call the kernel's new syscall and then convert the
> filenames to the locale's encoding.

Next you'll be asking for u_read() and u_write() for reading/writing Unicode
text from files...

No, UNIX traditionally operates on strings of "chars" (bytes/octets). System
calls never give special treatment to any byte except NUL (and "/" in
pathnames): not NL/CR, not ASCII vs. non-ASCII, nor anything of that sort.
The only thing "required" of a filename encoding is that it leave NULs and
slashes alone (i.e., the encoded form of any other character must never
contain a slash or NUL byte), and both UTF-8 and the ISO-8859-* encodings
indeed have that property.
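
To make the "leave NULs and slashes alone" requirement concrete, here is a
small Python sketch (my own illustration; the helper name
leaks_reserved_bytes and the sample string are made up for this example). It
encodes a name containing neither NUL nor "/" and checks whether either byte
shows up anyway:

```python
# Bytes the kernel treats specially in pathnames: NUL and "/".
RESERVED = {0x00, 0x2F}

def leaks_reserved_bytes(text, encoding):
    """Return True if encoding `text` produces a NUL or "/" byte even
    though `text` itself contains neither character."""
    if "\0" in text or "/" in text:
        raise ValueError("sample text must not contain NUL or '/'")
    return any(b in RESERVED for b in text.encode(encoding))

# ASCII letters plus the Hebrew word "shalom" (U+05E9 U+05DC U+05D5 U+05DD).
sample = "readme-\u05e9\u05dc\u05d5\u05dd"

if __name__ == "__main__":
    for enc in ("utf-8", "iso-8859-8", "utf-16-le"):
        print(enc, leaks_reserved_bytes(sample, enc))
```

UTF-8 and ISO-8859-8 come out clean, while UTF-16 does not: every ASCII
letter in the sample is encoded with a zero byte, which is why an encoding
like that cannot be passed through the byte-oriented system calls as is.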

Having filenames in (say) UTF-8 should be a convention, just like putting
binaries in (say) "/usr/bin" is a convention: it isn't a requirement of
the kernel, and not even a requirement of the filesystem.
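
A quick way to see that the encoding is only a convention is to hand the
kernel a name that is not valid UTF-8 and read the directory back as raw
bytes. The sketch below is only an illustration under assumptions: the
helper roundtrip_name is hypothetical, and it assumes a Linux filesystem
such as ext2 or tmpfs (some other systems do reject non-UTF-8 names).

```python
import os
import tempfile

def roundtrip_name(raw_name):
    """Create an empty file called `raw_name` (bytes) in a fresh temporary
    directory and return the names the directory listing reports."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        path = os.path.join(tmp_bytes, raw_name)
        with open(path, "wb"):
            pass  # creating the file is enough
        # A bytes argument makes listdir return the names as raw bytes,
        # exactly as stored, with no decoding step.
        return os.listdir(tmp_bytes)

if __name__ == "__main__":
    # "shalom" in ISO-8859-8; these four bytes are not valid UTF-8.
    name = "\u05e9\u05dc\u05d5\u05dd".encode("iso-8859-8")
    print(roundtrip_name(name))
```

The name comes back byte for byte as it was written; nothing in between
tried to interpret it, so which encoding those bytes are "in" is purely an
agreement among the programs that display them.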


-- 
Nadav Har'El                        |      Monday, Feb 11 2002, 29 Shevat 5762
[EMAIL PROTECTED]             |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |Someone offered you a cute little quote
http://nadav.harel.org.il           |for your signature? JUST SAY NO!
