I don't know why this mail came to me, but since I was given a chance....(^^;)

>>>>> "MM" == Motomichi Matsuzaki <[EMAIL PROTECTED]> writes:

MM> * filenames recorded on Unix filesystems (e.g. FFS, MFS) use
MM> an arbitrary codeset, for example Unicode.

Rather, let's use "codepage + codeset" information, so that we can
tell the difference between the Chinese "BONE" character and the
Japanese "BONE" character. WE NEED THEM TO BE DIFFERENT, YOU KNOW.

For example, save filenames using 64 bits per character: a 32-bit
codepage tag, plus the codepoint in UCS-4.

# We have enough disk space and memory to handle them, don't worry.
# And even if we don't now, we will within 2 years.

Normalize the codepage field of those characters that are not
affected by codepage, so that comparison becomes easier.

Many might say it's ridiculous to have a filename encoding different
from the system-call interface's coding system. But this is so only
because BUGGY UNICODE is the current trend. If we had a codeset that
did not need codepages, the problem would not occur. And the very
reason we ended up with this BUGGY UNICODE is that they skimped on
bits. We should not make the same mistake.

So, there are only two choices:

1) Use Unicode for the interface, and use enough bits per character
   internally... like 256 bits/character.

2) Create a Truly Unified coding system, which lets us describe not
   only the currently used languages but also extinct scripts such as
   cuneiform, and use it both internally and for the interface.
   (This also requires a much larger coding space than we have now.
   I think we need 256 bits anyway.)

best regards,
----
Kenichi Okuyama@Tokyo Research Lab, IBM-Japan, Co.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message