2009/9/2 IWAMURO Motonori:
> I want to use UTF-8 throughout.
> Because:
> - a lot of UNIX tools using network (e.g. rsync, scp, ...) treat the
> file name as 8bit byte array.
> - default locale of modern UNIX based OS is *.UTF-8.
> - The file with the filename including the character outside the
> codepage (e.g. files in iTunes folder) can be handled.

I'm minded to agree, but there's a big stumbling block here: many interactive programs in Cygwin do not (yet) support UTF-8, e.g. nano, mutt, and mc. If you try, you get all sorts of funny effects with invalid characters and mispositioned cursors. That's not acceptable as a default.

Which leaves one apparently good solution for the "C" locale:

>> - Use the default Windows codepage for filenames, console, and
>> multibyte functions. This is what happens already if you specify a
>> locale with a language but no charset, e.g. "en". Maximum 1.5
>> compatibility.

On a closely related note, Debian are introducing a "C.UTF-8" locale as a language-neutral locale with a UTF-8 character set. This is useful for choosing UTF-8 without picking up language-specific stuff like sorting rules. See here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=522776. It's a rather lengthy thread, but in the end they did decide to go for it.

Cygwin 1.7, through newlib, already has "C-UTF-8", as well as the likes of "C-ISO-8859-1" or "C-SJIS". So how about replacing the "C-" with "C." in those, considering that Cygwin has no backward compatibility requirement regarding those?

Andy
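
P.S. To make the proposed renaming concrete, here is a minimal sketch in Python. It is not Cygwin or newlib code, and the helper names dotted_name and probe_charset are made up for illustration. The first maps the current "C-<charset>" spelling to the proposed "C.<charset>" one; the second asks the C library which charset a locale name actually selects, assuming a POSIX-like system where nl_langinfo is available. Whether a name such as "C.UTF-8" is accepted depends entirely on the platform.

```python
import locale


def dotted_name(name):
    """Map a "C-<charset>" locale name to "C.<charset>" (hypothetical helper).

    Any other name is returned unchanged.
    """
    if name.startswith("C-"):
        return "C." + name[2:]
    return name


def probe_charset(name):
    """Return the charset the C library reports for `name`, or None.

    None means the platform does not accept that locale name.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # restore what we found


if __name__ == "__main__":
    for old in ("C-UTF-8", "C-ISO-8859-1", "C-SJIS"):
        new = dotted_name(old)
        print(old, "->", new, "charset:", probe_charset(new))
```

On a system without a given locale the last column just shows None, so this is a quick way to see which of the spellings a particular C library accepts.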