On Nov 9, 2010, at 4:29 PM, Reece Dunn wrote:

> You could use autoconf to detect:
> 1/ broken handling of UTF-8 characters by sed;
> 2/ name of LC_ALL flag that handles UTF-8

In theory, you only need to set LC_CTYPE, not any other aspect of the locale. And for that, you don't need the language or country. On Mac OS X, the encoding can be bare, such as LC_CTYPE=UTF-8.

The Makefile used to set LANG, then commit 492ac292b918a3369900532e4edfadaeeba32064 changed it to LC_ALL. That wasn't explained. I assume it was because LANG could be superseded by LC_* variables in the user's environment, and that is undesirable.

Perhaps another approach would be to explicitly unset LC_ALL and export LC_CTYPE=UTF-8.

On Nov 9, 2010, at 4:13 PM, Charles Davis wrote:

> Unfortunately, I just remembered that the name of the UTF-8 encoding is
> different on Mac OS ('UTF-8') and Linux ('utf8').

Are you sure about that? Checking on a couple of Linux systems here, the "locale" command reports:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
...

Hmm. However, using a bare encoding for LC_CTYPE doesn't seem to fly on Linux. Darn, so close to a simple fix. :(

-Ken
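
For illustration only, here is a rough sketch of the "unset LC_ALL, export LC_CTYPE" idea that avoids hard-coding either spelling of the encoding. It is not what the Makefile does. The function names pick_utf8_locale and use_utf8_ctype are hypothetical, and the sketch assumes the system's "locale -a" lists at least one full locale name ending in ".UTF-8" or ".utf8".

```shell
# Hypothetical helper: read locale names on stdin (one per line) and print
# the first whose encoding suffix is ".UTF-8" or ".utf8", case-insensitively.
# Bare names such as "UTF-8" are deliberately not matched, since a bare
# encoding does not seem to be accepted on Linux.
pick_utf8_locale() {
    grep -i -E '\.utf-?8$' | head -n 1
}

# Hypothetical wrapper: drop LC_ALL so it cannot override LC_CTYPE, then
# export LC_CTYPE only if a UTF-8 locale was actually found.
use_utf8_ctype() {
    unset LC_ALL
    found=$(locale -a 2>/dev/null | pick_utf8_locale)
    if [ -n "$found" ]; then
        LC_CTYPE=$found
        export LC_CTYPE
    fi
}
```

A script would call use_utf8_ctype once before invoking sed. If no UTF-8 locale is installed, LC_CTYPE is left untouched, so the caller would still need a fallback for that case.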