On Wed, Dec 11, 2013 at 12:46:09AM -0500, Alan Feuerbacher wrote:
>
> LC_ALL=en_US locale charmap -> ISO-8859-1
> LC_ALL=en_US.iso88591 locale charmap -> ISO-8859-1
> LC_ALL=en_US.utf8 locale charmap -> UTF-8
>
> So far as I understand, US English installations work with either of the
> above charmap settings.
>
> Can someone explain the difference?
So long as you use _only_ ASCII characters, or the few extra symbols
and accented letters that ISO-8859-1 offers, ISO-8859-1 works fine.
Once people start using UTF-8 (like in my .sig), things break down.

If you look at ISO-8859-1 on Wikipedia, it will show you the limited
range of glyphs / codepoints it supports. What that page *doesn't*
mention is the encoding. For that, look at the UTF-8 page if you are
interested in the messy details.

The point is that ANY latin-1 (ISO-8859-1) character, including those
with a value greater than 0x7F, is represented by a single byte.
However, when I send you one of those characters above 0x7F in UTF-8,
it will occupy more than one byte. For example, the copyright sign is
U+00A9: in ISO-8859-1 that is the single byte 0xA9, but in UTF-8 it
becomes the two bytes 0xC2 0xA9 [ © ].

> And what I should set in the Samba
> smb.conf file for "unix charset"?
>

If the file names on the Unix side of your Samba shares are encoded in
ISO-8859-1, then I guess you need to use ISO-8859-1. Otherwise, use
UTF-8. (The setting is about names, not about file contents.) Windows
has supported Unicode for a long time, and Samba converts between the
"unix charset" and whatever the client speaks.

ĸen
-- 
das eine Mal als Tragödie, dieses Mal als Farce
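
P.S. For anyone who wants to check the byte counts for themselves, here
is a minimal Python 3 sketch (3.8 or later, for the separator argument
to hex()). It only uses the standard str.encode / bytes.decode calls;
the variable names are mine and purely illustrative. It encodes the
copyright sign both ways and then shows what happens when the bytes are
read with the wrong charset.

```python
# The copyright sign, U+00A9, as in the example above.
ch = "\u00a9"

latin1 = ch.encode("iso-8859-1")  # ISO-8859-1: always one byte per character
utf8 = ch.encode("utf-8")         # UTF-8: two bytes for U+0080 through U+07FF

print(latin1.hex(" "))  # a9
print(utf8.hex(" "))    # c2 a9

# Reading UTF-8 bytes as ISO-8859-1 never raises an error, because every
# byte value is a valid ISO-8859-1 character. You just get two characters
# (U+00C2 and U+00A9) where one was intended.
mangled = utf8.decode("iso-8859-1")
print(ascii(mangled))   # '\xc2\xa9'

# The other direction does fail: a lone 0xA9 byte is not valid UTF-8.
try:
    latin1.decode("utf-8")
except UnicodeDecodeError as err:
    print("not valid UTF-8:", err.reason)
```

The silent mangling in the third step is why mixing the two charsets
tends to show up as stray "Â" characters rather than as an error
message.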