On Wed, Dec 11, 2013 at 12:46:09AM -0500, Alan Feuerbacher wrote:
> 
> LC_ALL=en_US locale charmap             -> ISO-8859-1
> LC_ALL=en_US.iso88591 locale charmap    -> ISO-8859-1
> LC_ALL=en_US.utf8 locale charmap        -> UTF-8
> 
> So far as I understand, US English installations work with either of the
> above charmap settings.
> 
> Can someone explain the difference?

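 So long as you use _only_ ASCII characters, or the few symbols and
accented letters that ISO-8859-1 adds on top of ASCII, ISO-8859-1
works fine.  Once people start sending you UTF-8 (like in my .sig),
things break down: each non-ASCII character arrives as several bytes,
and an ISO-8859-1 locale shows every one of those bytes as a separate
(wrong) character.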

 If you look at ISO-8859-1 on Wikipedia it will show you the limited
range of glyphs / codepoints it supports (256 at most).  What that
page *doesn't* dwell on is the encoding, because it is trivial: the
byte value is the codepoint.  For the UTF-8 side, look at the UTF-8
page if you are interested in the messy details.  The point is that
in ISO-8859-1 EVERY character, including those with a value greater
than 0x7F, is represented by a single byte.

 However, when I send you one of those characters above 0x7F in UTF-8
it will occupy more than one byte.  For example, the copyright sign
[ © ] is codepoint 0x00A9: in ISO-8859-1 it is the single byte 0xA9,
but in UTF-8 it becomes the two bytes 0xC2 0xA9.
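
 If you want to see the byte difference for yourself, here is a small
Python 3 sketch (my own illustration, nothing from the BLFS book).
It encodes the copyright sign both ways, then decodes the UTF-8 bytes
as if they were ISO-8859-1, which is roughly what a latin-1 terminal
does with UTF-8 input.

```python
# Sketch: how U+00A9 (copyright sign) is stored in each charset.
ch = "\u00a9"

latin1_bytes = ch.encode("iso-8859-1")  # one byte: 0xA9
utf8_bytes = ch.encode("utf-8")         # two bytes: 0xC2 0xA9

print(latin1_bytes.hex(), len(latin1_bytes))
print(utf8_bytes.hex(), len(utf8_bytes))

# Reading the UTF-8 bytes as ISO-8859-1 gives two characters,
# U+00C2 (A with circumflex) followed by U+00A9.
mojibake = utf8_bytes.decode("iso-8859-1")
print(len(mojibake), [hex(ord(c)) for c in mojibake])

# Going the other way, a lone 0xA9 byte is not valid UTF-8 at all.
try:
    latin1_bytes.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False
print(lone_byte_is_valid_utf8)
```

 The same thing happens to every character in the 0x80-0xFF range,
which is why mixing the two charsets only looks harmless for as long
as everything is plain ASCII.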

> And what I should set in the Samba
> smb.conf file for "unix charset"?
> 
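 As far as I know, "unix charset" tells Samba how names are encoded
on the unix side, so if the file names on the share are in
ISO-8859-1, then I guess you need to use 8859-1.  Otherwise, use
UTF-8.  Windows has used Unicode for a long time, and Samba converts
between that and the unix charset.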
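
 One rough way to decide is to check whether the names already on the
share are valid UTF-8.  The Python 3 sketch below is my own
illustration, and it assumes (as I understand it) that "unix charset"
is about file names rather than file contents.  The function names
are made up for this example.  Pure ASCII names pass either way, and
a pass is a strong hint rather than proof.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw bytes of a file name decode as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def non_utf8_names(root: bytes):
    """Yield paths under root whose last component is not valid UTF-8.

    root is passed as bytes so that os.walk returns raw byte names
    instead of trying to decode them with the current locale.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)


# Example use (the share path is hypothetical):
#   for path in non_utf8_names(b"/srv/samba/share"):
#       print(path)
```

 If that prints nothing, UTF-8 looks like the safe choice.  If it
prints a lot of names, they are probably in a legacy charset such as
ISO-8859-1, and you either set "unix charset" to match or rename the
files.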

ĸen
-- 
das eine Mal als Tragödie, dieses Mal als Farce