On 12-09-26 20:59:51, Christian Ebert wrote:
> * Ambrevar on Wednesday, September 26, 2012 at 21:43:18 +0200
> > [Using Mutt-1.5.21 with FreeBSD and Arch Linux]
> > 
> > Hi there,
> > 
> > I noticed a nasty behaviour I do not understand. It happens if the e-mail
> > contains accented Latin characters such as 'é', but no Unicode character
> > outside Latin-1. The text editor -- Emacs, but I tested with others too --
> > correctly saves the file as UTF-8.
> > 
> >  file /tmp/mutt-...
> > 
> > will confirm this.
> > 
> > When I close the editor, thus switching back to Mutt, it sees the content as
> > iso-8859-1. This is on the "Mutt Compose" screen, right before actually
> > sending the mail. Of course I can convert it at this very same point with the
> > 'edit-type' function. But I do not want to do it manually. It would be nice
> > if I could send all my mails in UTF-8 automatically.
> > 
> > If I add a Unicode character not covered by Latin-1 (e.g. '€'), then the
> > e-mail is correctly marked as being utf-8.
> > 
> > All my locale variables are set to "en_US.UTF-8". And
> > 
> >  :set &charset ?charset
> > 
> > returns 'utf-8'.
> > 
> > So I guess it comes from Mutt failing to guess the proper character
> > encoding. I do not know how it works internally, and I do not have much
> > time right now to crawl through the source code.
> > 
> > To reproduce the issue: write an e-mail with the sole letter 'é' as content.
> > 
> > Note that the e-mail is not screwed up. It gets properly converted from
> > utf-8 to latin1: 'é' goes from the bytes c3 a9 (utf-8) to e9 (latin1).
> > 
> > Any clue?
> 
> man 5 muttrc:
> 
> send_charset
>       Type: string
>       Default: "us-ascii:iso-8859-1:utf-8"
> 
>       A colon-delimited list of character sets for outgoing messages. Mutt
>       will use the first character set into which the text can be converted
>       exactly. If your $charset is not "iso-8859-1" and recipients may not
>       understand "UTF-8", it is advisable to include in the list an
>       appropriate widely used standard character set (such as "iso-8859-2",
>       "koi8-r" or "iso-2022-jp") either instead of or after "iso-8859-1".
> 
>       In case the text cannot be converted into one of these exactly, mutt
>       uses $charset as a fallback.
> 
> I have:
> set send_charset="us-ascii:iso-8859-1:iso-8859-15:windows-1252:utf-8"
> 
> And when I write €, this email will probably be iso-8859-15.
> 
> It's all about saving some bits over the wire, the old-fashioned way.
> 
> -- 
> theatre - books - texts - movies
> Black Trash Productions at home: http://www.blacktrash.org
> Black Trash Productions on Facebook:
> http://www.facebook.com/blacktrashproductions

Thank you so much for this quick answer! Well, it definitely works!

I wasn't aware of the send_charset variable: I must have missed it in the
not-so-short man page and in the wiki. The thing is, I was looking for
'encoding', not 'charset'. :/
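
For the record, here is how I understand the rule quoted from the man page,
as a minimal Python sketch. It is not Mutt's actual code: the function name
pick_send_charset and its parameters are made up for illustration, and
Python's codecs stand in for whatever conversion Mutt really does. It only
shows the "first charset that can represent the text exactly" idea, with
$charset as the fallback.

```python
def pick_send_charset(text, send_charset="us-ascii:iso-8859-1:utf-8",
                      charset="utf-8"):
    """Return the first name in the colon-delimited send_charset list
    that can encode `text` without loss, else the fallback `charset`.

    Hypothetical helper: it mimics the documented behaviour only.
    """
    for name in send_charset.split(":"):
        try:
            text.encode(name)  # raises if a character has no mapping
        except (UnicodeEncodeError, LookupError):
            # Not representable in this charset, or a charset name
            # Python does not know: try the next candidate.
            continue
        return name
    return charset


if __name__ == "__main__":
    # With the default list, 'é' fits Latin-1, so utf-8 is never reached.
    print(pick_send_charset("é"))
    # '€' is outside Latin-1, so the search falls through to utf-8.
    print(pick_send_charset("€"))
    # Restricting the list to utf-8 always yields utf-8.
    print(pick_send_charset("é", send_charset="utf-8"))
    # The bytes behind the conversion mentioned above.
    print("é".encode("utf-8").hex(), "é".encode("iso-8859-1").hex())
```

That matches what I observed: a lone 'é' is sent as iso-8859-1 with the
default list, while a list containing only utf-8 leaves nothing else to pick.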

Saving bits may be understandable, but in my humble opinion universality and
simplicity are more important. UTF-8 saves us a lot of pain; that is one reason
I'm a UTF-8 advocate, among many others.

Cheers.
