* Ambrevar on Wednesday, September 26, 2012 at 21:43:18 +0200
> [Using Mutt-1.5.21 with FreeBSD and Arch Linux]
>
> Hi there,
>
> I noticed a nasty behaviour I do not understand. It happens if the e-mail
> contains Latin characters with an acute accent, like 'é', but no Unicode
> character not covered by latin1. The text editor -- Emacs, but I tested
> with others too -- will correctly save it as UTF-8.
>
> file /tmp/mutt-...
>
> will confirm this.
>
> When I close the editor, thus switching back to Mutt, it sees the content
> as iso8859-1. This is on the "Mutt Compose" screen, right before actually
> sending the mail. Of course I can convert it at this very same point with
> the 'edit-type' function. But I do not want to do it manually. It would be
> nice if I could send all my mails in UTF-8 automatically.
>
> If I add a Unicode character not covered by latin1 (e.g. '€'), then the
> e-mail is correctly marked as being utf-8.
>
> All my locale variables are set to "en_US.UTF-8". And
>
> :set &charset ?charset
>
> returns 'utf-8'.
>
> So I guess it comes from Mutt failing to guess the proper character
> encoding. I do not know how it works internally, and I do not have much
> time right now to crawl through the source code.
>
> To reproduce the issue: write an e-mail with the sole letter 'é' as content.
>
> Note that the e-mail is not screwed up. It gets properly converted from
> utf-8 to latin1: 'é' goes from c3 a9 (utf-8) to e9 (latin1).
>
> Any clue?

man 5 muttrc:

   send_charset
          Type: string
          Default: "us-ascii:iso-8859-1:utf-8"

          A colon-delimited list of character sets for outgoing
          messages. Mutt will use the first character set into which
          the text can be converted exactly. If your $charset is not
          "iso-8859-1" and recipients may not understand "UTF-8", it
          is advisable to include in the list an appropriate widely
          used standard character set (such as "iso-8859-2", "koi8-r"
          or "iso-2022-jp") either instead of or after "iso-8859-1".

          In case the text cannot be converted into one of these
          exactly, mutt uses $charset as a fallback.

I have:

set send_charset="us-ascii:iso-8859-1:iso-8859-15:windows-1252:utf-8"

And when I write €, this email will probably be iso-8859-15.

It's all about the old-fashioned habit of saving a few bits over the wire.

-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook: http://www.facebook.com/blacktrashproductions
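
For illustration, the selection rule quoted from the man page ("the first character set into which the text can be converted exactly") can be sketched in a few lines of Python. This is only a rough model of the documented behaviour, not Mutt's actual code: the function name `pick_send_charset` and its `fallback` parameter (standing in for $charset) are made up for the example, and Python's codecs are assumed to be close enough to the converter Mutt uses.

```python
def pick_send_charset(text, send_charset="us-ascii:iso-8859-1:utf-8",
                      fallback="utf-8"):
    """Return the first charset in the colon-delimited list that can
    encode `text` without loss; otherwise return `fallback`.

    Hypothetical helper: models the rule described in muttrc(5),
    with `fallback` playing the role of $charset.
    """
    for name in send_charset.split(":"):
        try:
            text.encode(name)          # strict mode: raises if not exact
        except (UnicodeEncodeError, LookupError):
            continue                   # not representable, or unknown codec
        return name
    return fallback


# 'é' fits latin1, which comes before utf-8 in the default list.
print(pick_send_charset("é"))
# '€' is not in latin1, so the default list falls through to utf-8.
print(pick_send_charset("€"))
# With iso-8859-15 listed ahead of utf-8, '€' is sent as iso-8859-15.
print(pick_send_charset(
    "€", "us-ascii:iso-8859-1:iso-8859-15:windows-1252:utf-8"))
# A list containing only utf-8 always yields utf-8.
print(pick_send_charset("é", "utf-8"))
```

Read this way, Mutt is not misdetecting the encoding in the original report: 'é' converts exactly to iso-8859-1, which sits ahead of utf-8 in the default list. Following the same rule, putting `set send_charset="utf-8"` in the muttrc should make every outgoing message UTF-8.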