* Ambrevar on Wednesday, September 26, 2012 at 21:43:18 +0200
> [Using Mutt-1.5.21 with FreeBSD and Arch Linux]
> 
> Hi there,
> 
> I noticed a nasty behaviour I do not understand. It happens if the e-mail
> contains latin characters with acute, like 'é', but no unicode character not
> covered by latin1. The text editor -- Emacs, but I tested with others too --
> will correctly set it to UTF-8.
> 
>  file /tmp/mutt-...
> 
> will confirm this.
> 
> When I close the editor, thus switching back to Mutt, it sees the content as
> iso8859-1. This is on the "Mutt Compose" screen, right before actually sending
> the mail.  Of course I can convert it at this very same point with the
> 'edit-type' function. But I do not want to do it manually. It would be nice if I
> could send all my mails in UTF-8 automatically.
> 
> If I add a unicode character not covered by Latin1 (e.g. '€'), then the e-mail
> is correctly marked as being utf-8.
> 
> All my locale variables are set to "en_US.UTF-8". And
> 
>  :set &charset ?charset
> 
> returns 'utf-8'.
> 
> So I guess it comes from Mutt that fails at guessing the proper character
> encoding. I do not know how it works internally, but I do not have so much time
> right now to crawl in the source code.
> 
> To reproduce the issue: write an e-mail with the sole letter 'é' as content.
> 
> Note that the e-mail is not screwed up. It gets properly converted from utf-8 to
> latin1, 'é' goes from c3 a9 (utf-8) to e9 (latin1).
> 
> Any clue?

man 5 muttrc:

send_charset
      Type: string
      Default: "us-ascii:iso-8859-1:utf-8"

      A colon-delimited list of character sets for outgoing messages. Mutt will
      use the first character set into which the text can be converted exactly.
      If your $charset is not "iso-8859-1" and recipients may not understand
      "UTF-8", it is advisable to include in the list an appropriate widely
      used standard character set (such as "iso-8859-2", "koi8-r" or
      "iso-2022-jp") either instead of or after "iso-8859-1".

      In case the text cannot be converted into one of these exactly, mutt uses
      $charset as a fallback.
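
To make the rule above concrete, here is a rough Python sketch of "use the
first character set into which the text can be converted exactly". It is only
an illustration, not Mutt's actual code: the function name pick_send_charset
and its fallback parameter (standing in for $charset) are made up for this
example, and Python's codecs may differ in detail from what Mutt uses.

```python
def pick_send_charset(text, send_charset="us-ascii:iso-8859-1:utf-8",
                      fallback="utf-8"):
    """Return the first charset in the colon-delimited list that can
    represent `text` exactly; otherwise return `fallback`.

    Hypothetical helper: `fallback` plays the role of $charset here.
    """
    for name in send_charset.split(":"):
        try:
            # encode() raises if any character has no exact representation
            text.encode(name)
        except (UnicodeEncodeError, LookupError):
            # not representable, or charset unknown to Python: try the next
            continue
        return name
    return fallback


if __name__ == "__main__":
    for sample in ("hello", "é", "€"):
        print(repr(sample), "->", pick_send_charset(sample))
```

With the default list, a lone 'é' fits into iso-8859-1, which comes before
utf-8, so that is what gets picked. A list without iso-8859-1, such as
"us-ascii:utf-8", would make the same text go out as utf-8.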

I have:
set send_charset="us-ascii:iso-8859-1:iso-8859-15:windows-1252:utf-8"

And when I write €, this email will probably be iso-8859-15.

It's all about saving a few bits over the wire, the old-fashioned way.

-- 
theatre - books - texts - movies
Black Trash Productions at home: http://www.blacktrash.org
Black Trash Productions on Facebook:
http://www.facebook.com/blacktrashproductions
