On Sat, Jun 26, 2010 at 12:27:28PM +0300, Alexander Gattin
wrote:
>
> On Fri, Jun 25, 2010 at 11:31:23PM -0700, George
> Davidovich wrote:
> > 32 Content-Type: text/plain; charset=iso-8859-1
> > 33 Content-Transfer-Encoding: quoted-printable
> > 34
> > 35 George=A0=A0-=A0=A0 ...
>
> > As I understand it, "A0" represents the
> > non-breaking space character.
>
> In iso-8859-1? Maybe. If I create this file:
> $ echo -e 'a\xa0b' > /tmp/nbsp
> and then
> $ vim /tmp/nbsp
> vim shows:
> a b
> and
> "/tmp/nbsp" [converted] 1L, 5C
> in its status line at the bottom.
> This means that vim converted text to utf8 (my
> locale) and "nbsp" character now takes 2 bytes (in
> utf8). And the 5th byte is "\n" BTW.
Correction noted. My terminal (for better or worse)
doesn't support UTF-8, and that's increasingly for the
worse.
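
As a sanity check on the byte count above, here is a small
sketch in Python (my choice for illustration; nothing in this
thread used it). It assumes the file holds exactly what the
echo command writes, i.e. the bytes "a", 0xA0, "b" and a
newline, and that vim re-encoded it from latin1 to UTF-8:

```python
# Bytes written by: echo -e 'a\xa0b' > /tmp/nbsp
raw = b"a\xa0b\n"

# Byte 0xA0 is NO-BREAK SPACE (U+00A0) in ISO-8859-1.
text = raw.decode("latin-1")

# Re-encode as UTF-8, which is what vim's "[converted]" implies here.
utf8 = text.encode("utf-8")

print(len(raw), "bytes on disk as latin1")
print(len(utf8), "bytes once held as UTF-8")
print(utf8)
```

The nbsp becomes the two bytes 0xC2 0xA0 in UTF-8, which
would account for the extra character in vim's status line.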
> > Mutt displays the message correctly, but in vim,
> > the character appears as a pipe symbol.
>
> Please verify that vim can or cannot correctly
> handle \xa0 character using the abovementioned
> method.
Yes, Vim has no problems.
> > And, as you can tell, there's a whole lot of them.
>
> If vim itself works OK but e-mails from mutt still
> show a lot of pipes, then, well, mutt really feeds
> vim with pipes.
Yes, mutt is feeding vim with pipes, and the entire email
appears in vim as a single line.

Can't complain, really, as Yahoo's email software is
presenting messages from me with a paperclip icon! Still,
I'd like to know whether this is a case of "yet another
badly formatted email" or a shortcoming in mutt. I suspect
it's the former, as running the message through
MIME::QuotedPrint doesn't strip the nbsp characters.
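
That result is what one would expect: quoted-printable
decoding only undoes the transfer encoding, so "=A0" comes
back as the byte 0xA0 rather than being dropped. A rough
sketch of the same check in Python, using the standard
quopri module in place of the Perl module (the sample line
is a shortened stand-in for the message body, and the final
replace step is just one possible cleanup, not anything mutt
does):

```python
import quopri

# Shortened stand-in for the quoted-printable body line.
encoded = b"George=A0=A0-=A0=A0"

# Decoding only reverses the transfer encoding: =A0 -> byte 0xA0.
decoded = quopri.decodestring(encoded)

# Interpreted per the declared charset (iso-8859-1), 0xA0 is U+00A0.
text = decoded.decode("latin-1")
nbsp_count = text.count("\u00a0")

# Optional cleanup: swap each nbsp for an ordinary space.
cleaned = text.replace("\u00a0", " ")

print(decoded)
print(nbsp_count, "nbsp characters survive decoding")
print(repr(cleaned))
```

So the nbsp characters appear to be part of the message
content itself, which any cleanup has to handle after
decoding.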
> > 2. As a workaround, how do I search/replace
> > non-printable characters
> > in vim?
>
> If you want to perform a substitution
> automatically when any file of type "mail" is
> opened by vim, then the following snippet in your
> ~/.vimrc will help:
>
> if has("autocmd")
> " Replace all iso8859-1 nbsp chars with "_"
> autocmd BufReadPost *
> \ if &filetype == "mail" && &fileencoding == "latin1" |
> \ %s/\%xA0/_/g |
> \ endif
> endif
Interesting. I've always relied on single 'autocmd
Bufenter FileType mail' lines, but using BufReadPost with
if statements makes much more sense.

Using 's/\%xA0/' works for known characters, but a better
solution for interactive use that I've since discovered
(one that doesn't require external utilities, memorisation,
or reference documentation lookups) is:

  # yank unknown character into unnamed buffer
  :<C-r>"

  # position cursor at unknown character
  :<C-r><C-a>

Maybe the above will help other mutt users in similar
situations.

Thanks for the help, Alexander.

--
George