Tobias Herp wrote:
Hi, fellow vimmers,

I've been struggling for quite a while now to get the character encoding right; I'd 
like vim to guess right, or at least to know which magical comment I could use 
to force vim to use the correct encoding settings. This is an everyday problem 
to me, since I work on Windows (different encoding conventions for GUI and 
shell programs!) as well as several Linux machines which are slightly 
differently configured.

Via our web-based bugtracker, I created an example file (attached) which 
contains German umlauts and their JavaScript and HTML encodings and should look 
like this:

<snip>
ä       %E4     &auml; (auml)
ö       %F6     &ouml; (ouml)
ü       %FC     &uuml; (uuml)

Ä       %C4     &Auml; (Auml)
Ö       %D6     &Ouml; (Ouml)
Ü       %DC     &Uuml; (Uuml)

ß       %DF     &szlig; (szlig)
</snip>

(In case the webmail interface scrambles the HTML entities, I repeated 
them in the 4th column without the & and ;)

The umlauts are displayed correctly when I open the file with WinXP's notepad 
(which in turn doesn't like the *IX line endings), but vim doesn't get them 
right (Bram's Vim 7.0 on a German WinXP Professional, +multi_byte_ime/dyn).

Is there something I can do to make vim guess right, at the very least for this 
document?

Thanks a lot in advance!

After saving the attachment and loading it in gvim, I see it all right. I am using:

VIM - Vi IMproved 7.0 (2006 May 7, compiled Jul 23 2006 22:50:51)
Included patches: 1-42
Compiled by [EMAIL PROTECTED]
Huge version with GTK2-GNOME GUI.  Features included (+) or not (-):
[etc.]

'encoding' is set to utf-8 and the file opening heuristic also sets 'fileencoding' to utf-8 without BOM. This is weird since the attachment header says

        Content-Type: text/plain; charset="iso-8859-1"

I wonder if Thunderbird converted it to UTF-8 or what.
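
In case it helps to tell the two encodings apart at the byte level, here is a minimal Python sketch (nothing Vim itself runs; the name byte_table is made up for illustration) that lists the hex bytes of each character under Latin1 and under UTF-8. The Latin1 values are the ones in the %XX column of the example file.

```python
def byte_table(chars):
    """Return (character, Latin1 hex, UTF-8 hex) for each character."""
    rows = []
    for ch in chars:
        latin1_hex = ch.encode("latin-1").hex().upper()
        utf8_hex = ch.encode("utf-8").hex().upper()
        rows.append((ch, latin1_hex, utf8_hex))
    return rows
```

For example, byte_table("ä") gives [("ä", "E4", "C3A4")]. So if a hex dump of the saved attachment shows the single byte E4 for ä, the file is still Latin1; if it shows C3 A4, something converted it to UTF-8 on the way.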



What does your Vim say on this file in reply to

        :verbose set enc? fenc? fencs?

?


Notes:

-- To set 'fileencoding' to something other than what Vim would normally detect, use the ++enc option to :edit; see ":help ++opt".

-- To force recognition of a file as Unicode (e.g., UTF-8), use ":setlocal bomb" on it; then check that 'fileencoding' is setlocal'ed to some Unicode encoding (such as utf-8) and save.

-- To force recognition of a file as Latin1 rather than UTF-8 (assuming 'fileencodings' [plural] is set to "ucs-bom,utf-8,latin1"), put a number of upper-ASCII bytes (bytes >127) near the beginning, maybe in a comment. If the file is a text file, you can also use them as "weird underlining" (e.g. underline your main title with a row of ££££ (pounds sterling) or of Danish ØØØØ (slashed O's)); then ":setlocal fenc=latin1" and save. The following works well in one of my text files:

-----------------------------------------
# zim: set fenc=latin1 nomod : £££££µµµµµ
# "zim" (not "vim") above is intentional
-----------------------------------------
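
To illustrate why such a row of upper-ASCII bytes tips the balance, here is a rough Python sketch of the "try each candidate in order" idea behind 'fileencodings'. It is an approximation under stated assumptions, not Vim's actual code: the function name detect is hypothetical, and the "ucs-bom" step only checks for a UTF-8 BOM, whereas Vim also recognizes other Unicode BOMs.

```python
import codecs


def detect(data, candidates=("ucs-bom", "utf-8", "latin1")):
    """Return the first candidate that fits the bytes (simplified sketch)."""
    for name in candidates:
        if name == "ucs-bom":
            # Simplification: only the UTF-8 BOM is checked here.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8 (with BOM)"
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # invalid for this encoding, try the next one
        return name
    return candidates[-1]


if __name__ == "__main__":
    # A title underlined with Latin1 pound signs (byte 0xA3).
    print(detect(b"Title\n\xa3\xa3\xa3\xa3\n"))
    # Pure ASCII is also valid UTF-8, so the utf-8 candidate wins.
    print(detect(b"plain ascii text\n"))
```

A run of 0xA3 bytes is not valid UTF-8, so the utf-8 candidate is rejected and latin1 is chosen. A file containing only ASCII would pass as utf-8, which is why the extra bytes near the top are needed.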



Best regards,
Tony.
