On 23.05.2005 at 16:13, Mads Jensen wrote:
æøå gets turned into something like Â¥...
What you see is the 'translation' of some ISO Latin encoding into UTF-8,
with the resulting two-byte sequences then displayed as single-byte
characters!
This may explain it a bit (an excerpt from ISO Latin-4):
;     oct   dec   hex   UCS2     UTF-8
;==========================================
  = 240 = 160 = A0 = U+00A0 = C2 A0 : NO-BREAK SPACE
Ą = 241 = 161 = A1 = U+0104 = C4 84 : LATIN CAPITAL LETTER A WITH OGONEK
ĸ = 242 = 162 = A2 = U+0138 = C4 B8 : LATIN SMALL LETTER KRA
Ŗ = 243 = 163 = A3 = U+0156 = C5 96 : LATIN CAPITAL LETTER R WITH CEDILLA
¤ = 244 = 164 = A4 = U+00A4 = C2 A4 : CURRENCY SIGN
Ĩ = 245 = 165 = A5 = U+0128 = C4 A8 : LATIN CAPITAL LETTER I WITH TILDE
Ļ = 246 = 166 = A6 = U+013B = C4 BB : LATIN CAPITAL LETTER L WITH CEDILLA
§ = 247 = 167 = A7 = U+00A7 = C2 A7 : SECTION SIGN
¨ = 250 = 168 = A8 = U+00A8 = C2 A8 : DIAERESIS
Š = 251 = 169 = A9 = U+0160 = C5 A0 : LATIN CAPITAL LETTER S WITH CARON
Ē = 252 = 170 = AA = U+0112 = C4 92 : LATIN CAPITAL LETTER E WITH MACRON
Ģ = 253 = 171 = AB = U+0122 = C4 A2 : LATIN CAPITAL LETTER G WITH CEDILLA
Ŧ = 254 = 172 = AC = U+0166 = C5 A6 : LATIN CAPITAL LETTER T WITH STROKE
­ = 255 = 173 = AD = U+00AD = C2 AD : SOFT HYPHEN
Ž = 256 = 174 = AE = U+017D = C5 BD : LATIN CAPITAL LETTER Z WITH CARON
Á = 301 = 193 = C1 = U+00C1 = C3 81 : LATIN CAPITAL LETTER A WITH ACUTE
Â = 302 = 194 = C2 = U+00C2 = C3 82 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ã = 303 = 195 = C3 = U+00C3 = C3 83 : LATIN CAPITAL LETTER A WITH TILDE
Ä = 304 = 196 = C4 = U+00C4 = C3 84 : LATIN CAPITAL LETTER A WITH DIAERESIS
Å = 305 = 197 = C5 = U+00C5 = C3 85 : LATIN CAPITAL LETTER A WITH RING ABOVE
Æ = 306 = 198 = C6 = U+00C6 = C3 86 : LATIN CAPITAL LETTER AE
æ = 346 = 230 = E6 = U+00E6 = C3 A6 : LATIN SMALL LETTER AE
The first column contains the glyphs as they are; the next three columns
give the glyph's byte value as an octal, decimal, and hexadecimal
numeral. The next column, UCS2, shows the glyph's slot number (code
point) in Unicode (which, I think, is also the internal representation
in GNU Emacs). The last column shows the bytes into which the glyphs
from column 1 are translated as UTF-8. As you can see, you can 'see' the
UTF-8 bytes as 'normal' characters: a UTF-8 encoded æ (C3 A6) is just
'ÃĻ' if displayed in ISO Latin-4, 'Ã¦' in ISO Latin-1 ...
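To make the table concrete, here is a minimal sketch in Python (not Emacs Lisp, simply because it is easy to run outside Emacs). The helper names table_row and misread are made up for this illustration; the codec names iso8859_4 and iso8859_1 are Python's names for ISO Latin-4 and ISO Latin-1.

```python
def table_row(byte_value: int, charset: str = "iso8859_4") -> str:
    """Rebuild one row of the table for a single byte of `charset`."""
    glyph = bytes([byte_value]).decode(charset)
    # UTF-8 bytes of the same glyph, as space-separated hex pairs
    utf8_hex = " ".join(f"{b:02X}" for b in glyph.encode("utf-8"))
    return (f"{glyph} = {byte_value:o} = {byte_value} = {byte_value:02X}"
            f" = U+{ord(glyph):04X} = {utf8_hex}")


def misread(text: str, wrong_charset: str) -> str:
    """Save `text` as UTF-8, then display the bytes in `wrong_charset`."""
    return text.encode("utf-8").decode(wrong_charset)


print(table_row(0xA6))            # the row for byte A6 in ISO Latin-4
print(misread("æ", "iso8859_4"))  # UTF-8 æ shown as ISO Latin-4
print(misread("æ", "iso8859_1"))  # UTF-8 æ shown as ISO Latin-1
```

Each UTF-8 byte is treated as a character of its own, which is why one letter turns into two.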
So, to conclude: your Emacs obviously saves your input as UTF-8, and
you have to make the buffer display in UTF-8 too! The correct headers
would look like
;;; -*- mode: Text; coding: utf-8; -*-
Once you have the file opened in the wrong encoding you can change that
with revert-buffer-with-coding-system, C-x RET r utf-8 RET.
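Rereading the same bytes with the right coding system can be imitated outside Emacs as well. The following Python sketch only illustrates the idea and is not what Emacs does internally; the function name redecode is hypothetical.

```python
def redecode(garbled: str, wrong_charset: str,
             right_charset: str = "utf-8") -> str:
    """Turn wrongly displayed text back into bytes, then decode them properly."""
    # Encoding with the wrong charset recovers the original bytes,
    # provided every garbled character exists in that charset.
    raw = garbled.encode(wrong_charset)
    return raw.decode(right_charset)


print(redecode("Ã¦Ã¸Ã¥", "iso8859_1"))  # UTF-8 text that was shown as Latin-1
```

This only works while the bytes are still intact; once the garbled text has been saved again in yet another encoding, the original bytes may be lost.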
Have you thought of
(prefer-coding-system 'utf-8-unix)
It might cure a lot. There is also (set-language-environment
'Danish) ...
--
With peaceful greetings
Pete
In a world without walls and fences, who needs gates and windows?
_______________________________________________
Help-gnu-emacs mailing list
Help-gnu-emacs@gnu.org
http://lists.gnu.org/mailman/listinfo/help-gnu-emacs