Re: desktop and encodings

Peter Dyballa Mon, 23 May 2005 11:04:57 -0700


On 23.05.2005 at 16:13, Mads Jensen wrote:

æøå gets turned into something like Â¥...

What you see is the 'translation' of some ISO Latin encoding into UTF-8, with the resulting double-byte values then displayed as single bytes!


This could explain a bit:

;   oct   dec   hex    UCS2    UTF-8
;=====================================
  = 240 = 160 = A0 = U+00A0 = C2 A0 : NO-BREAK SPACE
Ą = 241 = 161 = A1 = U+0104 = C4 84 : LATIN CAPITAL LETTER A WITH OGONEK
ĸ = 242 = 162 = A2 = U+0138 = C4 B8 : LATIN SMALL LETTER KRA
Ŗ = 243 = 163 = A3 = U+0156 = C5 96 : LATIN CAPITAL LETTER R WITH CEDILLA
¤ = 244 = 164 = A4 = U+00A4 = C2 A4 : CURRENCY SIGN
Ĩ = 245 = 165 = A5 = U+0128 = C4 A8 : LATIN CAPITAL LETTER I WITH TILDE
Ļ = 246 = 166 = A6 = U+013B = C4 BB : LATIN CAPITAL LETTER L WITH CEDILLA
§ = 247 = 167 = A7 = U+00A7 = C2 A7 : SECTION SIGN
¨ = 250 = 168 = A8 = U+00A8 = C2 A8 : DIAERESIS
Š = 251 = 169 = A9 = U+0160 = C5 A0 : LATIN CAPITAL LETTER S WITH CARON
Ē = 252 = 170 = AA = U+0112 = C4 92 : LATIN CAPITAL LETTER E WITH MACRON
Ģ = 253 = 171 = AB = U+0122 = C4 A2 : LATIN CAPITAL LETTER G WITH CEDILLA
Ŧ = 254 = 172 = AC = U+0166 = C5 A6 : LATIN CAPITAL LETTER T WITH STROKE
­ = 255 = 173 = AD = U+00AD = C2 AD : SOFT HYPHEN
Ž = 256 = 174 = AE = U+017D = C5 BD : LATIN CAPITAL LETTER Z WITH CARON
Á = 301 = 193 = C1 = U+00C1 = C3 81 : LATIN CAPITAL LETTER A WITH ACUTE
Â = 302 = 194 = C2 = U+00C2 = C3 82 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ã = 303 = 195 = C3 = U+00C3 = C3 83 : LATIN CAPITAL LETTER A WITH TILDE
Ä = 304 = 196 = C4 = U+00C4 = C3 84 : LATIN CAPITAL LETTER A WITH DIAERESIS
Å = 305 = 197 = C5 = U+00C5 = C3 85 : LATIN CAPITAL LETTER A WITH RING ABOVE
Æ = 306 = 198 = C6 = U+00C6 = C3 86 : LATIN CAPITAL LETTER AE
æ = 346 = 230 = E6 = U+00E6 = C3 A6 : LATIN SMALL LETTER AE

The first column contains the glyphs as they are; the next columns give the glyph's byte value in ISO Latin-4 as octal, decimal, and hexadecimal numerals. The next column, UCS2, shows the code point of that glyph in Unicode (which, I think, is also the internal representation in GNU Emacs). The last column shows the bytes into which the glyphs from column 1 are translated in UTF-8.

As you can see, the UTF-8 bytes can themselves be 'seen' as 'normal' characters: a UTF-8 encoded æ (bytes C3 A6) is just 'ÃĻ' if displayed in ISO Latin-4, 'Ã¦' in ISO Latin-1 ...
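
If you want to try this yourself, here is a small sketch you can evaluate in *scratch* (C-j) or with M-:. It assumes your Emacs knows the coding systems utf-8, iso-8859-1 and iso-8859-4 under those names; the helper name my-show-misdecoded is made up for this example and is not part of Emacs.

        ;; Hypothetical helper: encode STRING as UTF-8, then decode the
        ;; resulting bytes as if they belonged to the coding system WRONG.
        (defun my-show-misdecoded (string wrong)
          (decode-coding-string (encode-coding-string string 'utf-8) wrong))

        ;; The UTF-8 bytes of æ read as ISO Latin-4, then as ISO Latin-1:
        (my-show-misdecoded "æ" 'iso-8859-4)
        (my-show-misdecoded "æ" 'iso-8859-1)

        ;; The UTF-8 bytes themselves, printed as hex numerals:
        (mapconcat (lambda (byte) (format "%02X" byte))
                   (encode-coding-string "æ" 'utf-8) " ")

The results can be compared with the glyph column and the UTF-8 column of the table.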

So, to conclude: your Emacs obviously saves your input as UTF-8, and you have to make the buffer display in UTF-8 too! The correct header line would look like


        ;;; -*- mode: Text; coding: utf-8; -*-

Once you have the file opened in the wrong encoding, you can change that with revert-buffer-with-coding-system, C-x RET r utf-8 RET.
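
If you need this often, the same command can be wrapped in a small command of your own. This is only a sketch: the name my-reread-as-utf-8 is made up, and it assumes revert-buffer-with-coding-system accepts the coding system as its first argument when called from Lisp.

        ;; Hypothetical convenience command, not part of Emacs.
        (defun my-reread-as-utf-8 ()
          "Re-read the visited file, decoding its bytes as UTF-8."
          (interactive)
          (revert-buffer-with-coding-system 'utf-8))

Nothing on disk is changed by this; the file's bytes are only read again and interpreted differently.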


Have you thought of

(prefer-coding-system     'utf-8-unix)

It could be that this cures a lot. There is also (set-language-environment 'Danish) ...
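
In an init file this might look like the following sketch. The location ~/.emacs is an assumption about your setup; the only things used are prefer-coding-system and the standard variable buffer-file-coding-system.

        ;; ~/.emacs (assumed location of your init file)
        ;; Put UTF-8 with Unix line endings first when Emacs guesses a
        ;; file's encoding, and make it the default for new files.
        (prefer-coding-system 'utf-8-unix)

        ;; To see which coding system Emacs picked for the current
        ;; buffer, evaluate this variable with M-: after visiting a file:
        ;;   buffer-file-coding-system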



--
With peaceful greetings

  Pete

In a world without walls and fences, who needs gates and windows?



_______________________________________________
Help-gnu-emacs mailing list
http://lists.gnu.org/mailman/listinfo/help-gnu-emacs
