Re: problems with german umlauts

yota moteuchi Thu, 25 Jan 2007 13:37:38 -0800

Well, since charset issues are my hobby... I can try to write a short
explanation (my limited English vocabulary could be an issue).


First a 128-character table was defined (codes 0 - 127: 7 bits, stored in an
8-bit byte with the top bit set to zero), covering only the English letters,
the digits and some signs: the ASCII table. After that, many extended tables
were defined that use the 128 - 255 range to store "regional" characters.
There are around 15 of these extended tables, each fitting the specifics of
one group: the Western European languages (French, Danish, ...), Greek,
Cyrillic and so on (but not all at the same time):
http://en.wikipedia.org/wiki/Category:ISO_8859
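
To make the "not all at the same time" point concrete, here is a small Python
sketch (an illustration only, assuming Python 3 and the ISO 8859 codecs that
ship with its standard library). It decodes the single byte 252 with three of
the extended tables and shows that plain ASCII has no such character at all.

```python
import unicodedata

# One byte, value 252 (0xFC), outside the 0 - 127 ASCII range.
raw = bytes([252])

# The same byte is a different character in each extended table.
tables = ("iso-8859-1", "iso-8859-5", "iso-8859-7")
decoded = {name: raw.decode(name) for name in tables}
for name, char in decoded.items():
    print(f"byte 252 in {name}: U+{ord(char):04X} {unicodedata.name(char)}")

# Plain ASCII stops at 127, so decoding byte 252 as ASCII fails.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print("byte 252 is valid ASCII:", ascii_ok)
```

Only the first table (ISO 8859-1, also called Latin1) gives the u with an
umlaut; the other two give a Cyrillic and a Greek letter.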

The Japanese and the Chinese had their own unusual multi-byte ways of storing
their 36 000 or so ideograms, and that is where it started to become a mess.

The Unicode consortium defined a HUGE table that aims to give a number to
every character (well, almost, but that's another problem); each number fits
in 32 bits.
But the devilish ASCII was still there, hidden in the dark. So the UTF-8
encoding was designed, which is just an amazing trick for storing the
Unicode table:
- all the characters of the former 0 - 127 range of the ASCII table are
stored in a single byte with the same value... so a pure ASCII file is also
a genuine UTF-8 file ^^
- to store the other characters, UTF-8 uses an ingeniously designed system of
drawers: a sequence of two to four bytes, all taken from the 128 - 255 range
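
The two rules above can be checked with a few lines of Python (a minimal
sketch, assuming Python 3; the helper name utf8_bits and the sample
characters are arbitrary choices for illustration). It prints each
character's UTF-8 bytes in binary so the "drawers" become visible: a first
byte that announces the length, followed by bytes that all start with the
bits 10.

```python
def utf8_bits(char):
    """Return the UTF-8 bytes of one character as binary strings."""
    return [format(byte, "08b") for byte in char.encode("utf-8")]

# 'u' is ASCII; U+00FC is u with umlaut; U+20AC is the euro sign;
# U+1D11E is the musical G clef, which lies beyond the 16-bit range.
for char in ("u", "\u00fc", "\u20ac", "\U0001d11e"):
    print(f"U+{ord(char):04X}", " ".join(utf8_bits(char)))

# A pure ASCII text gives exactly the same bytes in both encodings.
text = "plain ascii"
same_bytes = text.encode("ascii") == text.encode("utf-8")
print("ASCII bytes == UTF-8 bytes:", same_bytes)
```

The u with umlaut comes out as the two bytes 11000011 10111100 (195 and 188),
which is why a UTF-8 file read as Latin1 shows two strange characters instead
of one accented letter.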

UTF-8 lets you write in, say, French AND Greek in the same text, which no
single extended table can do...
and it is fully compatible with ASCII files...
nice, isn't it?
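
A quick way to see the mixing problem, sketched in Python 3 (the sample
string and the helper can_encode are made up for this illustration, with
French and Greek as the example pair): the same short text is tried against
ASCII, two of the extended tables, and UTF-8.

```python
# French "café" followed by Greek "καφές", written with escapes so the
# source file itself stays pure ASCII.
mixed = "caf\u00e9 \u03ba\u03b1\u03c6\u03ad\u03c2"

def can_encode(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

encodings = ("ascii", "iso-8859-1", "iso-8859-7", "utf-8")
results = {enc: can_encode(mixed, enc) for enc in encodings}
for enc, ok in results.items():
    print(f"{enc}: {'ok' if ok else 'cannot encode'}")
```

Only UTF-8 accepts the whole string: the Western European table lacks the
Greek letters and the Greek table lacks the accented e.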

Yota

Hope it's clear... hips
I could explain this more easily in French, with a whiteboard and a cup of
coffee.

On 1/25/07, Mats Bengtsson <[EMAIL PROTECTED]> wrote:


You are mistaken. ASCII only defines character codes up to 127, see for
example http://www.asciitable.com/.
What your table shows is probably Latin1 (ISO 8859-1).

   /Mats

Quoting Jonathan Henkelman <[EMAIL PROTECTED]>:

> Mats Bengtsson <mats.bengtsson <at> ee.kth.se> writes:
>
>
>
>> If you search the mailing list archives from the time before we
>> introduced unicode support, you will be surprised how many questions
>> there are related to Russian or Hebrew or Mandarin or ...
>>
>>    /Mats
>
> It wasn't intended to be a stupid question. I'm all over unicode for
> languages that use other character sets - cyrillic, hebrew, asian etc.
> I was just surprised at how difficult it was to put an umlaut on a u
> for a german piece I was typesetting.
>
> Perhaps the problem lies in the documentation.  It suggests that if you
> want to use "non-ascii" characters you have to save the document as
> unicode - fair enough. (In fact it implies you can use any 8-bit ascii,
> pg. 112, last paragraph, PDF version 2.10.0)  But I wanted to use ascii
> 252 (presumably similar to David in the original post) and I just
> inserted it into my document - and it compiled to a space.  Here I am
> trying to use an ascii character and hence expect not to have to do
> anything special, but would I still have to save it as unicode?  When I
> used \char, I had to find the tweak to get rid of the spaces before and
> after that character...
>
>> Because most accented European characters can not be accessed within
>> ascii
>
> My ascii table shows all French, Norwegian, Danish characters as well as
> most spanish, and german (can't profess to be an expert there) see
> characters 191-255 (xBF - xff).  Are these accessible in a non-unicode
> document?
>
> Thanks,
> J
>
>
>
>
>
> _______________________________________________
> lilypond-user mailing list
> lilypond-user@gnu.org
> http://lists.gnu.org/mailman/listinfo/lilypond-user
>




