On Fri May 19 19:45:43 CDT 2006, [EMAIL PROTECTED] wrote:
>   There's no such thing as an accented letter in a Russian language.
>   That was the exact point of my initial remark.

the text was /romanized/ russian names.  it was not written in the cyrillic 
alphabet.

> 
>   Now, if you allow me to educate myself in Unicode a little bit,
>   I'm about to follow through with your example. Be patient with me ;-)

as long as you're patient with me.

> 
> > suppose that U+x is the cp for the letter.  
> > suppose U+y is the cp for the accent.
> 
>   Ok.
> 
> > suppose that we're lucky and there exists U+w ≡ U+xU+y.  
> 
>   Just to make sure I still follow: U+w is supposed to *visually* 
>   look like U+x followed by U+y, right ?

yes.  they must be the same.

> 
> > then U+w should be the same glyph as U+xU+y.
> 
>   The same glyph from a visual standpoint, right ?

a glyph IS the visual representation. 

> 
> > canonical composition would yield
> >     compose(U+xU+y)         U+w
> >     compose(U+w)            U+w
> > while canonical decomposition would yield
> >     decompose(U+xU+y)       U+xU+y
> >     decompose(U+w)          U+xU+y
> 
>   And that's exactly the place where I think Unicode goes against common
>   sense and language rules. I would expect it to mandate that a *decomposable*
>   character is supposed to be used over the decomposition. Which in your
>   original example was the case.

rob agrees with you.

however, there is a big advantage to a composed character -- you don't have to
figure out how to stick the horn, breve, slash, &c on top of, under, on the
shoulder of, through, &c the original character.  in plan 9, characters are
bitmaps making this operation extra annoying.  also, there are no rules in
unicode preventing /arbitrary/ compositions.  this is valid unicode
        u+0069 u+0300 u+0301 u+0302 u+0303
all those combining codepoints attach to the base cp u+0069.  figure out how to
build that glyph.
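
for anyone who wants to poke at this, here is a rough sketch using python's
standard unicodedata module.  the concrete codepoints for U+x, U+y and U+w
(e, combining acute, é) are my own stand-ins, not from the earlier example,
and compose()/decompose() are just hypothetical wrappers around NFC and NFD
normalization.

```python
import unicodedata

# assumption: stand-ins for the abstract U+x, U+y, U+w used above
x = "\u0065"   # LATIN SMALL LETTER E        (the letter)
y = "\u0301"   # COMBINING ACUTE ACCENT      (the accent)
w = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (precomposed)

def compose(s):
    """canonical composition (normalization form NFC)."""
    return unicodedata.normalize("NFC", s)

def decompose(s):
    """canonical decomposition (normalization form NFD)."""
    return unicodedata.normalize("NFD", s)

def cps(s):
    """render a string as a list of u+xxxx codepoints."""
    return " ".join("u+%04x" % ord(c) for c in s)

# the four cases from the table
print("compose(x y)   ", cps(compose(x + y)))
print("compose(w)     ", cps(compose(w)))
print("decompose(x y) ", cps(decompose(x + y)))
print("decompose(w)   ", cps(decompose(w)))

# the arbitrary stack: i + grave + acute + circumflex + tilde
seq = "\u0069\u0300\u0301\u0302\u0303"
composed = compose(seq)
print("stack          ", cps(seq))
print("compose(stack) ", cps(composed))
```

only the first mark has a precomposed partner here, so composition folds
u+0069 u+0300 into one codepoint and leaves the other three marks hanging
off it.  the renderer still has to build the glyph.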

> 
>   "There are no accents in Russian language" (*)
>

now you're confusing language and alphabet! ☺

- erik
