On Fri, May 19, 2006 at 07:13:33PM -0500, [EMAIL PROTECTED] wrote:
> On Fri May 19 19:13:39 CDT 2006, [EMAIL PROTECTED] wrote:
> >   Since I'm no expert
> >   in Unicode I'm quite curious to know how one is supposed to
> >   tell the difference between a real character and a combination
> >   of a diacritic and some other character when they are visually
> >   indistinguishable?
> 
> say i have a random accented letter.  

  There's no such thing as an accented letter in the Russian language.
  That was the exact point of my initial remark.

  Now, if you allow me to educate myself in Unicode a little bit,
  I'm about to follow through with your example. Be patient with me ;-)

> suppose that U+x is the cp for the letter.  
> suppose U+y is the cp for the accent.

  Ok.

> suppose that we're lucky and there exists U+w ≡ U+xU+y.

  Just to make sure I still follow: U+w is supposed to *visually* 
  look like U+x followed by U+y, right ?

> then U+w should be the same glyph as U+xU+y.

  The same glyph from a visual standpoint, right ?

> canonical composition would yield
>       compose(U+xU+y)         U+w
>       compose(U+w)            U+w
> while canonical decomposition would yield
>       decompose(U+xU+y)       U+xU+y
>       decompose(U+w)          U+xU+y

  And that's exactly the place where I think Unicode goes against common
  sense and language rules. I would expect it to mandate that a *decomposable*
  character be used in preference to its decomposition, which was the case
  in your original example.
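
  The compose/decompose table quoted above can be tried out directly. The
  following is only a minimal sketch, assuming Python's standard
  unicodedata module and taking NFC and NFD as the canonical composition
  and decomposition being discussed. The concrete code points are
  stand-ins chosen for illustration and do not come from the thread:
  U+0439 (й) plays the role of U+w, U+0438 (и) of U+x, and U+0306
  (combining breve) of U+y. The names compose and decompose simply mirror
  the quoted notation.

```python
import unicodedata

# Illustrative stand-ins for the thread's U+w, U+x and U+y (assumptions):
precomposed = "\u0439"        # U+w: CYRILLIC SMALL LETTER SHORT I
sequence = "\u0438\u0306"     # U+x U+y: CYRILLIC SMALL LETTER I + COMBINING BREVE


def compose(s: str) -> str:
    """Canonical composition, taken here to be normalization form NFC."""
    return unicodedata.normalize("NFC", s)


def decompose(s: str) -> str:
    """Canonical decomposition, taken here to be normalization form NFD."""
    return unicodedata.normalize("NFD", s)


# The two spellings are different code point sequences...
print(precomposed == sequence)

# ...but each normalization form maps both of them to one spelling.
print(compose(sequence) == compose(precomposed) == precomposed)
print(decompose(sequence) == decompose(precomposed) == sequence)
```

  Under these assumptions neither spelling is mandated over the other;
  a program that wants them to compare equal normalizes both sides to the
  same form first.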

> >   I would expect unicode to always favor single glyphs from a particular 
> >   page over anything else.
> 
> it's always a single glyph.  don't confuse letters, codepoints, and glyphs.

  It is still a bit hard to not confuse letters and glyphs :-(
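
  One way to keep code points apart from what is drawn on screen is to
  list the code points behind a string. This is a small sketch using
  Python's standard unicodedata module; the helper name describe is
  hypothetical, and the sample strings (U+0439 versus U+0438 followed by
  U+0306) are illustrative choices rather than the character from the
  thread. A nonzero canonical combining class marks a combining mark such
  as a diacritic.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, int]]:
    """Return (code point, character name, combining class) per code point."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


# Two strings that are normally rendered as the same glyph.
single = "\u0439"
pair = "\u0438\u0306"

for s in (single, pair):
    # len() counts code points, not glyphs.
    print(len(s), describe(s))

# The precomposed character records its own canonical decomposition.
print(unicodedata.decomposition(single))
```

  The rendered result may be indistinguishable, but the underlying
  sequences differ in length and content, which is what a program can
  inspect.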

> i'll send you a png of the character.  i don't have the books.
> 
> what language rule are you trying to get at?

  "There are no accents in the Russian language" (*)

Thanks,
Roman.

(*) well, except for a Ukrainian one ;-) 
