On Fri, May 19, 2006 at 06:43:14PM -0500, [EMAIL PROTECTED] wrote:
> eh? you speak russian. ;-)

and two versions of it too ;-)

> no. the unicode sequences (e.g. U+0069 U+0361) are correct.
> i checked this and several other examples with the actual books.

How did you check it? Visual inspection? Since I'm no expert in Unicode,
I'm quite curious to know how one is supposed to tell a real character
from a combination of a diacritic and some other character when the two
are visually indistinguishable. I would expect Unicode to always favor
single glyphs from a particular page over anything else.

btw, could you send me a .png with the actual title?

> i think you misunderstand how unicode works.

That could very well be the case ;-) But I know how the Russian language
works regardless of what committee members think.

> a base cp like U+0069 followed by a combining cp like U+0361
> make a single character. this identification is called "composition".
> unicode contains some precomposed cps, but not U+0069 U+0361.

That's ok. My only point is this: I would expect anybody who enters
titles into a database to adhere to the rules of the language the title
is written in. Maybe it's too much to expect, though.

Thanks,
Roman.
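
The question above, how to tell a precomposed character from a base plus a combining mark when they look the same, can be answered by inspecting the code points instead of the rendered glyphs. A minimal sketch, assuming Python's standard `unicodedata` module; the helper name `describe` and the sample strings are illustrative only and are not taken from the database being discussed:

```python
import unicodedata


def describe(text):
    """Return (code point, name, combining class) for each code point in text.

    A combining class greater than 0 marks a combining code point that
    attaches to a preceding base character.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


# The sequence quoted in the thread: base U+0069 followed by combining U+0361.
fragment = "\u0069\u0361"
for code_point, name, combining_class in describe(fragment):
    print(code_point, name, combining_class)

# NFC normalization replaces a base + combining pair with a precomposed
# code point only when Unicode defines one. None exists for U+0069 U+0361,
# so the sequence is left as two code points.
nfc_fragment = unicodedata.normalize("NFC", fragment)
print(len(nfc_fragment))

# For contrast, "e" + U+0301 (combining acute accent) does have a
# precomposed form, U+00E9, so NFC collapses it to one code point.
nfc_e_acute = unicodedata.normalize("NFC", "e\u0301")
print(len(nfc_e_acute), f"U+{ord(nfc_e_acute):04X}")
```

Whether a precomposed code point exists is a property of the Unicode repertoire, not of any one language's spelling rules, so this check only shows how a title is encoded, not whether it is correct for the language.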
