> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Asmus Freytag

> >>Canonical equivalence must be taken into account in rendering multiple
> >>accents, so that any two canonically equivalent sequences display as the
> >>same.
>
> This statement goes to the core of Unicode. If it is followed, it
> guarantees that normalizing a string does not change its appearance (and
> therefore it remains the 'same' string as far as the user is concerned.)

I agree in principle. There are two ways in which the philosophy behind this breaks down in real life, though:

1. There are cases of combining marks given a class of 0, meaning that combinations of marks in different positions relative to the base will be visually indistinguishable, but the encoded representations are not the same, and not canonically equivalent. E.g. (taken from someone else on the Indic list) Devanagari ka + i + u vs. ka + u + i.

2. Relying on normalization, and specifically canonical ordering, to happen in a rendering engine IS liable to be a noticeable performance issue. I suggest that whoever wrote

> Rendering systems should handle any of the canonically equivalent
> orders of combining marks. This is not a performance issue: The amount
> of time necessary to reorder combining marks is insignificant compared
> to the time necessary to carry out other work required for rendering.

was not speaking from experience.

> The interesting digressions on string libraries aside, the statement made
> here is in the context of the tasks needed for rendering. If you take a
> rendering library and add a normalization pass on the front of it, you'll
> be hard-pressed to notice a difference in performance, especially for any
> complex scripts.

If what is normalized is the backing store. If what is normalized is a string at an intermediate stage in the rendering process, then this is not the case. The reason is the number of times text-rendering APIs get called.

As you mention,

> However, from the other messages on this thread we conclude: normalizing
> *every* string, *every time* it gets touched, *is* a performance issue.

Peter

Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division
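
As a rough illustration of point 1 above, here is a minimal sketch using Python's standard unicodedata module. The choice of Python, the Latin comparison case (a + acute + dot below), and all variable names are assumptions added for illustration and do not come from the thread. It contrasts two marks with different non-zero combining classes, which canonical ordering sorts into a single order, with the Devanagari ka + i + u case, where both vowel signs have class 0 and so are never reordered.

```python
import unicodedata

# Case A: marks with different non-zero combining classes.
# U+0301 COMBINING ACUTE ACCENT and U+0323 COMBINING DOT BELOW
# sit in different classes, so canonical ordering can sort them.
acute, dot_below = "\u0301", "\u0323"
a1 = "a" + acute + dot_below
a2 = "a" + dot_below + acute
same_a = unicodedata.normalize("NFD", a1) == unicodedata.normalize("NFD", a2)

# Case B: the Devanagari example from point 1.
# U+0915 KA, U+093F VOWEL SIGN I, U+0941 VOWEL SIGN U.
# Both vowel signs have combining class 0, so normalization
# leaves their relative order untouched.
ka, sign_i, sign_u = "\u0915", "\u093F", "\u0941"
b1 = ka + sign_i + sign_u
b2 = ka + sign_u + sign_i
same_b = unicodedata.normalize("NFD", b1) == unicodedata.normalize("NFD", b2)

for label, ch in [("acute", acute), ("dot below", dot_below),
                  ("vowel sign i", sign_i), ("vowel sign u", sign_u)]:
    print(f"{label}: combining class {unicodedata.combining(ch)}")

print("Latin sequences canonically equivalent:", same_a)
print("Devanagari sequences canonically equivalent:", same_b)
```

In this sketch the two Latin sequences compare equal after normalization, while the two Devanagari sequences stay distinct. A renderer that wants both Devanagari orders to look the same therefore cannot get that from normalization alone.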