On Fri, Mar 15, 2013 at 6:52 PM, Richard Wordingham <
richard.wording...@ntlworld.com> wrote:

> > The "fractional" refers to the same kind of mechanism as the "large
> > weight values" in the UCA spec.
>
> Yes.  The problem is that formally the UCA clearly treats 'large
> weights' as being in multiple collation elements, whereas, in various
> places, for transforming collation element tables properly, one needs
> them to be treated as being in a single collation element.
>

Correct, that's where the complexities are that I mentioned. ICU's code has
to look at whether a CE is a "continuation CE" for whether to apply the
script-reordering permutation or the uppercase-first permutation, etc.
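
A rough sketch of that check, as an illustration only and not ICU's actual
code: the `CE` type, its `is_continuation` flag, and the `reorder` helper are
all hypothetical names, and the permutation is assumed to act on the lead
byte of a 16-bit primary unit.

```python
from typing import Dict, List, NamedTuple


class CE(NamedTuple):
    primary: int            # one 16-bit primary weight unit (assumed layout)
    is_continuation: bool   # True if this CE continues the previous weight


def reorder(ces: List[CE], permutation: Dict[int, int]) -> List[CE]:
    """Permute the lead byte of each primary, skipping continuation CEs."""
    out = []
    for ce in ces:
        if ce.is_continuation:
            # A continuation holds trailing bits of the previous weight,
            # so permuting it would corrupt that weight.
            out.append(ce)
            continue
        lead = ce.primary >> 8
        new_lead = permutation.get(lead, lead)
        out.append(ce._replace(primary=(new_lead << 8) | (ce.primary & 0xFF)))
    return out


swap = {0x20: 0x30, 0x30: 0x20}
ces = [CE(0x2001, False), CE(0x2002, True)]
reordered = reorder(ces, swap)
```

Only the first CE has its lead byte remapped; the continuation passes
through unchanged even though its top byte happens to be in the table.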

> > The point is that no sequence of
> > units (8-bit, 16-bit or whatever the implementation uses) can be an
> > exact prefix of another sequence.
>
> That's only for efficiency.


No, it's critical for correctness.
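
A small made-up example of what goes wrong, assuming sort keys are built by
concatenating per-character weight sequences and compared bytewise. The
weight tables and helper names below are invented for illustration.

```python
def sort_key(text, weights):
    """Concatenate the weight bytes of each character into one key."""
    return b"".join(weights[ch] for ch in text)


def intended_less(a, b, weights):
    """Reference order: compare weight by weight, one character at a time."""
    return [weights[ch] for ch in a] < [weights[ch] for ch in b]


def is_prefix_free(weights):
    """True if no weight sequence is a proper prefix of another."""
    seqs = set(weights.values())
    return not any(x != y and y.startswith(x) for x in seqs for y in seqs)


# Weight of A (10) is an exact prefix of the weight of B (10 05).
bad = {"A": b"\x10", "B": b"\x10\x05", "C": b"\x20"}
# Same ordering A < B < C, but no sequence is a prefix of another.
good = {"A": b"\x10", "B": b"\x11\x05", "C": b"\x20"}

# A sorts before B, so "AC" should sort before "B".
bad_agrees = (sort_key("AC", bad) < sort_key("B", bad)) == intended_less("AC", "B", bad)
good_agrees = (sort_key("AC", good) < sort_key("B", good)) == intended_less("AC", "B", good)
```

With the `bad` table the key for "AC" is 10 20 and the key for "B" is 10 05,
so the bytewise comparison puts "B" first, which contradicts A < B. With the
`good` table the two comparisons agree.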

> One could allocate low unit values to the
> start units and high unit values to continuation units.  By using high
> values for continuation units, DUCET simplifies the identification.
>

One could pick nearly any range for the trailing units. With the UCA spec
using 16-bit units and only 21 bits to encode in a pair, there is nearly
free choice for the range of trail units.
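
To make the arithmetic concrete, here is a sketch of splitting a 21-bit
value over two 16-bit units. The 6/15 bit split and the `lead_base` and
`trail_base` parameters are assumptions chosen for the example, not the
layout of any particular table; the only constraint shown is that the trail
range must fit in 16 bits and stay clear of the lead range.

```python
def encode_pair(value, lead_base, trail_base):
    """Split a 21-bit value into (lead, trail) 16-bit units."""
    if not 0 <= value < (1 << 21):
        raise ValueError("value does not fit in 21 bits")
    lead = lead_base + (value >> 15)        # upper 6 bits: 64 lead values
    trail = trail_base + (value & 0x7FFF)   # lower 15 bits: 32768 trail values
    if lead > 0xFFFF or trail > 0xFFFF:
        raise ValueError("unit does not fit in 16 bits")
    return lead, trail


def decode_pair(lead, trail, lead_base, trail_base):
    """Inverse of encode_pair."""
    return ((lead - lead_base) << 15) | (trail - trail_base)


# Two different (hypothetical) choices of trail range both round-trip.
high = encode_pair(0x10FFFF, lead_base=0x1000, trail_base=0x8000)
low = encode_pair(0x10FFFF, lead_base=0xF000, trail_base=0x0100)
```

The trail range needs 32768 of the 65536 unit values and the lead range only
64, so `trail_base` can sit anywhere from 0x0000 to 0x8000 as long as the 64
lead values fall outside it.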

markus
