On Mon, 4 Dec 2017 12:48:11 -0800 Markus Scherer via Unicode <unicode@unicode.org> wrote:
> On Mon, Dec 4, 2017 at 5:30 AM, Richard Wordingham via Unicode
> <unicode@unicode.org> wrote:
>
> > Would an implementation that supported no characters be compliant?
>
> I guess so. I assume that would mean that the CET maps nothing, and
> that the implementation does implement the implicit weighting of Han
> characters and unassigned (here: unmapped) code points. It would also
> have to do NFD first.

I am extrapolating from the comment on UTS10-C1 in UTS#10, "In
particular, a conformant implementation must be able to compare any two
canonical-equivalent strings as being equal, for all Unicode characters
supported by that implementation." There is now nothing that forces
the implementation to support any Unicode characters! Possibly this
results from an attempt to allow an implementation to conform to
Version x.y.z of the UCA with supporting normalisation for some other
set of characters or choosing not to support characters with non-zero
canonical combining class, which, while not eliminating the need to
address canonical equivalence, goes a long way towards doing so.

I am not aware of any general requirement that a CET be a tailoring of
DUCET or of the CLDR root collation, so the implicit weights would be
irrelevant in this case. The implicit weights are part of DUCET.

If no characters are supported, performing NFD will be a rather obvious
trivial transformation of the null string to itself.

> > It used to be that for an implementation to be claimed as compliant,
> > it also had to pass a specific conformance test. This requirement
> > has now been abandoned, perhaps because the Default Unicode
> > Collation Element Table (DUCET) is incompatible with the CLDR
> > Collation Algorithm.
>
> The DUCET is missing some things that are needed by the CLDR Collation
> Algorithm, but that has nothing to do with UCA compliance.

An implementation that only implements the CLDR collation algorithm
cannot be tailored to support DUCET, because DUCET (at Version 10.0.0)
has the ordering U+FFF8 < U+FFFE < U+1004E, which is incompatible with
UTS#35 Part 5 Section 1.1.1 - "U+FFFE maps to a CE with a minimal,
unique primary weight". Therefore one could only apply the published
UCA conformance test if it deliberately avoided strings containing
U+FFFE.

> The simple fact is that tailorings are common, and it has to be
> possible to conform to the algorithm without forbidding tailorings.

It's the CLDR collation algorithm that prohibits DUCET. Thankfully,
the CLDR root collation can be interpreted to be compatible with the
UCA. (Tailorings may be incompatible, or at least, incompatible with
the concept of a finite CET.)

Richard.
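
To illustrate the canonical-equivalence point quoted from UTS10-C1, here is a minimal Python sketch of the NFD step alone. It is not a collation implementation: the helper names nfd_key and canonically_equivalent are hypothetical, and the result depends on the Unicode version bundled with the interpreter's unicodedata module. It shows that canonically equivalent strings become identical after NFD (including reordering of marks with non-zero canonical combining class), and that NFD of the null string is the null string.

```python
import unicodedata


def nfd_key(s: str) -> str:
    """Return the NFD form of s (hypothetical helper, normalisation only)."""
    return unicodedata.normalize("NFD", s)


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent iff their NFD forms match."""
    return nfd_key(a) == nfd_key(b)


# Precomposed vs. decomposed e-acute: different code point sequences,
# same NFD form.
precomposed = "\u00e9"
decomposed = "e\u0301"

# Marks with different non-zero combining classes (U+0323 has ccc 220,
# U+0301 has ccc 230) are put into canonical order by NFD.
dot_then_acute = "a\u0323\u0301"
acute_then_dot = "a\u0301\u0323"

if __name__ == "__main__":
    print(precomposed == decomposed)
    print(canonically_equivalent(precomposed, decomposed))
    print(canonically_equivalent(dot_then_acute, acute_then_dot))
    print(nfd_key("") == "")
```

A collator that compared the raw code point sequences would treat each pair as different, which is why the NFD step (or an equivalent) matters for any characters the implementation does support.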