Re: CLDR and ICU

Richard Wordingham Fri, 27 Jul 2012 13:35:40 -0700

On Fri, 27 Jul 2012 09:01:13 -0700
Mark Davis ☕ <[email protected]> wrote:


> The key term is 'open interchange'.

XML documents are textual objects.  It is therefore reasonable to look
at them using tools for displaying textual objects.  However,
> "<snip> noncharacters are <snip>
> permanently reserved (unassigned) and have no interpretation
> whatsoever outside of their possible application-internal private
> uses."

> For CLDR collation data - *not open interchange, but specific to use
> in CLDR collation data* - these characters have specified use as
> sentinel characters, marking the boundaries for CJK 'buckets' for use
> in indexes.

I hope you're addressing a complaint I haven't made.  I haven't
complained about tailoring involving noncharacters, though it
does strike me as the least evil.  Are you perhaps arguing that I become
part of some CLDR application when I read CLDR XML files?

> This is described in
> http://unicode.org/reports/tr35/#Collation_Elements. The
> noncharacters are chosen specifically so that they do not overlap
> with publicly interchanged private use characters. Of course,
> implementations of LDML can tailor the collations to remove them, or
> replace by other mechanisms.

I was going to ask when the LDML element suppress_contractions took
effect.  At least I now have some idea of the answer.

> Unfortunately, some restrictions that were perfectly reasonable for
> use in document interchange become annoying flaws in a general
> structured data interchange format. The inability to interchange all
> Unicode scalar values is one.
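
[Illustration of the restriction mentioned above: a minimal sketch,
assuming the Char production of XML 1.0 and the standard definition of
the 66 Unicode noncharacters.  The function names are hypothetical and
are not taken from CLDR or ICU, and nothing here says which
noncharacters CLDR actually uses.]

```python
def is_scalar_value(cp: int) -> bool:
    """True for any Unicode scalar value (any code point except a surrogate)."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for the last two code points of each plane."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_xml10_char(cp: int) -> bool:
    """True if cp matches the XML 1.0 Char production (assumed, see lead-in)."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Scalar values that an XML 1.0 document cannot contain at all.
excluded = [cp for cp in range(0x110000)
            if is_scalar_value(cp) and not is_xml10_char(cp)]

# Noncharacters among them.
excluded_nonchars = [cp for cp in excluded if is_noncharacter(cp)]

if __name__ == "__main__":
    print(len(excluded), [f"U+{cp:04X}" for cp in excluded_nonchars])
```

Under these assumptions the excluded scalar values are most of the C0
controls plus U+FFFE and U+FFFF; the other noncharacters, such as
U+FDD0, remain legal XML 1.0 characters.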

The restrictions improve legibility.  As it is, many of the
character-level elements in CLDR XML files tend to be unreadable.  It
would be better for them not to require genuinely complex text
rendering.  In a related matter, it was very inconvenient to have to
treat collation test files as binary data because they could not be read
as DOS text files - a Ctrl-Z (U+001A) in the comments cut the files short.
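
[Illustration of the Ctrl-Z problem: a minimal sketch that simulates a
reader treating U+001A as end-of-file, as DOS text mode does.  The
helper names and the sample data are hypothetical; the sample is not
taken from any real collation test file.]

```python
from typing import List

CTRL_Z = 0x1A  # the byte a DOS text-mode reader treats as end-of-file


def dos_text_view(data: bytes) -> bytes:
    """Return what a reader that stops at the first Ctrl-Z would see."""
    end = data.find(bytes([CTRL_Z]))
    return data if end == -1 else data[:end]


def ctrl_z_lines(data: bytes) -> List[int]:
    """Return the 1-based numbers of the lines that contain a Ctrl-Z byte."""
    return [number
            for number, line in enumerate(data.split(b"\n"), start=1)
            if CTRL_Z in line]


# Hypothetical file content: a raw Ctrl-Z sits inside a comment on line 2.
sample = b"0019 ; first\r\n# comment with \x1a inside\r\n001A ; second\r\n"

if __name__ == "__main__":
    print(ctrl_z_lines(sample))
    print(dos_text_view(sample))
```

Everything after the comment on line 2 is lost to the simulated
text-mode reader, which is why such files have to be opened as binary.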

Richard.
