Peter Constable wrote:

> https://www.unicode.org/review/pri486/
> — UAX #42 provides the data for the Unicode Character Database in XML
> format. (UCD is character property data for use in processing
> algorithms that is provided with each version of Unicode.) This PRI is
> for feedback on a planned UTC action to freeze UAX #42 as of Unicode
> 15.1.

This is a shame. I don’t know how widely the XML files were adopted, but I 
certainly found them easier to process than the traditional Unicode data files.

I imagine creating these files was a matter of auto-generation with custom 
tools, combined with human fine-tuning and judgment (i.e. where to draw the 
line when grouping characters). It would be great if Eric and/or Laurențiu 
could donate any tools, but the human effort is probably what could not be 
replaced.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org

