On 8/23/2011 7:22 AM, Doug Ewell wrote:
> Of all applications, a word processor or DTP application would want to
> know more about the properties of characters than just whether they are
> RTL.  Line breaking, word breaking, and case mapping come to mind.
>
> I would think the format used by standard UCD files, or the XML
> equivalent, would be preferable to making one up:

The right answer would follow the XML format of the UCD.

That's the only format that allows all necessary information to be contained in one file, and it would leverage any effort that users of the main UCD have already made in parsing the XML format.

An XML format should also be flexible in that you can add or remove not just characters, but properties as needed.
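
As a rough illustration of that flexibility, here is a minimal sketch using Python's standard library. It assumes the element and attribute names of the UCD XML format as I understand them (a `char` element carrying `cp`, `gc`, `bc`, `lb` and `WB` attributes in the namespace shown); the inline sample is a hand-written fragment rather than data copied from a UCD release, and `load_properties` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for the UCD XML format; verify against the actual file.
UCD_NS = "http://www.unicode.org/ns/2003/ucd/1.0"

# Hand-written illustrative fragment, not copied from a real UCD release.
SAMPLE = """<ucd xmlns="http://www.unicode.org/ns/2003/ucd/1.0">
  <repertoire>
    <char cp="0041" gc="Lu" bc="L" lb="AL" WB="LE"/>
    <char cp="05D0" gc="Lo" bc="R" lb="HL" WB="HL"/>
  </repertoire>
</ucd>"""


def load_properties(xml_text, wanted):
    """Return {code point: {property: value}} for the properties in `wanted`."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.iter(f"{{{UCD_NS}}}char"):
        cp = char.get("cp")
        if cp is None:
            # Range entries (first-cp/last-cp) are skipped in this sketch.
            continue
        table[int(cp, 16)] = {
            p: char.get(p) for p in wanted if char.get(p) is not None
        }
    return table


# Pick only the properties this application cares about.
props = load_properties(SAMPLE, ["bc", "lb"])
```

Asking for a different set of properties only means changing the `wanted` list; the file layout stays the same. A grouped variant of the file would also need attributes inherited from the enclosing group, which this sketch ignores.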

The worst thing to do, other than designing something from scratch, would be to replicate the UnicodeData.txt layout, with its random but fixed collection of properties and insanely many semicolons. None of the existing UCD txt files carries all the needed data in a single file.
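
For comparison, a sketch of what reading one UnicodeData.txt record involves. The field positions are taken from my reading of the file's documented layout and should be checked against it; `parse_unicodedata_line` and the `FIELDS` table are hypothetical names for illustration only.

```python
# 0-based positions of a few fields in a UnicodeData.txt record,
# as assumed from the file's documented layout.
FIELDS = {
    "code": 0, "name": 1, "gc": 2, "ccc": 3, "bc": 4,
    "mirrored": 9, "upper": 12, "lower": 13, "title": 14,
}


def parse_unicodedata_line(line):
    """Split one semicolon-separated record into named fields."""
    parts = line.rstrip("\n").split(";")
    if len(parts) != 15:
        raise ValueError(f"expected 15 fields, got {len(parts)}")
    return {name: parts[index] for name, index in FIELDS.items()}


record = parse_unicodedata_line(
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
)
```

Every field is identified only by its position, and properties such as line breaking or word breaking have no slot at all, so they would have to come from other files.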

