I have been looking for a normative description of the default bidi class of unassigned code points.

UAX #9 says (http://www.unicode.org/reports/tr9/tr9-23.html#Bidirectional_Character_Types):

Unassigned characters are given strong types in the algorithm. This is an explicit exception to the general Unicode conformance requirements with respect to unassigned characters. As characters become assigned in the future, these bidirectional types may change. For assignments to character types, see DerivedBidiClass.txt [DerivedBIDI] in the [UCD].

As far as I understand, DerivedBidiClass.txt is mainly a condensation of the bidi classes into character ranges, rather than listing them for each code point individually as UnicodeData.txt does. That is, it can be derived automatically from UnicodeData.txt at any time, and as such is not normative.

Why, then, are the default class assignments given only in this file (unless I have overlooked something)? And why are they given only in comments? I'm trying to write a program that takes all the bidi assignments (including the default ones) and generates the data part of a bidi algorithm implementation, but I don't feel comfortable coding against data that appears only in comments. Any advice? Could this be fixed, by making the defaults more normative and putting them in a form that is easier to process automatically?
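To illustrate the kind of processing I mean, here is a minimal Python sketch. It assumes the defaults are expressed in machine-readable comment lines of the form "# @missing: XXXX..YYYY; Value" (the convention described in UAX #44), and that later such lines override earlier ones. Whether a given version of DerivedBidiClass.txt expresses all default ranges this way is an assumption on my part, the sample lines are made up for illustration, and the function names are hypothetical.

```python
import re

# Matches comment lines such as "# @missing: 0000..10FFFF; Left_To_Right".
# Assumption: defaults are given in this form; free-form prose comments are ignored.
MISSING_RE = re.compile(
    r"^#\s*@missing:\s*([0-9A-Fa-f]{4,6})\.\.([0-9A-Fa-f]{4,6})\s*;\s*(\S+)"
)


def parse_missing_defaults(lines):
    """Collect (first, last, value) tuples from @missing lines, in file order."""
    defaults = []
    for line in lines:
        m = MISSING_RE.match(line.strip())
        if m:
            first = int(m.group(1), 16)
            last = int(m.group(2), 16)
            defaults.append((first, last, m.group(3)))
    return defaults


def default_bidi_class(cp, defaults):
    """Return the default value for code point cp, or None if no range covers it.

    Assumption: a later @missing line overrides an earlier one.
    """
    result = None
    for first, last, value in defaults:
        if first <= cp <= last:
            result = value
    return result


# Illustrative input only, not copied from any particular UCD release.
sample = [
    "# @missing: 0000..10FFFF; Left_To_Right",
    "# @missing: 0590..05FF; Right_To_Left",
    "0041..005A    ; L # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z",
]
defaults = parse_missing_defaults(sample)
```

A real tool would then overlay the explicit assignments from the data lines on top of these defaults. The sketch only covers the part that depends on comments, which is exactly the part I am uneasy about.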

Regards,   Martin.
