> > Collation isn't really based on combining sequences (even though UTS 10
> > specifies a certain "spanning" over non-blocking (combining)
>
> This is a very ignorant question: where in your public documentation
> are these issues discussed?
...
> I still don't understand even what happens with basic collation in
> Hebrew, what effect the shin / sin dots have.

Ignored at level 1, considered at level 2. From the 14651 data file:

<U05C1> IGNORE;<SHINP>;<MIN>;<U05C1> % HEBREW POINT SHIN DOT
<U05C2> IGNORE;<SINPT>;<MIN>;<U05C2> % HEBREW POINT SIN DOT

> And, of course, I don't understand any of the more complicated issues
> either, such as what will happen when your database sorts un-pointed
> Hebrew epigraphy (just the consonants) and pointed medieval Hebrew
> (all the jots and tittles added).

Re. collation, see UTS 10, and associated data files, and if you're
really interested, see ISO/IEC 14651 (sort of a parallel to UTS 10, but
different), and its data file.

/kent k
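
For illustration only, here is a minimal Python sketch of what "ignored at level 1, considered at level 2" means for a sort key. It is a toy model, not the UTS 10 or ISO/IEC 14651 algorithm: the names TOY_WEIGHTS and sort_key are hypothetical, and every numeric weight is invented rather than taken from either data file. It assumes each character carries one weight per level, and that a character whose weight is IGNORE at a level contributes nothing to the key at that level.

```python
# Toy two-level collation key. All numeric weights are invented for
# illustration; they are NOT the values from the 14651 or UTS 10 data files.
IGNORE = 0

# Hypothetical table: character -> (level-1 weight, level-2 weight)
TOY_WEIGHTS = {
    "\u05E9": (100, 1),      # HEBREW LETTER SHIN: carries a level-1 weight
    "\u05C1": (IGNORE, 10),  # HEBREW POINT SHIN DOT: ignored at level 1
    "\u05C2": (IGNORE, 11),  # HEBREW POINT SIN DOT: ignored at level 1
}


def sort_key(text):
    """Build a (level 1, level 2) key, skipping IGNORE weights per level.

    Raises KeyError for characters missing from the toy table.
    """
    level1 = tuple(TOY_WEIGHTS[ch][0] for ch in text
                   if TOY_WEIGHTS[ch][0] != IGNORE)
    level2 = tuple(TOY_WEIGHTS[ch][1] for ch in text
                   if TOY_WEIGHTS[ch][1] != IGNORE)
    return (level1, level2)


bare = "\u05E9"             # letter only, no point
with_shin = "\u05E9\u05C1"  # letter + shin dot
with_sin = "\u05E9\u05C2"   # letter + sin dot

# Sorting by the full key: level 1 ties, so level 2 decides the order.
ordered = sorted([with_sin, with_shin, bare], key=sort_key)
```

In this toy model all three strings have the same level-1 key, so a comparison that stops at level 1 treats pointed and un-pointed text as equal; the dots only break ties once level 2 is consulted.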