On Wed, 23 May 2012 15:50:24 -0700 Markus Scherer <[email protected]> wrote:
> On Wed, May 23, 2012 at 2:01 PM, Richard Wordingham <[email protected]> wrote:
>
> > While we're picking on that poor routine - it looks as though it
> > could come unstuck with kana in the supplementary planes - the Kana
> > Supplement, and possibly also the Enclosed Ideographic Supplement.
> > Do you want a comment on that added to the ticket, or does that
> > issue deserve a whole ticket to itself?
>
> I don't think we need another ticket, but I also don't know what you
> mean with "it could come unstuck...".

I was worrying that the kana conversion routines would write whole
characters to the destination strings - both source and destination are
specified as being a single code unit. Since I last looked at the
ticket, you've picked up the issue of code unit v. character, so if you
did preserve the kana conversion logic you would note the issue with
destination sizes.

> > Is there a definition of the precise
> > relationship between DUCET and FractionalUCA.txt, or does
> > FractionalUCA.txt define the relationship?
>
> See http://www.unicode.org/Public/UCA/latest/CollationAuxiliary.html

As far as I can see, it just says they're different and gives some
*principles* for changes. For example, it doesn't mention the
contractions 0FB2+0F71 and 0FB3+0F71. The text doesn't clearly say
that all changes are identified. I haven't sat down to search for all
the changes - in principle that's a 'hard' task, but in practice it
should be possible to pick out a small residue for human inspection.

Richard.
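
P.S. To make the code unit v. character point concrete, here is a
minimal sketch in Python (not the language of the routine under
discussion). It only counts UTF-16 code units per character; the helper
name utf16_units and the choice of sample characters are mine for
illustration and are not taken from the ticket.

```python
def utf16_units(ch: str) -> int:
    """Return how many UTF-16 code units are needed to encode `ch`.

    Hypothetical helper for illustration only: UTF-16LE uses two bytes
    per code unit, so the byte length halved is the code unit count.
    """
    return len(ch.encode("utf-16-le")) // 2


# Sample characters (an assumption of this sketch): one BMP kana, two
# from the Kana Supplement, one from the Enclosed Ideographic Supplement.
samples = {
    "U+3042 HIRAGANA LETTER A": "\u3042",
    "U+1B000 KATAKANA LETTER ARCHAIC E": "\U0001B000",
    "U+1B001 HIRAGANA LETTER ARCHAIC YE": "\U0001B001",
    "U+1F200 SQUARE HIRAGANA HOKA": "\U0001F200",
}

for name, ch in samples.items():
    print(f"{name}: {utf16_units(ch)} code unit(s)")
```

The BMP kana fits in one code unit, while the supplementary-plane
characters each need a surrogate pair, so a destination specified as a
single code unit cannot hold them.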

