> Believe it or not, the IJ and ij digraphs *were* included for
> compatibility with an 8-bit legacy character set (ISO 6937).

ISO 6937 is a multibyte encoding (one or two bytes per character).
It has no combining characters at all, although it is commonly
misunderstood to have them because the lead bytes are (almost)
systematically assigned.

> Whether
> that automatically means they should have been assigned canonical
> instead of compatibility decompositions, I don't know.

I think the compatibility decomposition is correct in this case.  The
alternative would have been no decomposition at all, as for the oe and
ae ligatures.
This is in contrast to the MICRO SIGN, which ideally should have had
a canonical decomposition; but Latin-1 characters got special treatment
(and ASCII characters got even more special treatment in this regard:
some ASCII spacing accents are not decomposed at all).
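
For what it's worth, the decomposition types mentioned here can be checked
against the Unicode Character Database. This is a minimal sketch, assuming
Python's standard unicodedata module (whose data reflects whatever Unicode
version the interpreter ships with); the helper name decomposition_kind and
the choice of sample characters are mine, for illustration only.

```python
import unicodedata


def decomposition_kind(ch):
    """Classify a character's decomposition (hypothetical helper).

    Returns "none", "canonical", or "compatibility".
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings carry a tag such as <compat> in front of
    # the code points; canonical mappings have no tag.
    return "compatibility" if mapping.startswith("<") else "canonical"


# Sample characters chosen to illustrate the cases discussed above.
SAMPLES = [
    "\u0132",  # LATIN CAPITAL LIGATURE IJ
    "\u0133",  # LATIN SMALL LIGATURE IJ
    "\u0153",  # LATIN SMALL LIGATURE OE
    "\u00e6",  # LATIN SMALL LETTER AE
    "\u00b5",  # MICRO SIGN
    "\u00b4",  # ACUTE ACCENT (Latin-1 spacing accent)
    "\u005e",  # CIRCUMFLEX ACCENT (ASCII spacing accent)
    "\u0060",  # GRAVE ACCENT (ASCII spacing accent)
]

for ch in SAMPLES:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{decomposition_kind(ch)} "
          f"[{unicodedata.decomposition(ch)}]")
```

The IJ digraphs and the MICRO SIGN should come out as compatibility, the oe
and ae ligatures as none, and the ASCII circumflex and grave as none, while
the Latin-1 ACUTE ACCENT decomposes (compatibility) to SPACE plus a combining
acute.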

                /kent k

