> Then why does UnicodeData break them down as (e.g.) 0064 030C rather than 0064 0315?
> To keep the upper case and lower case characters in sync for decomposition, they always have the same combining characters.
Yes. There is nothing technically or grammatically incorrect about thinking of d', l' and t' as letters with 'carons': it is only typographically incorrect to represent them with the typical caron mark. The encoding of characters and the visual representation of characters do not always directly correspond.
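As a rough check of the point about case pairs, the decomposition field of UnicodeData can be read through Python's standard unicodedata module. This is only a sketch: the helper name combining_marks is made up for illustration, and the results depend on the Unicode data version bundled with the Python build in use.

```python
import unicodedata

def combining_marks(ch):
    """Return the combining code points (as hex strings) listed in the
    decomposition field of UnicodeData for a single character."""
    fields = unicodedata.decomposition(ch).split()
    # Skip compatibility tags such as "<compat>"; keep only code points
    # whose canonical combining class is non-zero.
    codes = [f for f in fields if not f.startswith("<")]
    return [c for c in codes if unicodedata.combining(chr(int(c, 16))) != 0]

# Upper/lower case pairs: D, L, T with caron and G with cedilla.
pairs = [("\u010E", "\u010F"), ("\u013D", "\u013E"),
         ("\u0164", "\u0165"), ("\u0122", "\u0123")]

for upper, lower in pairs:
    print(unicodedata.name(upper), combining_marks(upper))
    print(unicodedata.name(lower), combining_marks(lower))
```

In each pair the upper and lower case forms list the same combining mark, whatever shape the glyph takes in a given font.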
> For another example, G with cedilla gets the cedilla on top when it's a capital, but it still decomposes to the ordinary combining cedilla. These are essentially font-ligaturing issues.
Not quite, in that the font does not necessarily require ligature substitution data for characters that are encoded in Unicode in precomposed forms. Systems and applications should take care of canonical composition, not fonts.
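A minimal sketch of what canonical composition at the system or application level might look like, again using Python's standard unicodedata module. The function name compose is an assumption for illustration only; a real text stack would normally do this inside its own normalization layer.

```python
import unicodedata

def compose(text):
    """Apply canonical composition (NFC), so that a base letter followed
    by a combining mark becomes the precomposed character when Unicode
    encodes one."""
    return unicodedata.normalize("NFC", text)

# d + COMBINING CARON composes to U+010F LATIN SMALL LETTER D WITH CARON.
d_caron = compose("d\u030C")
# G + COMBINING CEDILLA composes to U+0122 LATIN CAPITAL LETTER G WITH CEDILLA.
g_cedilla = compose("G\u0327")

for ch in (d_caron, g_cedilla):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Once the text has been composed this way, the font only has to map the precomposed code point to a glyph of the correct shape; no ligature substitution data is needed for that case.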
By the way, although Unicode calls it a cedilla, the correct form to use with G is the disconnected, 'under comma' form.
John Hudson
Tiro Typeworks, www.tiro.com, Vancouver, BC
It is necessary that by all means and cunning, the cursed owners of books should be persuaded to make them available to us, either by argument or by force. - Michael Apostolis, 1467

