Philippe Verdy wrote:

We have:
02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
but no canonical or compatibility decomposition as t + esh, even though it is a clear ligature using the short-leg esh.


I wonder why there's no VARIANT defined for the short leg ESH (i.e. that has no descender below the baseline).


In fact other interesting "digraphs" are:
02A3;LATIN SMALL LETTER DZ DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER D Z;;;;
02A4;LATIN SMALL LETTER DEZH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER D YOGH;;;;
02A5;LATIN SMALL LETTER DZ DIGRAPH WITH CURL;Ll;0;L;;;;;N;LATIN SMALL LETTER D Z CURL;;;;
02A6;LATIN SMALL LETTER TS DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T S;;;;
02A7;LATIN SMALL LETTER TESH DIGRAPH;Ll;0;L;;;;;N;LATIN SMALL LETTER T ESH;;;;
02A8;LATIN SMALL LETTER TC DIGRAPH WITH CURL;Ll;0;L;;;;;N;LATIN SMALL LETTER T C CURL;;;;
02A9;LATIN SMALL LETTER FENG DIGRAPH;Ll;0;L;;;;;N;;;;;
02AA;LATIN SMALL LETTER LS DIGRAPH;Ll;0;L;;;;;N;;;;;
02AB;LATIN SMALL LETTER LZ DIGRAPH;Ll;0;L;;;;;N;;;;;


For D Z CURL, it's strange that we don't find in the UCD a decomposition
similar to the decomposition of D Z...

None of this is strange.


The point of these characters is that they can be used in phonetics for particular reasons, and can even contrast with what appear to be their graphic elements when those elements also occur separately.

For example, 02A6 LATIN SMALL LETTER TS DIGRAPH strongly suggests that the user intends it to represent a single phoneme within whatever phonemic system is being represented, and it may contrast with _t_ followed by _s_, which would simply be /ts/.

None of these are *optional* ligatures which can be broken down into their graphic components without losing semantics. Therefore they have no canonical or compatibility equivalents.
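
The absence of any decomposition can be checked mechanically. The following is only a sketch, assuming Python's standard unicodedata module, which reports whatever UCD version that Python build bundles; the helper name has_decomposition is made up for illustration. It contrasts the digraph letters listed above with 01F3 LATIN SMALL LETTER DZ, which does carry a compatibility decomposition.

```python
import unicodedata

# The digraph letters U+02A3..U+02AB, as listed in the quoted UCD lines.
DIGRAPHS = [chr(cp) for cp in range(0x02A3, 0x02AC)]

# U+01F3 LATIN SMALL LETTER DZ, used here as the contrasting case.
DZ = "\u01F3"


def has_decomposition(ch: str) -> bool:
    """Hypothetical helper: True if the UCD decomposition field for ch is non-empty."""
    return unicodedata.decomposition(ch) != ""


for ch in DIGRAPHS:
    # The decomposition field is printed as-is; an empty string means "none".
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {unicodedata.decomposition(ch)!r}")

# The compatibility mapping recorded for U+01F3, and its effect under NFKD.
print(repr(unicodedata.decomposition(DZ)))
print(unicodedata.normalize("NFKD", DZ))
```

Under these assumptions, normalizing 02A6 to NFKD leaves it unchanged, while 01F3 becomes the two letters d and z.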

Finally, it seems that these two:
021C;LATIN CAPITAL LETTER YOGH;Lu;0;L;;;;;N;;;;021D;
021D;LATIN SMALL LETTER YOGH;Ll;0;L;;;;;N;;;021C;;021C
are variants of
01B7;LATIN CAPITAL LETTER EZH;Lu;0;L;;;;;N;LATIN CAPITAL LETTER YOGH;;;0292;
0292;LATIN SMALL LETTER EZH;Ll;0;L;;;;;N;LATIN SMALL LETTER YOGH;;01B7;;01B7
and I wonder how these YOGH differ from EZH, or if the Unicode 1.0 name of
EZH was misleading...

See Michael Everson's discussion at http://www.evertype.com/standards/wynnyogh/ezhyogh.html


Of course, EZH has in fact often been used for YOGH, and was well on its way to becoming the modern glyph for YOGH in citations of Middle English and in some linguistic work.

This creates a quandary: when citing a text that uses an EZH glyph for YOGH (as found, for example, in many of the "History of Middle-earth" books containing Christopher Tolkien's editing of his father J.R.R. Tolkien's unpublished papers), should one reproduce the spelling with the Unicode EZH character, or silently substitute the Unicode YOGH character?

There is no obviously right answer for all cases or even for many individual cases.
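
Whichever is chosen, the choice survives in the data, because the two letters stay distinct under every normalization form and have separate case pairs. A minimal sketch, again assuming Python's standard unicodedata module; the function same_after_normalization is a hypothetical helper written for this illustration, not part of any library.

```python
import unicodedata

EZH = "\u0292"   # LATIN SMALL LETTER EZH
YOGH = "\u021D"  # LATIN SMALL LETTER YOGH


def same_after_normalization(a: str, b: str) -> bool:
    """Hypothetical helper: True if any standard normalization form maps a and b together."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(unicodedata.normalize(f, a) == unicodedata.normalize(f, b) for f in forms)


for ch in (EZH, YOGH):
    # Show each letter's name and the code point of its uppercase partner.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> upper U+{ord(ch.upper()):04X}")

print(same_after_normalization(EZH, YOGH))
```

So a search for one will not find the other unless the application folds them together itself.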

When in doubt use the number three. ;-)

Jim Allan








