Re: mixed-script writing systems

Jim Allan Fri, 15 Nov 2002 13:46:38 -0800

Peter Constable posted on Wakhi:

So, the question is this: Should we say that this writing system is
completely Latin (keeping the norm that orthographic writing systems use a
single script) and apply the principle of unification -- across languages
but not across scripts -- to imply that we need to encode new characters,
Latin delta, Latin theta and Latin yeru? Or, do we say that this writing
system is only *mostly* Latin-based, and that it mixes in a few characters
from other scripts?

There are quite a number of fonts available that present Latin letters, Greek letters and Cyrillic letters in matching styles.

If the extra characters were encoded separately, would they be available in as many fonts? If not (and certainly very few Latin letter fonts encode all the Latin letters in Unicode), I would expect those entering Wakhi text to continue using the Greek and Cyrillic characters instead of the supposedly proper ones, because a wider range of fonts could then be used.

There is little purpose in encoding Latin letter clones of Greek and Cyrillic characters if in practice the Greek and Cyrillic originals will continue to be the ones normally used.

Of course, Ezh and even Schwa are missing from most Latin letter fonts, and Dze is missing from most sets of Cyrillic characters included in fonts containing basic Latin, Greek and Cyrillic letters.

Yet I note the schwa used in the sample does not match the other vowel letters in style or width, apparently here borrowed from a different font.

This brings up the question of whether the sample is in other ways a typographical compromise. I have seen popular books on linguistics that, for typographical reasons, used a Greek gamma in place of the IPA gamma and a Greek delta in place of the ð symbol.

Could this be happening here?

Even if so, when a typographical compromise has occurred often enough, it may in time be forgotten that it was originally a compromise, and the substituted symbols may now be thought to be the correct ones. In that case, they indeed now are the correct ones.

Since seemingly all the characters needed for Wakhi are already encoded in Unicode, though some must be taken from non-Latin scripts, I would think the matter best left pending information from actual users of the Wakhi writing system as to their own desires.
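To make the "already encoded, but in different scripts" point concrete, here is a minimal sketch using Python's standard unicodedata module. The sample list is hypothetical: it simply collects letters of the kinds named in this thread and is not an authoritative Wakhi alphabet. Taking the first word of a character name as the script is also only a convenient assumption, not the formal Unicode Script property.

```python
import unicodedata
from collections import defaultdict

# Hypothetical sample of letters mentioned in this thread;
# NOT an authoritative Wakhi alphabet.
SAMPLE = ["a", "\u0259", "\u0292", "\u03b4", "\u03b8", "\u044b", "\u0455"]


def script_guess(ch):
    """Guess a script label from the first word of the character name.

    This is a shortcut for illustration, not the Unicode Script property.
    """
    return unicodedata.name(ch).split()[0]


def group_by_script(chars):
    """Map each guessed script label to the names of its characters."""
    groups = defaultdict(list)
    for ch in chars:
        groups[script_guess(ch)].append(unicodedata.name(ch))
    return dict(groups)


if __name__ == "__main__":
    for script, names in group_by_script(SAMPLE).items():
        print(script, names)
```

Run on this sample, the grouping should show schwa and ezh under LATIN, delta and theta under GREEK, and yeru and dze under CYRILLIC, which is the mixed-script situation under discussion.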

Note that the traditional Latin letter transliteration of Avestan includes the Greek letters theta and chi.

One reason for adding clones of all Greek letters used in mostly Latin character sets is stylistic: Wakhi, Avestan and IPA text will appear incorrect if an application is using a font in which the Greek lower case letters are in an "italic" style, as is sometimes the case.

But Unicode generally avoids considering such stylistic matters, noting only that not all fonts are suitable for all uses.

As to sortation, that is also something Unicode usually claims is outside its purview, other than that the standard provides an optional default sort, which might be useful for characters that fall outside those covered by particular sort conventions.
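A small sketch of why a language-specific sort convention matters for a mixed-script alphabet. The alphabet order below is invented purely for illustration, since the real Wakhi order is not given in this thread. The plain code point sort shown for comparison is simple binary ordering, not the standard's default sort.

```python
# Hypothetical alphabet order (a b delta theta yeru z), invented for
# illustration only; it is not a documented Wakhi ordering.
ORDER = "ab\u03b4\u03b8\u044bz"
RANK = {ch: i for i, ch in enumerate(ORDER)}


def sort_key(word):
    """Rank each letter by the hypothetical alphabet.

    Letters outside the alphabet sort after it, in code point order.
    """
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in word]


words = ["za", "\u03b8a", "ba", "\u044ba", "\u03b4a"]

# Binary code point order: every Latin letter precedes Greek and Cyrillic.
by_code_point = sorted(words)

# Tailored order: Greek and Cyrillic letters interleave with the Latin ones.
by_alphabet = sorted(words, key=sort_key)

if __name__ == "__main__":
    print(by_code_point)
    print(by_alphabet)
```

With code point order the word beginning with z lands before those beginning with delta, theta and yeru; with the tailored key it lands last. Whichever encoding is chosen for the borrowed letters, an ordering of this kind has to come from the users' own sort convention.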

Jim Allan
