Re: [aspell-devel] Thoughts on using aspell for Indian language ing

Kevin Atkinson Mon, 13 Nov 2006 00:50:12 -0800

On Mon, 13 Nov 2006, [EMAIL PROTECTED] wrote:

[Sorry if this messes up the Unicode characters, pine is lame and doesn't support UTF-8]

The base characters themselves certainly fit. However, if one wishes to
operate on syllables (made by combining consonants in the base
character set), the number of these syllables can exceed 256.
 Here is a short example of just one of the issues that come up when
treating characters, rather than syllables as the base unit in Hindi.
Take, for example, the conjunct, "kra", क्र. This is represented
linguistically, and in UTF-8, as क + ् + र (U+0915 + U+094D + U+0930).
It makes no sense to swap the "halant" (U094D) with the "ka" or the
"ra", as that creates a completely different conjunct, and is not a
mistake that would typically be made. As you suggest, I could just
include "kra" in the encoding, but, in many Indian languages, the
256 available slots are not sufficient for all such conjuncts.
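
To make the point above concrete, here is a minimal Python sketch (not
aspell code; the names KRA and swap_adjacent are made up for this
illustration). It shows that "kra" is three code points, that Unicode
normalization does not fold them into a single character, and what a
naive adjacent-character swap does to the sequence.

```python
import unicodedata

# "kra" as stored in Unicode: ka + halant (named VIRAMA in Unicode) + ra.
KRA = "\u0915\u094D\u0930"

# List each code point with its official Unicode name.
for ch in KRA:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Three code points, each taking three bytes in UTF-8.
num_code_points = len(KRA)
num_utf8_bytes = len(KRA.encode("utf-8"))

# NFC composition leaves the sequence unchanged: there is no single
# precomposed code point for this conjunct.
nfc_form = unicodedata.normalize("NFC", KRA)


def swap_adjacent(s: str, i: int) -> str:
    """Hypothetical helper: swap the code points at positions i and i+1,
    the way a character-level transposition edit would."""
    chars = list(s)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


# Every adjacent swap moves the halant, so neither result is "kra".
swapped = [swap_adjacent(KRA, i) for i in range(len(KRA) - 1)]
```

A checker that proposes transpositions per code point would generate
both entries of swapped as candidate typos, which is the behaviour the
paragraph above objects to.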


I am going to need a better explanation.

So "kra" is stored in Unicode using three "characters"? But you want to store it using the "kra" conjunct? Which is not the way it is normally stored. What is the Unicode character for "kra"?

_______________________________________________
Aspell-devel mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/aspell-devel
