[aspell-devel] UTF-8 in phonetic code table

gora Fri, 24 Nov 2006 13:41:59 -0800

Hi,
  I have been trying out rules for Hindi in the phonetic code table,
by adding hi_phonet.dat, appropriately modifying the hi.dat file,
and remaking the dictionary. Using UTF-8 in this file is OK, is it
not? Simple rules seem to work, like Devanagari vowel sign i being
equivalent to Devanagari vowel sign ii. However, I am getting mixed
results with another simple example, a rule that a consonant sounds
similar to the same consonant, plus vowel sign a. Here is just one
example that I have added to the file
  ह हा
By adding this rule, I would have expected any two words to be zero
edit distance apart if they differ only in that one uses ह and the
other हा. However, I am not seeing that. The way I am testing is by
misspelling a word such that it is an edit distance of two away from
a known word in the dictionary, assuming that the rule above makes
the edit distance between ह and हा zero. I would then expect the
correct word to show up in the list of suggestions, and indeed close
to the top, but it does not. Am I missing something?
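
To make the expectation concrete, here is a small Python sketch of what
I have in mind. It is only a toy model and not aspell's actual phonet
implementation: the names RULES, soundslike and phonetic_distance are
made up for illustration, and the rule is applied as a plain substring
replacement, which is an assumption about how such a rule behaves.

```python
# Toy model of the expectation above; NOT aspell's phonet algorithm.
# All names here (RULES, soundslike, phonetic_distance) are hypothetical.

HA = "\u0939"       # DEVANAGARI LETTER HA, UTF-8 bytes E0 A4 B9
SIGN_AA = "\u093e"  # DEVANAGARI VOWEL SIGN AA, UTF-8 bytes E0 A4 BE

# One rule, mirroring the line added to hi_phonet.dat:
# the consonant plus vowel sign is treated like the bare consonant.
RULES = [(HA + SIGN_AA, HA)]


def soundslike(word, rules=RULES):
    """Apply each rule as a plain substring replacement (assumption)."""
    for pattern, replacement in rules:
        word = word.replace(pattern, replacement)
    return word


def edit_distance(a, b):
    """Levenshtein distance counted in Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def phonetic_distance(a, b):
    """Edit distance between the rule-normalised forms of two words."""
    return edit_distance(soundslike(a), soundslike(b))


if __name__ == "__main__":
    w1 = HA + "\u0930"            # HA + RA
    w2 = HA + SIGN_AA + "\u0930"  # HA + VOWEL SIGN AA + RA
    print(edit_distance(w1, w2), phonetic_distance(w1, w2))
```

In this model the two words are one code point apart as plain strings,
but identical once the rule has been applied, which is the behaviour I
was expecting to see reflected in the suggestion list.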


Regards,
Gora


