Dear Murod and Erdal,

On Thu, 2006-12-21 at 10:19 +0500, Murod Latifov wrote:
> Let me not agree with you, using affixes will help a lot and will
> decrease dictionary size and increases with speed.

What speed? I reckon using flat lists is faster than generating words...

> For example you can
> have prefix "be" and use it with nouns, like benom, bekitob,

It works for your examples and many others, but surely not for every noun. Consider the nouns "sarmaa" (meaning cold) or "shanbeh" (meaning Saturday): "bi-sarma" and "bi-shanbeh" are simply wrong.

> or suffixes
> "ho", "horo", "hoi", "hoyam", "hoyat", "hoyashon", "hoyamro", "hoyatro",
> "hoyashonro" and more to this list and use single root word to populate
> other words, like kitobamro, kitobatro, kitobashro and block only those
> combinations that are not in use or are incorrect.

"Haa" is the plural suffix in Persian, but it is not applicable to all nouns (consider "iran-haayeman").

Don't get me wrong, we do use affixes, but we hand-check the auto-generated combinations against a carefully built corpus and Persian dictionaries, and then enter only the valid ones into the list. As I said before, there are so many exceptions that this is the only workable way.

> This even can be done by applying certain rules.

I'd be very grateful if you could be more specific about these rules. Some people have spent their lives trying to find such rules, and they were not very successful.

Thanks,
-mee
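
P.S. To make the workflow concrete, here is a minimal Python sketch of "generate with affixes, then keep only what has been checked". It is an illustration under stated assumptions, not the actual tooling: the function name and variable names are hypothetical, the `attested` set is a toy stand-in for the hand-checked corpus and dictionaries, and the transliterations simply follow the examples quoted in this thread.

```python
def generate_candidates(roots, prefixes):
    """Attach every prefix to every root; this over-generates on purpose."""
    return {prefix + root for root in roots for prefix in prefixes}


# Toy data taken from the examples in this thread (hypothetical word lists).
roots = ["nom", "kitob", "sarmaa", "shanbeh"]
prefixes = ["be"]

# Stand-in for forms confirmed by hand against a corpus and dictionaries.
attested = {"benom", "bekitob"}

candidates = generate_candidates(roots, prefixes)

# Only confirmed forms go into the flat word list; the rest are discarded.
accepted = sorted(candidates & attested)
rejected = sorted(candidates - attested)

print(accepted)  # ['bekitob', 'benom']
print(rejected)  # ['besarmaa', 'beshanbeh']
```

The point of the design is that the affix rules only propose candidates; the checked list decides what ships, so exceptions never need to be enumerated as blocking rules.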