Anil,

I suppose it depends on how complex the language is and what is acceptable for your 
program.  I have written a couple of stemmers that are fairly straightforward based on 
papers that I have read and work well for the langs. we are using.  Your best bet is 
probably to do a literature search for the languages you are interested in and go from 
there.  

I am, of course, assumming stemmers for your languages don't already exist.  If your 
languages are common, there probably is a stemmer available in some form that you can 
use or adapt. You'd be suprised at what you get by doing a simple google search for 
"<lang X> stemmer" where lang X is the language you are interested in and no quotes.

Hooking them into Lucene is straightforward and there are several examples of this 
available in the docs and code.

-Grant

>>> [EMAIL PROTECTED] 06/03/04 04:09PM >>>

Hi,

Can anyone provide some help on writing a stemmer for non-english languages?
How proficient must I be in a language for which I wish to write the stemmer?

Regards,
Anil

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED] 
For additional commands, e-mail: [EMAIL PROTECTED] 



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to