Re: Writing a stemmer

2004-06-06 Thread Vladimir Yuryev
On Sat, 05 Jun 2004 21:15:23 +0200 Andrzej Bialecki <[EMAIL PROTECTED]> wrote: Vladimir Yuryev wrote: Hi, Andrzej! How did you test the Polish texts, and with which stemmer? Thanks, Vladimir. No reason to be too modest, Leo. I tested your stemmer on English, Swedish and Polish texts (including F-measure v

Re: Writing a stemmer

2004-06-05 Thread Andrzej Bialecki
Vladimir Yuryev wrote: Hi, Andrzej! How did you test the Polish texts, and with which stemmer? Thanks, Vladimir. No reason to be too modest, Leo. I tested your stemmer on English, Swedish and Polish texts (including F-measure vs. training set size plots), and it works exceptionally well indeed. Highly rec

Re: Writing a stemmer

2004-06-04 Thread Vladimir Yuryev
Hi, Andrzej! How did you test the Polish texts, and with which stemmer? Thanks, Vladimir. No reason to be too modest, Leo. I tested your stemmer on English, Swedish and Polish texts (including F-measure vs. training set size plots), and it works exceptionally well indeed. Highly recommended! -- Best rega

RE: Writing a stemmer

2004-06-04 Thread Musku, Anil (LA)
mer. Moreover, can this stemmer be used only with the egothor search engine? Can I use this stemmer with Lucene? If yes, how? Regards, Anil -----Original Message----- From: Leo Galambos [mailto:[EMAIL PROTECTED] Sent: Thursday, June 03, 2004 8:54 PM To: Lucene Users List Subject: Re: Writing a stem

Re: Writing a stemmer

2004-06-04 Thread Andrzej Bialecki
Leo Galambos wrote: Erik Hatcher <[EMAIL PROTECTED]> wrote: How proficient must I be in a language for which I wish to write the stemmer? I would venture to say you would need to be an expert in a language to write a decent stemmer. I'm sorry for a self-promo ;), but the stemmer of e

Re: Writing a stemmer

2004-06-03 Thread Leo Galambos
Erik Hatcher <[EMAIL PROTECTED]> wrote:
>> How proficient must I be in a language for which I wish to write the stemmer?
> I would venture to say you would need to be an expert in a language to write a decent stemmer.
I'm sorry for a self-promo ;), but the stemmer of egothor proje

Re: Writing a stemmer

2004-06-03 Thread Erik Hatcher
On Jun 3, 2004, at 4:09 PM, Musku, Anil (LA) wrote: Can anyone provide some help on writing a stemmer for non-English languages? Have a look at the Snowball project in the Lucene sandbox. If it's non-European-based languages, I suspect it's quite complex. It's highly language dependent. How pr

Re: Writing a stemmer

2004-06-03 Thread Grant Ingersoll
Anil, I suppose it depends on how complex the language is and what is acceptable for your program. I have written a couple of stemmers that are fairly straightforward, based on papers that I have read, and that work well for the languages we are using. Your best bet is probably to do a literature searc