On Sat, 05 Jun 2004 21:15:23 +0200
Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Vladimir Yuryev wrote:
Hi, Andrzej!
How did you test the Polish texts, and with which stemmer?
Thanks,
Vladimir.
No reason to be too modest, Leo.. I tested your stemmer on English,
Swedish and Polish texts (including F-measure vs. training set size
plots), and it works exceptionally well indeed. Highly recommended!
--
Best rega
mer. Moreover,
can this stemmer be used only with the egothor search engine? Can I use this
stemmer with Lucene? If yes, how?
Regards,
Anil
-----Original Message-----
From: Leo Galambos [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 03, 2004 8:54 PM
To: Lucene Users List
Subject: Re: Writing a stem
Leo Galambos wrote:
Erik Hatcher <[EMAIL PROTECTED]> wrote:
>> How proficient must I be in a language for which I wish to write the
>> stemmer?
> I would venture to say you would need to be an expert in a language to
> write a decent stemmer.
I'm sorry for a self-promo ;), but
the stemmer of egothor proje
On Jun 3, 2004, at 4:09 PM, Musku, Anil (LA) wrote:
> Can anyone provide some help on writing a stemmer for non-English
> languages?
Have a look at the snowball project in the Lucene sandbox. If it's
for non-European-based languages, I suspect it's quite complex. It's
highly language dependent.
How pr
Anil,
I suppose it depends on how complex the language is and what is acceptable for your
program. I have written a couple of fairly straightforward stemmers based on
papers that I have read, and they work well for the languages we are using. Your best bet is
probably to do a literature searc