Hi,
Can anybody point me to some references on how to create an ideal set of stop
words? I know that this is more like a theoretical question but how do
Luceners determine which words should be excluded when creating Analyzers
for a new language? And which technique was used for validation of stop
On 10 May 2007 at 20:39, Lukas Vlcek wrote:
Can anybody point me to some references on how to create an ideal set
of stop words? I know that this is more like a theoretical question but
how do Luceners determine which words should be excluded when creating
Analyzers for a new language?
The id
See also en.wikipedia.org/wiki/Stop_words and
www.ranks.nl/tools/stopwords.html
karl wettin <[EMAIL PROTECTED]> wrote on 10/05/2007 13:57:33:
>
> On 10 May 2007 at 20:39, Lukas Vlcek wrote:
>
> > Can anybody point me to some references on how to create an ideal set
> > of stop words? I know that
Also, from the empirical side, have a look at Luke (after indexing
without any stop words, or with just the standard ones), see what the
most common terms are, and decide whether they are meaningful in the
context of your application.
-Grant
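
The empirical check described above can also be approximated outside Luke. The following is only a rough sketch in plain Python, not Lucene or Luke code: the `top_terms` function, its simple regex tokenizer, and the sample documents are all assumptions made for illustration. It lists the most frequent terms together with the share of documents that contain them, so you can judge by eye which ones carry no meaning for your application.

```python
import re
from collections import Counter

def top_terms(documents, n=10):
    """Return the n most frequent terms as (term, count, doc_fraction) tuples."""
    term_counts = Counter()  # total occurrences of each term
    doc_counts = Counter()   # number of documents that contain each term
    for text in documents:
        # Naive tokenizer (an assumption): lowercase runs of letters, digits, apostrophes.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        term_counts.update(tokens)
        doc_counts.update(set(tokens))
    total_docs = len(documents)
    return [(term, count, doc_counts[term] / total_docs)
            for term, count in term_counts.most_common(n)]

# Hypothetical sample corpus, for illustration only.
docs = [
    "the quick brown fox jumps over the lazy dog",
    "the index stores the terms of the documents",
    "a query is parsed into terms and the index is searched",
]

for term, count, fraction in top_terms(docs, 5):
    print(f"{term:10s} count={count:3d} docs={fraction:.0%}")
```

Terms that show up near the top and in nearly every document are candidates only; whether to drop them still depends on what users of the application actually search for.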
On May 10, 2007, at 7:41 PM, Doron Cohen wrote:
See al
From: Lukas Vlcek <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Thursday, May 10, 2007 2:39:35 PM
Subject: Stop words (how to create ideal set of stop words?)
Hi,
Can anybody point me to some references on how to create an ideal set of stop
words? I know that this is more like a theoretical question but
a "Zipf visualisation" plug-in for Luke which may help.
I can post the code somewhere if this is useful.
Mark
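
For readers without such a plug-in, the idea behind a Zipf view can be sketched in a few lines. This is not the plug-in mentioned above; the `zipf_table` function and the term counts are hypothetical and only illustrate the technique. Under Zipf's law a term's frequency is roughly proportional to 1 / rank, so rank times frequency stays roughly constant, and the terms at the very head of the ranking are the usual stop-word candidates.

```python
from collections import Counter

def zipf_table(term_counts):
    """Rank terms by descending frequency.

    Returns (rank, term, frequency, rank * frequency) tuples. Ties are
    broken alphabetically so the ordering is deterministic.
    """
    ranked = sorted(term_counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, term, freq, rank * freq)
            for rank, (term, freq) in enumerate(ranked, start=1)]

# Hypothetical term counts, for illustration only.
counts = Counter({"the": 1200, "of": 610, "and": 400, "lucene": 35, "analyzer": 12})

for rank, term, freq, product in zipf_table(counts):
    print(f"{rank:2d} {term:10s} {freq:6d} {product:6d}")
```

On a real index the counts would come from the term statistics rather than a hand-written dictionary, and plotting rank against frequency on log-log axes gives the familiar near-straight line.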
- Original Message
From: Grant Ingersoll <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 11 May, 2007 12:14:12 PM
Subject: Re: Stop words (h