Stop words (how to create ideal set of stop words?)

Lukas Vlcek Thu, 10 May 2007 11:39:58 -0700

Hi,

Can anybody point me to some references on how to create an ideal set of stop
words? I know that this is more of a theoretical question, but how do
Luceners determine which words should be excluded when creating Analyzers
for a new language? And which technique was used to validate the stop
word lists in the current Analyzers?


More specifically, I am interested in situations where there is a need to build
a search engine around a specific corpus (for example, when we need to search
a set of articles related to programming languages only). Given a specific
corpus, is there any recommended technique for deriving stop words?

Thanks,
Lukas
