On Mon, 2003-09-22 at 10:04, Glen Wagley wrote:
> It's funny how some projects from school rear their ugly head in the workplace.
> The company I work for has decided they want a simple search engine for a
> specific part of the site (I know some of you are screaming htdig at me right
> now). Anyhow, I was just wondering if anyone knew of or had a good stopwords
> file. Thanx.

htdig and MySQL each ship with their own stopword list. MySQL's is in
myisam/ft_static.c in the source distribution. htdig's is a file called
bad_words, which is installed in /etc/htdig on my box. I'm sure it's in
the source tarball somewhere too.
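
If you do end up rolling your own, here's a rough sketch of how a list
like that could be used. It assumes the file has one word per line (which
is what I'd expect for bad_words, but check yours), and the function names
load_stopwords and filter_terms are just made up for illustration:

```python
import re


def load_stopwords(lines):
    """Build a lowercase stopword set from an iterable of lines.

    Assumes one word per line; blank lines are skipped.
    """
    words = set()
    for line in lines:
        word = line.strip().lower()
        if word:
            words.add(word)
    return words


def filter_terms(text, stopwords):
    """Split text into lowercase terms and drop any that are stopwords."""
    terms = re.findall(r"[a-z0-9']+", text.lower())
    return [term for term in terms if term not in stopwords]


# Tiny inline sample so the sketch runs anywhere. With a real file you
# would pass an open file handle to load_stopwords instead of a list.
sample_stopwords = load_stopwords(["the\n", "of\n", "and\n"])
print(filter_terms("The history of the Unix shell", sample_stopwords))
```

You'd run the same filter_terms over both the documents you index and the
queries people type, so the two sides agree on which words get ignored.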

I'm not sure about the legalities of yoinking a list of words out of a
program, though. The MySQL list is pretty substantial and MySQL is
GPL-licensed, so you may need to GPL your program to use it. That's not
necessarily a bad thing. ;)

BTW, my recommendation: use htdig.

Corey



____________________
BYU Unix Users Group 
http://uug.byu.edu/ 
___________________________________________________________________
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list
