On Mon, 2003-09-22 at 10:04, Glen Wagley wrote:
> It's funny how some projects from school rear their ugly head in the workplace.
> The company I work for has decided they want a simple search engine for a
> specific part of the site (I know some of you are screaming htdig at me right
> now). Anyhow, I was just wondering if anyone knew of or had a good stopwords
> file. Thanx.

Htdig and MySQL each have their own stop-word files. For MySQL, the list is
in myisam/ft_static.c in the source distribution. Htdig comes with a bad_words
file, which is installed in /etc/htdig on my box. I'm sure it's in the source
tarball somewhere too.

I'm not sure about the legalities of yoinking a list of words out of a
program, though. The MySQL list is pretty substantial, so you may need to GPL
your program to use it. That's not necessarily a bad thing. ;)

BTW, my recommendation: use htdig.

Corey

____________________
BYU Unix Users Group
http://uug.byu.edu/
___________________________________________________________________
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list
