Format of stopwords file

2003-11-13 Thread Jason Ramsey
In the docs it says that you can define your own stopwords file for fulltext
searching.  The following is under show variables...

ft_stopword_file The file from which to read the list of stopwords for
full-text searches. All the words from the file will be used; comments are
not honored. By default, built-in list of stopwords is used (as defined in
`myisam/ft_static.c'). Setting this parameter to an empty string ("") will
disable stopword filtering. Note: FULLTEXT indexes must be rebuilt after
changing this variable. (This option is new for MySQL 4.0.10)

... However, it doesn't say what format this file should be in.  Should it be
a text document with one word per line?  Is there some other format?

Also, is there a way to list the words mysql is currently using as
stopwords?


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Format of stopwords file

2003-11-13 Thread Matt W
Hi Jason,

There is no format per se for the stopword file. The words are parsed
in the same way as when they're being indexed, i.e. a word is a
sequence of alphanumeric characters, _ and '.

So one line per word (which is how I do it) will work fine. As would
separating with spaces, commas, etc.
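
As a rough illustration of why the separator doesn't matter, here is a
minimal Python sketch of that word rule. It is not MySQL's actual parser:
the regular expression and the parse_stopwords name are assumptions made
for this example, it only handles ASCII, and it ignores things like
character sets and minimum word length.

```python
import re

# Assumption: a "word" is a run of ASCII letters, digits, underscores
# or apostrophes, following the rule described above. The real MySQL
# parser is character-set aware; this is only an approximation.
WORD_RE = re.compile(r"[A-Za-z0-9_']+")


def parse_stopwords(text):
    """Return the set of words found in the contents of a stopword file."""
    return set(WORD_RE.findall(text))


# Three layouts of the same hypothetical stopword list.
one_per_line = "the\nand\ndon't\n"
space_separated = "the and don't"
comma_separated = "the, and, don't"

# Every layout yields the same set of words, because anything that is
# not a word character simply acts as a separator.
assert parse_stopwords(one_per_line) == parse_stopwords(space_separated)
assert parse_stopwords(one_per_line) == parse_stopwords(comma_separated)
```

This also shows why comments can't be used in the file: under this rule a
line such as "# common words" would just contribute the words "common"
and "words".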

To see what MySQL is currently using as stopwords, you know about the
ft_stopword_file variable, right? That tells you the file, if the
built-in list isn't being used. The built-in list is defined in
myisam/ft_static.c as it says there in the manual. If you want to see
the built-in list of words without downloading the source, I can send it
to you. :-)


Hope that helps.


Matt



