I've figured out a temporary workaround for the problem/feature whereby any word that appears in more than 50% of the records in a fulltext index is treated as a stopword. I simply added as many dummy records as there are real records in the table; a fulltext search will now not disregard any word on the basis of its frequency.
For performance I added a flag column called dummy that indicates whether a record is real or a dummy, put an index on it, and include a 'where dummy=1' clause in my SQL when doing fulltext searches. I also have a cron job that runs every 20 minutes and tops the table up so that 51% of it is dummy records. (*yuck!*) Clumsy, yet effective. If anyone has a better solution out there, I would very much like to hear from you.

I agree with your logic that words which occur more frequently should carry less weight - it makes a lot of natural-language sense. But there should be a way either to disable the '50% occurrence = zero weight' behaviour or to disable word weighting altogether for small datasets.

kind regards,
Mark.
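For what it's worth, the arithmetic the cron job has to do can be sketched as below. This is only an illustration, not Mark's actual script: the function name and the 51% target are assumptions taken from the description above. The idea is that if dummy rows make up at least 51% of the table, a word occurring in every real row still sits under the 50% frequency threshold.

```python
import math

def dummy_deficit(real_rows: int, dummy_rows: int, dummy_share: float = 0.51) -> int:
    """Return how many dummy rows must be inserted so that dummies make up
    at least `dummy_share` of the table.  With dummy_share > 0.5, a word
    present in every real row appears in under 50% of all rows, so the
    fulltext engine will not discard it as a stopword.

    Derivation: we need dummy / (real + dummy) >= share, i.e.
    dummy >= real * share / (1 - share).
    """
    target = math.ceil(real_rows * dummy_share / (1.0 - dummy_share))
    return max(0, target - dummy_rows)

# Example: 100 real rows and no dummies -> 105 dummies are needed;
# a word in all 100 real rows is then in 100/205 ~ 48.8% of rows.
needed = dummy_deficit(100, 0)
```

The cron job would count the real rows (e.g. `SELECT COUNT(*) ... WHERE dummy=1` under Mark's flag convention), call something like this, and insert that many dummy records.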