Scott -
Check out this excerpt from the MySQL documentation
( http://dev.mysql.com/doc/mysql/en/fulltext-search.html ). I hope it helps!
--bemansell
...
"Every correct word in the collection and in the query is weighted according
to its significance in the collection or query. This way, a word that is
present in many documents has a lower weight (and may even have a zero
weight), because it has lower semantic value in this particular collection.
Conversely, if the word is rare, it receives a higher weight. The weights of
the words are then combined to compute the relevance of the row.

Such a technique works best with large collections (in fact, it was
carefully tuned this way). For very small tables, word distribution does not
adequately reflect their semantic value, and this model may sometimes
produce bizarre results. For example, although the word 'MySQL' is present
in every row of the articles table, a search for the word produces no
results:

mysql> SELECT * FROM articles
    -> WHERE MATCH (title,body) AGAINST ('MySQL');
Empty set (0.00 sec)

The search result is empty because the word 'MySQL' is present in at
least 50% of the rows. As such, it is effectively treated as a stopword. For
large datasets, this is the most desirable behavior: a natural language
query should not return every second row from a 1GB table. For small
datasets, it may be less desirable.

A word that matches half of rows in a table is less likely to locate
relevant documents. In fact, it most likely finds plenty of irrelevant
documents. We all know this happens far too often when we are trying to find
something on the Internet with a search engine. It is with this reasoning
that rows containing the word are assigned a low semantic value for *the
particular dataset in which they occur*. A given word may exceed the 50%
threshold in one dataset but not another.

The 50% threshold has a significant implication when you first try full-text
searching to see how it works: If you create a table and insert only one or
two rows of text into it, every word in the text occurs in at least 50% of
the rows. As a result, no search returns any results. Be sure to insert at
least three rows, and preferably many more."
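
Applied to the question below, here is a rough sketch, assuming the item
table and its FULLTEXT index on (name, description) are as described in the
original message. Two things may be at play. The 50% threshold quoted above
applies to natural language searches, not to boolean mode searches. Also,
MyISAM full-text indexes skip words shorter than ft_min_word_len, which
defaults to 4, so a three-letter word like 'red' would likely not be indexed
at all. Whether either one explains this particular table is an assumption
worth checking, not a diagnosis.

```sql
-- Boolean mode (available as of MySQL 4.0.1) does not apply the 50% threshold.
SELECT * FROM item
WHERE MATCH (name, description) AGAINST ('foobar' IN BOOLEAN MODE);

-- Check the minimum indexed word length. The default is 4,
-- which would exclude a three-letter word such as 'red'.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- After setting ft_min_word_len=3 under [mysqld] in the server
-- configuration and restarting the server, rebuild the FULLTEXT index.
REPAIR TABLE item QUICK;

-- The original query should then be able to match 'red',
-- provided the word is not on the stopword list.
SELECT * FROM item
WHERE MATCH (name, description) AGAINST ('red');
```

Since 'red' appears in only 5-10 of 60 rows, it is well under the 50%
threshold, so the word-length setting seems the more likely cause here.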
On 5/25/05, Scott Purcell <[EMAIL PROTECTED]> wrote:
>
> Hello,
> I am running 4.0.15 for Win95/98 and am working through the docs.
>
> I created a "text" type field with a 'fulltext' index. As I am
> experimenting, I have run into a couple of questions:
>
> First off, I was having trouble getting results. So I added the word
> "foobar" to one of the descriptions, and that worked with this query:
> select * from item where match(name, description) against('foobar')
>
> I have a word 'red' that appears 5-10 times in a tmp table of 60 records.
> If I run that query with 'red':
> select * from item where match(name, description) against('red');
> it returns an empty set.
>
> Upon reading, it looks like it is really trying to only get "unique" names
> from the index. But in my case the 'red' is a description that I would like
> to get back. Any way to force this to return results?
>
> Any info would be helpful. I have read, but it gets a little confusing
> first time through.
>
> Thanks,
> Scott
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql