* Mike
> I want to provide free text searching on my database across
> numerous fields. The question is, is it better to use a freetext
> field, or pattern matching and or-ing the fields I want to search.
>
> What scares me off of Freetext is that I only have about 40
> records, they all relate to a specialized community, so it is
> quite possible that a search term could return more than 50%
> of the records.
>
> I'm using v3.23.27, which has fulltext, and cannot upgrade.
>
> I would appreciate any thoughts you have on this, or a pointer to
> a thread if this has already been discussed.

Fulltext indexing is simpler, but there are some limitations, like the 50%
threshold, as you point out. There is also the problem with small words,
which can be annoying depending on the community/words, there are for
instance very many three letter acronyms in the computing community.

With pattern matching (REGEXP/LIKE) and or'ing you get a lot more
flexibillity, but also some more programming work. You could search on other
field types, not only strings/words, and you could use different indexes
depending on actual values provided for different fields and so on. You
should index all the relevant columns and let the internal optimizer figure
out what index to use.

However, the index will only work for single value columns, not text/memo
fields. In other words: no index can be used if the quesy is "...WHERE col1
LIKE '%word%'". An index can also not be used when you are OR'ing different
columns, _even_ if you have single value columns: "...WHERE col1 = 'A' or
col2 = 'A'" can not use an index.

If speed was an issue, if you had like 40K rows, not just 40, the best thing
would probably be to normalize. In this case it would mean to split all text
into separate words, and store each word once in a separate word-table. A
link table between the word table and your data table would also be needed.
Every insert/update would have to go through the same parsing, inserting
into the link/word tables as needed. This is essentially the same as making
your own freetext index.

In your case, try the pattern matching and OR'ing, resulting in a full table
scan (no index is used). If the speed is acceptable, problem is solved,
right? Of course, it will not scale very well... if your data and/or user
group is growing, you will need to involve some sort of indexing sooner or
later.

--
Roger


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to