Hi!

On Feb 18, Peter Bryant wrote:
> My problem is that my fulltext searches run quite slowly.
> 
> I have a 2300MB table with 4.3M rows and an index (primarily a fulltext 
> index) of 1300M.  My hardware is a dual Athlon MP1500+ with 2GB of memory.  
> I am running MySQL 4.0.10.
> 
> My my.cnf includes:
> set-variable = key_buffer=16M
> set-variable = key_buffer_size=128M

Why do you set both key_buffer and key_buffer_size ?
key_buffer is old (and obsolete) name for key_buffer_size.
Check

SHOW VARIABLES LIKE 'key_buffer%'

to see whether it's 16M or 128M.
Manual says:

     Increase this to get better index handling (for all reads and
     multiple writes) to as much as you can afford; 64M on a 256M
     machine that mainly runs MySQL is quite common.  If you, however,
     make this too big (for instance more than 50% of your total
     memory) your system may start to page and become extremely slow.
     Remember that because MySQL does not cache data reads, you will
     have to leave some room for the OS filesystem cache.

     You can check the performance of the key buffer by doing `SHOW
     STATUS' and examine the variables `Key_read_requests',
     `Key_reads', `Key_write_requests', and `Key_writes'.  The
     `Key_reads/Key_read_request' ratio should normally be < 0.01.

> I have found that:
> 1) Searching for non-existant words is fast (<1s)
> 2) Searching for common words is slow (e.g. select a count of matches on a 
> single word with 15135 matches on a 2300MB table takes 3 min 38.54 sec)

In boolean or natural-language mode ?

> 3) Boolean search for a +common word +nonexistant word completes 
> immediately.
> 4) Add in a LIMIT n on a popular term and the results complete quicker.  
> Completion time is less than O(n).  O(log n?)

In boolean or natural-language mode ?

For natural-language mode it means that a lot of time is spent reading
*huge* number of matched rows from the disk, after fulltext search itself
was completed (list of matched rows - disregarding LIMIT - was built).

For boolean mode it is expected to be O(n).

> 5) A repeat search runs faster (<1s vs. 35s).  Presumably because of OS 
> file caching and the key_buffer_size variable.
> 6) "against ('aword') limit 2000" and "against ('+adifferentword' in 
> boolean mode) limit 2000" take about the same time to reach their limits 
> (2000).

Sorry ? When you change *both* the word *and* the mode ?
How can you compare ? Try either the same word in both modes or
different words in the same mode.

> 7) "match(subject, message_body) against ('Linux Kernel') and message_body 
> like '%Linux%' and message_body like '%kernel%' limit 2000" returns in 
> about the same time as "against ('+Linux +kernel' in boolean mode) limit 
> 2000" (24s)

On really popular words search in boolean mode should be slightly slower
than LIKE - it's the same 1/3 heuristic MySQL uses in ... WHERE a=4 ...
queries. If in more than 30% of the table a=4, it would be faster to do
a table scan, reading rows sequentially, than to resolve the query with
an index, reading data rows in almost random order.

Search in natural language mode could be significantly slower.

> So, onto my questions:
> 1) Are there any server settings I should be looking at to improve fulltext 
> search performance?

key_buffer_size

> 2) Can my SQL be written differently (e.g. is boolean mode faster or not?)

Boolean mode is O(log(N)) faster, where N is number of rows found.
It matters when some words in the query are very popular.

> 3) Is there a way to get a list of the most commonly occuring words in a 
> fulltext index?  Then I can avoid them in searches.  MySQL doesn't happen 
> to drive it's queries by finding the word with lowest frequency first does 
> it?

Not yet. It queries all the words simultaneously - so if one word is
nonexistent (or very rare) it will abort the search early.
But optimization you're talking about will be added soon too.
(fulltext index structure was changed a month ago
to allow this particular optimization)

> 4) Are there any MySQL code changes on the horizon that could help to speed 
> up fulltext word searching?

See above.

> 5) Is there anything else I could do that would allow me to get results 
> faster (within about a 30 second timeframe).  Having searches that don't 
> return all matching rows would be acceptable (but not desirable).  e.g. I'm 
> thinking of just deleting less important rows from my large table.

"having searches that don't return all matching rows" could be done in
natural language search mode only (of course), and "it will be added
soon" too, as above.

egards,
Sergei

-- 
MySQL Development Team
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <[EMAIL PROTECTED]>
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, http://www.mysql.com/
/_/  /_/\_, /___/\___\_\___/  Osnabrueck, Germany
       <___/

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to