Hi,

I am the lead developer of Phorum <http://phorum.org>.  We are starting to
see some users with 1 Million messages in their forums, including me at
dealmac.com.  I am trying to optimize the tables and I am having good luck
analyzing the keys and building better ones.

However, I am having a problem with searches.  Currently, we use a LIKE
query to search the messages.  It works fine when we limit the data to the
last 30(30K rows), 60(60K rows), 90(90K rows) days.  However, when users
want to search the last year or even all dates (800,000+ rows) we see
terrible query performance.

So, I gathered some data from myphorum.de, a german site that provides
forums (using Phorum) to its users.  In all they had 1 million + rows of
data.  I pulled all that into one table and began my tests.

I ran into a problem when the queries started to call for 500,000+ rows.
The query would not return.  Here is the query and explain:

select SQL_NO_CACHE id from single_table where approved='Y' AND datestamp >=
'2001-08-16 00:00:00' AND  ( ( author LIKE '%myphorum%'  OR  subject LIKE
'%myphorum%'  OR  body LIKE '%myphorum%' ) ) order by datestamp desc;

+--------------+------+---------------+----------+---------+-------+--------
+------------+
| table        | type | possible_keys | key      | key_len | ref   | rows
| Extra      |
+--------------+------+---------------+----------+---------+-------+--------
+------------+
| single_table | ref  | approved      | approved |       1 | const | 567357
| where used |
+--------------+------+---------------+----------+---------+-------+--------
+------------+

here is the table:

CREATE TABLE `single_table` (
  `forum_id` int(10) unsigned NOT NULL default '0',
  `id` int(10) unsigned NOT NULL auto_increment,
  `datestamp` datetime NOT NULL default '0000-00-00 00:00:00',
  `thread` int(10) unsigned NOT NULL default '0',
  `parent` int(10) unsigned NOT NULL default '0',
  `author` varchar(37) NOT NULL default '',
  `subject` varchar(255) NOT NULL default '',
  `email` varchar(200) NOT NULL default '',
  `attachment` varchar(64) NOT NULL default '',
  `host` varchar(50) NOT NULL default '',
  `email_reply` char(1) NOT NULL default 'N',
  `approved` char(1) NOT NULL default 'N',
  `msgid` varchar(100) NOT NULL default '',
  `viewcount` int(11) NOT NULL default '0',
  `modifystamp` int(10) unsigned NOT NULL default '0',
  `userid` int(10) unsigned NOT NULL default '0',
  `body` text NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `forum_id_3` (`forum_id`,`thread`,`modifystamp`,`approved`),
  KEY `approved` (`approved`,`datestamp`),
  FULLTEXT KEY `body` (`body`,`author`,`subject`)
) TYPE=MyISAM


Another query that looks at less rows has a much different explain and I
only changed the datestamp portion.  This query took from 1.79 to 9.9
seconds depending on if the table is open.

select SQL_NO_CACHE id from single_table where approved='Y' AND datestamp >=
'2002-07-16 00:00:00' AND  ( ( author LIKE '%myphorum%'  OR  subject LIKE
'%myphorum%'  OR  body LIKE '%myphorum%' ) ) order by datestamp desc;

+--------------+-------+---------------+----------+---------+------+--------
+------------+
| table        | type  | possible_keys | key      | key_len | ref  | rows
| Extra      |
+--------------+-------+---------------+----------+---------+------+--------
+------------+
| single_table | range | approved      | approved |       9 | NULL | 282736
| where used |
+--------------+-------+---------------+----------+---------+------+--------
+------------+

The server is a Duron 900 with 256MB DDR.  It was doing nothing else at the
time this was all done.  Here is my.cnf:

port            = 3306
socket          = /tmp/mysql.sock
skip-networking
skip-locking
datadir         = /usr/local/mysqldb
user            = mysql
set-variable    = key_buffer=16M
set-variable    = max_allowed_packet=1M
set-variable    = table_cache=64
set-variable    = sort_buffer=512K
set-variable    = net_buffer_length=8K
set-variable    = myisam_sort_buffer_size=8M

If a server this size just can't handle this kind of query, I can accept
that.  I just would like some confirmation that there is nothing I can do to
make this work better.

Oh, BTW, I did try Full-Text, I don't want to use it if I don't have to as I
want to support older MySQL versions.  This software is for distribution.
Also, i found those queries to be slow if an order by or other where
elements were used.  Plus, most Phorum's won't be this big and Full-Text
will not work as well.

Thanks,

Brian
Phorum Dev Team


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to