Hi, I am the lead developer of Phorum <http://phorum.org>. We are starting to see some users with 1 Million messages in their forums, including me at dealmac.com. I am trying to optimize the tables and I am having good luck analyzing the keys and building better ones.
However, I am having a problem with searches. Currently, we use a LIKE query to search the messages. It works fine when we limit the data to the last 30(30K rows), 60(60K rows), 90(90K rows) days. However, when users want to search the last year or even all dates (800,000+ rows) we see terrible query performance. So, I gathered some data from myphorum.de, a german site that provides forums (using Phorum) to its users. In all they had 1 million + rows of data. I pulled all that into one table and began my tests. I ran into a problem when the queries started to call for 500,000+ rows. The query would not return. Here is the query and explain: select SQL_NO_CACHE id from single_table where approved='Y' AND datestamp >= '2001-08-16 00:00:00' AND ( ( author LIKE '%myphorum%' OR subject LIKE '%myphorum%' OR body LIKE '%myphorum%' ) ) order by datestamp desc; +--------------+------+---------------+----------+---------+-------+-------- +------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +--------------+------+---------------+----------+---------+-------+-------- +------------+ | single_table | ref | approved | approved | 1 | const | 567357 | where used | +--------------+------+---------------+----------+---------+-------+-------- +------------+ here is the table: CREATE TABLE `single_table` ( `forum_id` int(10) unsigned NOT NULL default '0', `id` int(10) unsigned NOT NULL auto_increment, `datestamp` datetime NOT NULL default '0000-00-00 00:00:00', `thread` int(10) unsigned NOT NULL default '0', `parent` int(10) unsigned NOT NULL default '0', `author` varchar(37) NOT NULL default '', `subject` varchar(255) NOT NULL default '', `email` varchar(200) NOT NULL default '', `attachment` varchar(64) NOT NULL default '', `host` varchar(50) NOT NULL default '', `email_reply` char(1) NOT NULL default 'N', `approved` char(1) NOT NULL default 'N', `msgid` varchar(100) NOT NULL default '', `viewcount` int(11) NOT NULL default '0', `modifystamp` int(10) unsigned NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `body` text NOT NULL, PRIMARY KEY (`id`), KEY `forum_id_3` (`forum_id`,`thread`,`modifystamp`,`approved`), KEY `approved` (`approved`,`datestamp`), FULLTEXT KEY `body` (`body`,`author`,`subject`) ) TYPE=MyISAM Another query that looks at less rows has a much different explain and I only changed the datestamp portion. This query took from 1.79 to 9.9 seconds depending on if the table is open. select SQL_NO_CACHE id from single_table where approved='Y' AND datestamp >= '2002-07-16 00:00:00' AND ( ( author LIKE '%myphorum%' OR subject LIKE '%myphorum%' OR body LIKE '%myphorum%' ) ) order by datestamp desc; +--------------+-------+---------------+----------+---------+------+-------- +------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +--------------+-------+---------------+----------+---------+------+-------- +------------+ | single_table | range | approved | approved | 9 | NULL | 282736 | where used | +--------------+-------+---------------+----------+---------+------+-------- +------------+ The server is a Duron 900 with 256MB DDR. It was doing nothing else at the time this was all done. Here is my.cnf: port = 3306 socket = /tmp/mysql.sock skip-networking skip-locking datadir = /usr/local/mysqldb user = mysql set-variable = key_buffer=16M set-variable = max_allowed_packet=1M set-variable = table_cache=64 set-variable = sort_buffer=512K set-variable = net_buffer_length=8K set-variable = myisam_sort_buffer_size=8M If a server this size just can't handle this kind of query, I can accept that. I just would like some confirmation that there is nothing I can do to make this work better. Oh, BTW, I did try Full-Text, I don't want to use it if I don't have to as I want to support older MySQL versions. This software is for distribution. Also, i found those queries to be slow if an order by or other where elements were used. Plus, most Phorum's won't be this big and Full-Text will not work as well. Thanks, Brian Phorum Dev Team --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php