Afternoon all. I wonder if any experts on the join optimiser could help me
out. I have read the docs, including the material on how mysql optimises
joins, and have redesigned my schema somewhat to try and simplify the joins,
and that has worked to some extent. But I am still way off the performance I
should be getting on my problem queries.

My Question: Why don't my queries use the entire index I have created on the
spam(t1) table? Is there any way to force mysql to use my index as I intend?
What factors influence the choice of index/column in this case?

Basics first: I am using MySQL 4.0.18 (mysql.com official build) on a Dell
1750 dual Xeon running slackware 9.1, kernel 2.6.3, loads of memory, loads
of disk space. My my.cnf is pasted in at the end. All tables in this query
are innodb.

My problem: I have 2 problem queries which involve a join on 3 tables. I
have created an index on each table, and all columns required for the join
are indexed. All boolean operations are AND. All conditions are x=y or
x=const. On 2 of the tables, only 1 column is used - no problem there; on
the 'problem' table (spam(t1)) 3 columns are used, and I have created a
multi-column index (SP_RID_STA) which includes all 3 relevant columns. The
problem: according to 'EXPLAIN' mysql is not using the entire index. I have
two sample queries below; the first counts the items in the table, and uses
2/3 columns), and and the second selects the first 10 items (uses 2/3
colums). I am judging the index usage by the 'key_len' value in the explain
output. The first column should be '1', the second '8' and the third '1'. I
am therefore expecting '10' in the key_len result for both queries.

Currently the script which runs these queries times out. I need it really to
take of the order of a couple of seconds. I don't see why this should not be
possible with the right indexes, judging by what I have been abloe to
achieve with similar scripts on the same data.

Thanks in advance
Jim
[EMAIL PROTECTED]


Details:

spam (t1) contains 5317996 rows
recip (t2) conatins 9340685 rows
mailin (t3) contains 6464183 rows

Query 1:

explain SELECT COUNT(*) as NumRecords FROM spam t1, recip t2, mailin t3
WHERE SpamFilter='y' AND t1.RecipID = t2.RecipID AND t2.MailInID =
t3.MailInID AND t1.RecipID > 34035098 AND t1.Status="present"  ;
+-------+--------+-------------------------+------------+---------+---------
----+--------+--------------------------+
| table | type   | possible_keys           | key        | key_len | ref
| rows   | Extra                    |
+-------+--------+-------------------------+------------+---------+---------
----+--------+--------------------------+
| t1    | range  | SP_RID_STA,RecipID      | SP_RID_STA |       9 | NULL
| 467038 | Using where; Using index |
| t2    | eq_ref | PRIMARY,MailInID        | PRIMARY    |       8 |
t1.RecipID  |      1 |                          |
| t3    | eq_ref | PRIMARY,MID_AID,MID_DID | PRIMARY    |       8 |
t2.MailInID |      1 | Using index              |
+-------+--------+-------------------------+------------+---------+---------
----+--------+--------------------------+

== this query takes about 12 seconds to execute ==

Query 2:

explain SELECT t1.*,t3.* FROM spam t1, recip t2, mailin t3 WHERE
SpamFilter='y' AND t1.RecipID = t2.RecipID AND t2.MailInID = t3.MailInID AND
t1.Status="present" AND t1.RecipID > 34035098 ;
+-------+--------+-------------------------+------------+---------+---------
----+--------+-------------+
| table | type   | possible_keys           | key        | key_len | ref
| rows   | Extra       |
+-------+--------+-------------------------+------------+---------+---------
----+--------+-------------+
| t1    | ref    | SP_RID_STA,RecipID      | SP_RID_STA |       1 | const
| 294033 | Using where |
| t2    | eq_ref | PRIMARY,MailInID        | PRIMARY    |       8 |
t1.RecipID  |      1 |             |
| t3    | eq_ref | PRIMARY,MID_AID,MID_DID | PRIMARY    |       8 |
t2.MailInID |      1 |             |
+-------+--------+-------------------------+------------+---------+---------
----+--------+-------------+

== this query takes 90+ seconds to execute ==

Schema:

 CREATE TABLE `spam` (
  `SpamID` bigint(20) NOT NULL auto_increment,
  `RecipID` bigint(20) NOT NULL default '0',
  `Reason` text,
  `TermDate` datetime default NULL,
  `SpamFilter` enum('n','y') NOT NULL default 'n',
  `ImageFilter` enum('n','y') NOT NULL default 'n',
  `Status` enum('present','released','deleted') NOT NULL default 'present',
  PRIMARY KEY  (`SpamID`),
  KEY `ImageFilter` (`ImageFilter`),
  KEY `SP_RID_STA` (`SpamFilter`,`RecipID`,`Status`),
  KEY `RecipID` (`RecipID`)
) TYPE=InnoDB

CREATE TABLE `recip` (
  `RecipID` bigint(20) NOT NULL auto_increment,
  `Date` datetime NOT NULL default '0000-00-00 00:00:00',
  `AccountID` int(11) NOT NULL default '0',
  `DomainID` int(11) NOT NULL default '0',
  `EndUserID` int(11) NOT NULL default '-1',
  `Recipient` text NOT NULL,
  `MailInID` bigint(20) NOT NULL default '0',
  `Status` enum('pending','virus','quarantine','spam','removed by
rule','fail','deleted','ok','rejected','sending','no
data','undefined','received','filtering','filtered','stalled') NOT NULL
default 'received',
  `OtherID` bigint(20) default NULL,
  `Attempts` int(11) NOT NULL default '1',
  `Reported` enum('n','y') NOT NULL default 'n',
  PRIMARY KEY  (`RecipID`),
  KEY `MailInID` (`MailInID`),
  KEY `Status` (`Status`),
  KEY `DomainID` (`DomainID`),
  KEY `AccountID` (`AccountID`)
) TYPE=InnoDB

CREATE TABLE `mailin` (
  `MailInID` bigint(20) NOT NULL auto_increment,
  `Date` datetime NOT NULL default '0000-00-00 00:00:00',
  `FromAccountID` int(11) default NULL,
  `FromDomainID` int(11) default NULL,
  `FromEndUserID` int(11) NOT NULL default '-1',
  `FromHost` text,
  `ReversePath` text,
  `ForwardPath` text,
  `Status` text,
  `Session` text,
  `Subject` text,
  `Octets` int(11) default NULL,
  PRIMARY KEY  (`MailInID`),
  UNIQUE KEY `MID_AID` (`MailInID`,`FromAccountID`),
  UNIQUE KEY `MID_DID` (`MailInID`,`FromDomainID`)
) TYPE=InnoDB

my.cnf (relevant bits):

[mysqld]
user            = mysql
basedir         = /usr/local/mysql
datadir         = /var/lib/mysql
port            = 3306
socket          = /var/lib/mysql/mysql.sock
skip-locking
key_buffer = 100M
max_allowed_packet = 1M
table_cache = 32
sort_buffer_size = 2M
read_buffer_size = 2M
myisam_sort_buffer_size = 64M
query_cache_size = 32M
max_connections = 50
thread_cache = 8
thread_concurrency = 8
log-bin = /mnt/sdb1/mysql-bin/mysql-bin
server-id       = 1
innodb_data_home_dir = /mnt/sdc1/mysql-data/
innodb_data_file_path =
ibdata1:2000M;ibdata2:2000M;ibdata3:10M;ibdata4:2000M;ibdata5:2000M;ibdata6:
2000M;ibdata7:2000M;ibdata8:2000M;ibdata9:2000M;ibdata10:2000M;ibdata11:2000
M;ibdata12:2000M;ibdata13:2000M:autoextend
innodb_log_group_home_dir = /mnt/sdb1/mysql-logs/
innodb_log_arch_dir = /mnt/sdb1/mysql-logs/
innodb_buffer_pool_size = 1600M
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 280M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 0
innodb_lock_wait_timeout = 50


Imagine a service that checks your business email for viruses, junk-mail and 
pornographic content BEFORE it reaches you.
Imagine a service that allows you to build custom rules for content management.
Imagine a service that does all this without having to make a single change to your 
computer system.

EMF Enterprise for businesses is a service that offers all of this

To make your imagination a reality go to www.emf-systems.com

(This email has been scanned for viruses by www.emf-systems.com)


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to