I also find that if I have both tables in MyISAM and use STRAIGHT_JOIN to force the better query plan (enumerate the longer table, for each longer table row use the shorter table's index to pick out the one right matching row from the shorter table) then the server has low I/O utilization but the CPU utilization is about as high as can be expected for a single query running on a 16-CPU machine. Why should this thing be CPU-bound? Here is the query:
create table fp2 (p VARCHAR(200) NOT NULL, rcd DATETIME NOT NULL, rcms SMALLINT NOT NULL, msgid BIGINT NOT NULL, q VARCHAR(200) NOT NULL, scd DATETIME NOT NULL, scms SMALLINT NOT NULL, lat DECIMAL(14,3), INDEX p(p), INDEX q(q) ) AS SELECT fldrcv.p, fldrcv.cd as rcd, fldrcv.cms as rcms, fldrcv.msgid as msgid, fldrcv.q, fldsnd.cd as scd, fldsnd.cms as scms, TIMESTAMPDIFF(SECOND, fldsnd.cd, fldrcv.cd) + (fldrcv.cms-fldsnd.cms)/1000 as lat FROM fldrcv STRAIGHT_JOIN fldsnd ON fldrcv.q=fldsnd.p AND fldrcv.qboot=fldsnd.pboot AND fldrcv.msgid=fldsnd.msgid; and here is some `iostat -x 5` output that shows a total of less than 50% I/O utilization and about 15/16 CPU utilization: avg-cpu: %user %nice %system %iowait %steal %idle 4.27 0.00 1.82 0.00 0.03 93.89 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 0.00 1.20 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sde 0.00 0.00 0.00 1.20 0.00 4.60 3.83 0.00 0.00 0.00 0.00 sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdg 0.00 0.00 4.40 3.20 2252.80 1434.00 485.11 0.16 20.74 13.26 10.08 sdh 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00 0.13 18.44 12.89 9.28 sdi 0.00 0.00 4.20 2.80 2150.40 1433.60 512.00 0.13 19.20 12.91 9.04 sdj 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00 0.16 22.44 15.56 11.20 dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 4.28 0.00 1.81 0.01 0.03 93.88 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 0.00 0.00 0.00 0.40 0.00 12.80 32.00 0.00 4.00 4.00 0.16 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sde 0.00 0.00 0.00 0.60 0.00 3.80 6.33 0.00 0.00 0.00 0.00 sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdg 0.00 0.00 4.40 3.00 2252.80 1433.80 498.19 0.17 23.57 16.65 12.32 sdh 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00 0.16 21.67 14.78 10.64 sdi 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00 0.15 20.89 14.44 10.40 sdj 0.00 0.00 4.20 2.80 2150.40 1433.60 512.00 0.15 21.71 14.74 10.32 dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Thanks, Mike Spreitzer From: Mike Spreitzer/Watson/i...@ibmus To: Dan Nelson <dnel...@allantgroup.com> Cc: MySql <mysql@lists.mysql.com> Date: 08/11/2010 01:30 PM Subject: Re: idle query I finally started trying to optimize along the memory-based lines you suggested. I am surprised to find that the query plan is to enumerate the memory-based table and then pick out the hundreds of related rows from the much larger MyISAM table. What's going on here? `show create table` says this about the relevant tables: CREATE TABLE `fldsndm` ( `p` varchar(200) NOT NULL, `cd` datetime NOT NULL, `cms` smallint(6) NOT NULL, `pip` char(15) NOT NULL, `pport` smallint(6) NOT NULL, `pboot` bigint(20) NOT NULL, `msgid` bigint(20) NOT NULL, `startgtime` bigint(20) NOT NULL, `datalen` int(11) NOT NULL, `toself` tinyint(1) DEFAULT NULL, `sepoch` bigint(20) NOT NULL DEFAULT '0', `c` decimal(11,3) NOT NULL DEFAULT '0.000', UNIQUE KEY `pbm` (`p`,`pboot`,`msgid`) ) ENGINE=MEMORY DEFAULT CHARSET=latin1 CREATE TABLE `fldrcv` ( `p` varchar(200) NOT NULL, `cd` datetime NOT NULL, `cms` smallint(6) NOT NULL, `pip` char(15) NOT NULL, `pport` smallint(6) NOT NULL, `pboot` bigint(20) DEFAULT NULL, `qip` char(15) NOT NULL, `qport` smallint(6) NOT NULL, `qboot` bigint(20) DEFAULT NULL, `msgid` bigint(20) NOT NULL, `startgtime` bigint(20) NOT NULL, `datalen` int(11) NOT NULL, `q` varchar(200) DEFAULT NULL, `repoch` bigint(20) NOT NULL DEFAULT '0', `c` decimal(11,3) NOT NULL DEFAULT '0.000', KEY `c` (`c`), KEY `pec` (`p`,`repoch`,`c`), KEY `peqms` (`p`,`repoch`,`q`,`msgid`,`startgtime`), KEY `qbm` (`q`,`qboot`,`msgid`), KEY `pbm` (`p`,`pboot`,`msgid`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 And here is the query planning I see: mysql> explain select * from fldrcv join fldsndm on fldrcv.q=fldsndm.p AND fldrcv.qboot=fldsndm.pboot and fldrcv.msgid=fldsndm.msgid; +----+-------------+---------+------+---------------+------+---------+---------------------------------------------------------------------------------------------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+------+---------------+------+---------+---------------------------------------------------------------------------------------------+-------+-------------+ | 1 | SIMPLE | fldsndm | ALL | pbm | NULL | NULL | NULL | 29036 | | | 1 | SIMPLE | fldrcv | ref | qbm | qbm | 220 | bigCell2906_flood.fldsndm.p,bigCell2906_flood.fldsndm.pboot,bigCell2906_flood.fldsndm.msgid | 452 | Using where | +----+-------------+---------+------+---------------+------+---------+---------------------------------------------------------------------------------------------+-------+-------------+ BTW, here are the table sizes: mysql> select count(*) from fldrcv; +----------+ | count(*) | +----------+ | 13785373 | +----------+ 1 row in set (0.00 sec) mysql> select count(*) from fldsndm; +----------+ | count(*) | +----------+ | 29036 | +----------+ Thanks, Mike Spreitzer