A colleague is running MySQL community server 5.1.34 on RHEL 5 on a big 
Xeon-based SMP (16 CPUs, 64 GB memory).  It is taking a surprisingly long 
time to execute a query, yet is not working particularly hard at it.  I 
wonder why this might be.  Following are details.  First, some `vmstat` 
output that shows the machine is doing almost nothing (I have inserted 
some additional spaces in the header to make it line up better); it shows 
no CPU activity and very little I/O:

# vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- 
-----cpu------
 r  b   swpd     free   buff   cache   si   so    bi    bo   in   cs us sy 
 id wa st
 0  0      0 56954556 328608 1040188    0    0  2600   130    1    1  1  0 
99  0  0
 0  1      0 56954564 328608 1040188    0    0    74     6  301  961  0  0 
100  0  0
 0  0      0 56954564 328608 1040188    0    0    73     1  287  970  0  0 
100  0  0
 0  0      0 56954564 328608 1040188    0    0    73     5  297  925  0  0 
100  0  0
 0  0      0 56954564 328608 1040188    0    0    71    51  291  926  0  0 
100  0  0

Here is what "show processlist" gets me:

+----+------+-----------+-------------------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host      | db                | Command | Time  | State   | 
Info                           |
+----+------+-----------+-------------------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+
| 3  | root | localhost | bigCell2906_flood | Query   | 32050 | Sending 
data | create table fldpar (p VARCHAR(200) NOT NULL, rcd DATETIME NOT 
NULL, rcms SMALLINT NOT NULL,  |
| 8  | root | localhost |                   | Query   | 0     |   | show 
processlist                      |
+----+------+-----------+-------------------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+

Yes, that's nearly 9 hours so far on this query.  The amount of data 
involved here is not small, but it is not so big that 9 hours should be 
needed for this statement.  Here is the statement:

create table fldpar (p VARCHAR(200) NOT NULL,
       rcd DATETIME NOT NULL, rcms SMALLINT NOT NULL, msgid BIGINT NOT 
NULL,
       q VARCHAR(200) NOT NULL, scd DATETIME NOT NULL, scms SMALLINT NOT 
NULL,
       INDEX p(p), INDEX q(q) )
       as select fldrcv.p, fldrcv.cd as rcd, fldrcv.cms as rcms,
       fldrcv.msgid as msgid, fldrcv.q, fldsnd.cd as scd, fldsnd.cms as 
scms
       from fldrcv, fldsnd
       where fldrcv.q=fldsnd.p AND fldrcv.qboot=fldsnd.pboot
       AND fldrcv.msgid=fldsnd.msgid;

This statement makes a new table by joining two existing tables.  Here is 
what `explain` has to say about the select part of the statement:

mysql> explain select fldrcv.p, fldrcv.cd as rcd, fldrcv.cms as rcms,
    ->        fldrcv.msgid as msgid, fldrcv.q, fldsnd.cd as scd, 
fldsnd.cms as scms
    ->        from fldrcv, fldsnd
    ->        where fldrcv.q=fldsnd.p AND fldrcv.qboot=fldsnd.pboot
    ->        AND fldrcv.msgid=fldsnd.msgid;
+----+-------------+--------+------+---------------+------+---------+------------------------------------------------------------------------------------------+-------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  
         | rows  | Extra       |
+----+-------------+--------+------+---------------+------+---------+------------------------------------------------------------------------------------------+-------+-------------+
|  1 | SIMPLE      | fldsnd | ALL  | pec,pbm       | NULL | NULL    | NULL 
          | 29036 |             | 
|  1 | SIMPLE      | fldrcv | ref  | qbm           | qbm  | 220     | 
bigCell2906_flood.fldsnd.p,bigCell2906_flood.fldsnd.pboot,bigCell2906_flood.fldsnd.msgid
 
|   452 | Using where | 
+----+-------------+--------+------+---------------+------+---------+------------------------------------------------------------------------------------------+-------+-------------+

The fldrcv table has an index on precisely the fields used in this join. 
There are about 14 million rows in that table, which is about 480 times as 
many rows as there are in the fldsnd table.  I expect the result to be no 
larger than the fldrcv table.  So it looks like the index is making this 
query run about as fast as can be expected, right?  It did not take 
anywhere near 9 hours to make the fldrcv table ... so why is it taking so 
long to do this join to make the fldpar table?

/etc/my.cnf is based on the distribution's my-huge.cnf, with only minor 
customization.

Thanks,
Mike Spreitzer

Reply via email to