RE: idle query

Jerry Schwartz Tue, 27 Jul 2010 06:25:29 -0700

>-----Original Message-----
>From: Dan Nelson [mailto:dnel...@allantgroup.com]
>Sent: Monday, July 26, 2010 11:31 PM
>To: Mike Spreitzer
>Cc: MySql
>Subject: Re: idle query



>iostat -x output would be helpful here, too, so we can see whether your
>disks are at 100% busy.
>
[JS] 100% busy would be a disaster! Anything more than about 80% busy is going 
to slow things down rapidly, but by busy I mean not idle. The number of I/Os 
per second is not a good indicator, because there's a big difference between 
sequential reads, repeatable reads (that could be satisfied out of the disk's 
or controller's buffers), random reads, and writes of any of the same flavors. 
It also matters whether or not you have write caching (delayed writes) turned 
on.

Then you have to consider the effect of striping, the number of controllers 
across which the disks are spread, the smarts in the controllers (request 
chaining?) and the like.

There's also the behavior of the file system at the OS level. If reads are 
unlikely to be satisfied in the file system's buffers, then a small buffer 
pool would be marginally more efficient that a large one; if they are fairly 
localized or repeatable, a bigger buffer pool is better. A journaling file 
system is better for writing, under most circumstances.

In most cases, when a system is slow but the CPU's are all loafing it's a good 
guess that the bottleneck is in the disk subsystem.

[I was a performance consultant in a past life. In another past life I 
invented the sharp stick.]

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341
E-mail: je...@gii.co.jp
Web site: www.the-infoshop.com




>You're doing ~75 I/O's per second, so it looks like you're running on slow
>SATA drives (a 4500 RPM drive will tops out at 75 iops); with drives like
>that, anything that forces random I/O to a large table is going to kill you.
>MySQL runs a single thread for each submitted query (i.e.  no Parallel Query
>option like Oracle has), so if a query needs to do 14 million random reads,
>that'll be 1400000/70/3600 ~= 6 hours just to fetch table data (let alone
>the index lookups required to find the table rows).
>
>MySQL also only has a simple index join type on disk-based tables, so no
>hash joins (which would load your 2nd table into RAM temporarily as a hash
>table).  It does use hash indexes on memory tables though, so if you load
>fldrcv into a memory table and join fldsnd against it, that might work
>pretty well.
>
>Another option would be to create an index on fldrcv that covers both your
>WHERE clause fields and your SELECT clause fields.  MySQL will be able to
>use that index to satisfy the query's requirements without going to the
>table itself.  Then you can use the LOAD INDEX INTO CACHE command to preload
>that index into memory.  This ends up working similar to the memory table
>option (no disk I/O needed for the 2nd table), with tradeoffs on both sides
>(a preloaded index consumes disk space and mysql has to keep it updated when
>the table is updated, but a memory table has to be dropped and rebuilt every
>time you run your main query if the fldrcv table changes often).  Which
>option you choose depends on how often you run the query, and how often
>fldrcv changes.
>
>> create table fldpar (p VARCHAR(200) NOT NULL,
>>  rcd DATETIME NOT NULL, rcms SMALLINT NOT NULL, msgid BIGINT NOT  NULL, q
>>  VARCHAR(200) NOT NULL, scd DATETIME NOT NULL, scms SMALLINT NOT NULL,
>>  INDEX p(p), INDEX q(q) ) as select fldrcv.p, fldrcv.cd as rcd, fldrcv.cms
>>  as rcms, fldrcv.msgid as msgid, fldrcv.q, fldsnd.cd as scd, fldsnd.cms as
>>  scms from fldrcv, fldsnd where fldrcv.q=fldsnd.p AND
>>  fldrcv.qboot=fldsnd.pboot AND fldrcv.msgid=fldsnd.msgid;
>
>> This statement makes a new table by joining two existing tables.  Here is
>> what `explain` has to say about the select part of the statement:
>>
>> mysql> explain select fldrcv.p, fldrcv.cd as rcd, fldrcv.cms as rcms,
>>     ->  fldrcv.msgid as msgid, fldrcv.q, fldsnd.cd as scd,
>>     ->  fldsnd.cms as scms from fldrcv, fldsnd where
>>     ->  fldrcv.q=fldsnd.p AND fldrcv.qboot=fldsnd.pboot AND
>>     ->  fldrcv.msgid=fldsnd.msgid;
>> +----+-------------+--------+------+---------------+------+---------+--------
>-------------------------------------------------------------------------------
>---+-------+-------------+
>> | id | select_type | table  | type | possible_keys | key  | key_len | ref
>| rows  | Extra       |
>> +----+-------------+--------+------+---------------+------+---------+--------
>-------------------------------------------------------------------------------
>---+-------+-------------+
>> |  1 | SIMPLE      | fldsnd | ALL  | pec,pbm       | NULL | NULL    | NULL
>| 29036 |             |
>> |  1 | SIMPLE      | fldrcv | ref  | qbm           | qbm  | 220     |
>bigCell2906_flood.fldsnd.p,bigCell2906_flood.fldsnd.pboot,bigCell2906_flood.fld
>snd.msgid  |   452 | Using where |
>> +----+-------------+--------+------+---------------+------+---------+--------
>-------------------------------------------------------------------------------
>---+-------+-------------+
>>
>> The fldrcv table has an index on precisely the fields used in this join.
>> There are about 14 million rows in that table, which is about 480 times as
>> many rows as there are in the fldsnd table.  I expect the result to be no
>> larger than the fldrcv table.  So it looks like the index is making this
>> query run about as fast as can be expected, right?  It did not take
>> anywhere near 9 hours to make the fldrcv table ...  so why is it taking so
>> long to do this join to make the fldpar table?
>
>--
>       Dan Nelson
>       dnel...@allantgroup.com
>
>--
>MySQL General Mailing List
>For list archives: http://lists.mysql.com/mysql
>To unsubscribe:    http://lists.mysql.com/mysql?unsub=je...@gii.co.jp





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

RE: idle query

Reply via email to