Re: simple query on an indexed col in big table is extremely slow

Zhe Wang Thu, 28 Jul 2005 12:17:10 -0700

Hi, Shawn,

   Thanks a lot for your reply.

Running "analyze table" didn't help much since this table has notbeen updated after being built.

The table has 3 single indexes. What puzzled me was that the queriesagainst the first index were fast, only slow if against the second orthird index. The column types are the same for the first and secondindex (int(10) unsigned).

This is my first time to encounter a slow query on an indexedcolumn. I have another table which has about 750 millions rows, a searchagainst the index column (int(9) unsigned) has lightening speed. Does itmean the key cache on my machine is big enough?

Below are the values of the cache variables. Which variable do youthink need to be boosted up?


   Again, thank you very much or your help!

+------------------------------+----------------------+

| Variable_name                | Value                |

+------------------------------+----------------------+

| bdb_cache_size               | 8388600              |

| binlog_cache_size            | 32768                |

| have_query_cache             | YES                  |

| key_cache_age_threshold      | 300                  |

| key_cache_block_size         | 1024                 |

| key_cache_division_limit     | 100                  |

| max_binlog_cache_size        | 18446744073709551615 |

| query_cache_limit            | 1048576              |

| query_cache_min_res_unit     | 4096                 |

| query_cache_size             | 67108864             |

| query_cache_type             | ON                   |

| query_cache_wlock_invalidate | OFF                  |

| table_cache                  | 524288               |

| thread_cache_size            | 512                  |

+------------------------------+----------------------+


Regards,
Zhe




[EMAIL PROTECTED] wrote:

Zhe Wang <[EMAIL PROTECTED]> wrote on 07/28/2005 10:40:08 AM:
Hi, there,
I am have a hard time figuring out why a simple query is extremely
slow. I would greatly appreciate if you can shed some light!

     The table is in InnoDB:

CREATE TABLE `rps_hits` (
 `gi` int(10) unsigned NOT NULL default '0',
 `cddid` int(10) unsigned NOT NULL default '0',
 `bit_score` float NOT NULL default '0',
 `evalue` double NOT NULL default '0',
 `identity` smallint(5) unsigned NOT NULL default '0',
 `query_from` smallint(5) unsigned NOT NULL default '0',
 `query_to` smallint(5) unsigned NOT NULL default '0',
 `hit_from` smallint(5) unsigned NOT NULL default '0',
 `hit_to` smallint(5) unsigned NOT NULL default '0',
 `hit_len` smallint(5) unsigned NOT NULL default '0',
 `align_len` smallint(5) unsigned NOT NULL default '0',
 `bz_alignment` blob NOT NULL,
 KEY `gi` (`gi`),
 KEY `cddid` (`cddid`),
 KEY `evalue` (`evalue`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 MAX_ROWS=1410065408
AVG_ROW_LENGTH=300
It is a big table with more than 60 million rows, the rps_hits.ibd
file is 22 G. All the queries I mention below were run when no other job
were running against the database.
I did a very simple query against the table: "select gi, cddid,evalue from rps_hits where cddid=3161". It took 4 minutes and 29.90seconds to pull out only 1952 rows. Whereas another simply query on gi"select gi, cddid, evalue from rps_hits where gi=393396" pulled out 1532
rows in just 0.09 second.

      "Expalin" the above query gave:
mysql> explain select gi, cddid, evalue from rps_hits where cddid=3161\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: rps_hits
        type: ref
possible_keys: cddid
         key: cddid
     key_len: 4
         ref: const
        rows: 1376
       Extra: Using where
1 row in set (0.06 sec)

      It looks OK.
Then I did "show index from rps_hits", all the index propertiesfor each of the 3 indexes are the same except cardinality. Column gi has
a cardinality of 3084286 and cddid has 14. Though a specific "selectcount(distinct cddid) from rps_hits" returned 11156. Since 11156 uniquecddid is less than 0.01% of the total number of rows in the table, Ibelieved the server decided to do a full table scan (does anyone knowthe exact percentage number of the total counts that MySQL uses as acriteria when deciding to do a FTS?)
I then use "use index" in the query after I made sure the queryand index were no longer in the cache : "select gi, cddid, evalue fromrps_hits use index (cddid) where cddid=3161". It still took a long time(2 min 59.79 sec) to return the 1952 rows.
I also noticed that a simple query on evalue like "select gi,cddid, evalue from rps_hits where evalue=1.97906;" is also extremelyslow even "force index" was used (5.78 sec for 56 rows).
I have not been able to figure out what went wrong. Since theindex on gi worked fine, I am just wondering if the slowness is causedby the large size of the table and that the indexes on cddid and evaluewere created as second index and third index respectively in "createtable". However I have another huge table with blob column and withcomparable size and number of rows to this rps_hits table, if I searched
on the third index, it was very fast.
This problem really troubled me and I would greatly appreciate if
anyone could give me a hint. Thank you in advance!

Regards,
Zhe
My first idea is to have you run ANALYZE TABLE against your table. Analyzetable will update your index statistics (like cardinality).Your index cache may be too small or your the actual index may be toolarge to accomodate it in memory all at the same time. That means that youare using swap space to store part of your indexes on disk and it may betakeing just that much more time to crawl a paged index compared to onethat can reside completely (or mostly) in memory.
How many records are returned is only important if you and your server areconnected by a slow network or if you are receiving HUGE quantities ofdata (gigabytes) in your results. The slower the network, the longer itwill take to transfer the data from the server to you. However, mostmodern networks take that factor out of consideration for issues that youare describing. My suspicion is that you memory paging and cache sizes anddisk performance are your most likey bottlenecks.
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: simple query on an indexed col in big table is extremely slow

Reply via email to