We have a table with 40 million rows. It has statistics on traffic from our website. Logs are processed once a night, and the data from those logs are added.

Our table (traffic_boats, InnoDB) has three columns of interest:

day INT
yearmonth INT
stem_base VARCHAR(100)

There is an index on day, an index on yearmonth, an index on stem_base, an index on (day, yearmonth), an index on (day, yearmonth and stem_base). I added the last two today to try to fix the performance issues we are having.

A typical query would like like,

SELECT * FROM traffic_boats WHERE stem_base = 'xxxx' AND yearmonth = 200407 AND day >= 07 AND day <= 27;

An explain-plan shows that the optimizer is picking the index on stem_base. It *should* be picking the composite index on (day, yearmonth and stembase). The greater-than-less-than is throwing it off. I can add a "USE INDEX" to force it to use the index I want it to, but that's a little hokey (and it gives me flashbacks to the days that I managed one of those commercial-RDBMS where tuning was a nightmare).

I've tried "analyze table" and "optimize table" (it's InnoDB) without luck.

What's really weird is that optimizer comes up with a bad count of rows to be examined.

If I let the optimizer pick the index,

mysql> explain SELECT * FROM traffic_boats WHERE stem_base = 'xxxx' AND yearmonth = 200407 AND day >= 07 AND day <= 27;
+---------------+------+-----------------------------------------------------+----------+---------+-------+-------+-------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+---------------+------+-----------------------------------------------------+----------+---------+-------+-------+-------------+
| traffic_boats | ref | idx_yearmonth,idx_day,idx_stem,ymd_stem_idx,ymd_idx | idx_stem | 100 | const | 42600 | Using where |
+---------------+------+-----------------------------------------------------+----------+---------+-------+-------+-------------+
1 row in set (0.02 sec)


It thinks it needs to examine 42600 rows.

If I force the correct index,

mysql> explain SELECT * FROM traffic_boats USE INDEX (ymd_stem_idx) WHERE stem_base = 'xxxx' AND yearmonth = 200407 AND day >= 07 AND day <= 27;
+---------------+-------+---------------+--------------+---------+------+---------+-------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+---------------+-------+---------------+--------------+---------+------+---------+-------------+
| traffic_boats | range | ymd_stem_idx | ymd_stem_idx | 108 | NULL | 4019400 | Using where |
+---------------+-------+---------------+--------------+---------+------+---------+-------------+
1 row in set (0.00 sec)


It thinks it needs to examine 4,019,400 rows.

If I ran this query without the USE INDEX it would take a few minutes. If I force the index, it takes 20 seconds. You would think that using stem_base, day and yearmonth would be much more selective than using just stem_base.

Anyone got some insight into this?

David


-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to