In the last episode (Dec 27), Chris Picton said: > I am trying to get to grips with understanding mysql performance. > > I have the following query: > > select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from > cdr_warehouse group by m; > > This gives me: > 115 rows in set (59.52 sec) > > mysql> explain select count(*), date_format(calldate, '%y-%m-%d') as m from > cdr_warehouse group by m\G > *************************** 1. row *************************** > id: 1 > select_type: SIMPLE > table: cdr_warehouse > type: index > possible_keys: NULL > key: calldate_idx > key_len: 8 > ref: NULL > rows: 43708571 > Extra: Using index; Using temporary; Using filesort > 1 row in set (0.00 sec) > > How can I improve the running speed of this query? I am running > 5.1.30, but don't (yet) want to partition the table (horizontally or > vertically). Nothing else on the server is touching this table at > the moment. The exact date_format is not important, as I may want to > group by 5 second intervals, or full months.
With an index scan on a 43-million-record index, you're going to be CPU-bound instead of I/O-bound on that query, so use the fastest expression you can in the group-by clause. Surprisingly, date_format seems to be pretty good. For grouping by days, date(calldate) is around 20% faster, but converting to a unix_timestamp and DIViding (the only way to get things like 5-second intervals) is 2x slower. Converting that datetime field into a timestamp field should speed up the unix_timestamp function (since timestamps are internally stored as unix time_t values), but you'll need to disable the auto-updating features as well: http://dev.mysql.com/doc/refman/5.1/en/timestamp.html http://dev.mysql.com/doc/refman/5.1/en/storage-requirements.html -- Dan Nelson dnel...@allantgroup.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org