Re: Performance tips

2008-12-30 Thread Baron Schwartz
Hi Jim,

On Tue, Dec 30, 2008 at 5:54 PM, Jim Lyons jlyons4...@gmail.com wrote:
 On Sat, Dec 27, 2008 at 12:38 PM, Jake Maul jakem...@gmail.com wrote:


 3) Obviously it'd probably be faster if you weren't using
 SQL_NO_CACHE... guessing you just did that to show us what it's like
 that way?


 Why would SQL_NO_CACHE slow it down?  By not checking the cache or storing

Because then it might be possible to return the query from the cache.

 the resultset into cache it should be quicker, at least a little bit.
 Unless, of course, the query would always return the same result set.  But
 with count(*) in there it might not.  In fact, I believe the count(*) would
 prevent MySQL from caching the query in the first place, like using now() -
 it's a non-deterministic function.

No, it should always return the same thing unless the data in the
table changes, so it's perfectly cacheable.

-- 
Baron Schwartz, Director of Consulting, Percona Inc.
Our Blog: http://www.mysqlperformanceblog.com/
Our Services: http://www.percona.com/services.html

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread Jake Maul
I few random things come to mind...

1) Try the query with IGNORE INDEX calldate_idx ... I can't see how
this could possibly be faster, but I always like to check anyway. In
your case this should result in a full table scan, given the
information you've given us.

2) If the performance problem comes from the date_format() function,
there might not be much you can do about it, except to call it less
often. Maybe you could add one or more WHERE clauses to restrict the
range of rows you're looking at? (WHERE calldate  2000-01-01
01:01:01 or something).

3) Obviously it'd probably be faster if you weren't using
SQL_NO_CACHE... guessing you just did that to show us what it's like
that way?

4) I'd check the values of sort_buffer_size and read_rnd_buffer_size.
Perhaps you could benefit from raising them slightly.

As for general performance tips, Here's a couple scripts I like to use
that give some decent tuning advice:

http://www.day32.com/MySQL/
http://wiki.mysqltuner.com/MySQLTuner

Neither of these will be specific to this particular query of course.
There is a query profiler tool in maatkit (http://www.maatkit.org/),
however... might be worth a shot.

Good luck,
Jake

On Sat, Dec 27, 2008 at 10:15 AM, Chris Picton ch...@ecntelecoms.com wrote:
 Hi

 I am trying to get to grips with understanding mysql performance.

 I have the following query:

 select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by m;

 This gives me:
 115 rows in set (59.52 sec)


 mysql explain select count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by m\G
 *** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
 possible_keys: NULL
  key: calldate_idx
  key_len: 8
  ref: NULL
 rows: 43708571
Extra: Using index; Using temporary; Using filesort
 1 row in set (0.00 sec)


 mysql show keys from cdr_warehouse \G;
 *** 1. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: uploaded_idx
 Seq_in_index: 1
  Column_name: uploaded
   Collation: A
  Cardinality: 66
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:
 *** 2. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: calldate_idx
 Seq_in_index: 1
  Column_name: calldate
   Collation: A
  Cardinality: 5526774
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:


 How can I improve the running speed of this query?  I am running 5.1.30, but
 don't (yet) want to partition the table (horizontally or vertically).
  Nothing else on the server is touching this table at the moment.  The exact
 date_format is not important, as I may want to group by 5 second intervals,
 or full months.

 Any tips/ideas for me?

 Chris


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/mysql?unsub=jakem...@gmail.com



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread Dan Nelson
In the last episode (Dec 27), Chris Picton said:
 I am trying to get to grips with understanding mysql performance.
 
 I have the following query:
 
 select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from 
 cdr_warehouse group by m;
 
 This gives me:
 115 rows in set (59.52 sec)
 
 mysql explain select count(*), date_format(calldate, '%y-%m-%d') as m from 
 cdr_warehouse group by m\G
 *** 1. row ***
 id: 1
select_type: SIMPLE
  table: cdr_warehouse
   type: index
 possible_keys: NULL
key: calldate_idx
key_len: 8
ref: NULL
   rows: 43708571
  Extra: Using index; Using temporary; Using filesort
 1 row in set (0.00 sec)
 
 How can I improve the running speed of this query?  I am running
 5.1.30, but don't (yet) want to partition the table (horizontally or
 vertically).  Nothing else on the server is touching this table at
 the moment.  The exact date_format is not important, as I may want to
 group by 5 second intervals, or full months.

With an index scan on a 43-million-record index, you're going to be
CPU-bound instead of I/O-bound on that query, so use the fastest
expression you can in the group-by clause.  Surprisingly, date_format
seems to be pretty good.  For grouping by days, date(calldate) is
around 20% faster, but converting to a unix_timestamp and DIViding (the
only way to get things like 5-second intervals) is 2x slower. 
Converting that datetime field into a timestamp field should speed up
the unix_timestamp function (since timestamps are internally stored as
unix time_t values), but you'll need to disable the auto-updating
features as well:

http://dev.mysql.com/doc/refman/5.1/en/timestamp.html
http://dev.mysql.com/doc/refman/5.1/en/storage-requirements.html

-- 
Dan Nelson
dnel...@allantgroup.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread ewen fortune
Hi,

On Sat, Dec 27, 2008 at 6:15 PM, Chris Picton ch...@ecntelecoms.com wrote:
 Hi

 I am trying to get to grips with understanding mysql performance.

 I have the following query:

 select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by m;

 This gives me:
 115 rows in set (59.52 sec)


 mysql explain select count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by m\G
 *** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
 possible_keys: NULL
  key: calldate_idx
  key_len: 8
  ref: NULL
 rows: 43708571
Extra: Using index; Using temporary; Using filesort
 1 row in set (0.00 sec)


 mysql show keys from cdr_warehouse \G;
 *** 1. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: uploaded_idx
 Seq_in_index: 1
  Column_name: uploaded
   Collation: A
  Cardinality: 66
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:
 *** 2. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: calldate_idx
 Seq_in_index: 1
  Column_name: calldate
   Collation: A
  Cardinality: 5526774
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:


 How can I improve the running speed of this query?  I am running 5.1.30, but
 don't (yet) want to partition the table (horizontally or vertically).
  Nothing else on the server is touching this table at the moment.  The exact
 date_format is not important, as I may want to group by 5 second intervals,
 or full months.

 Any tips/ideas for me?

Have you tried doing GROUP BY calldate ?
select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
cdr_warehouse group by calldate;

This may allow you to avoid the filesort by using the index directly
for ordering.

Also I presume this is MyISAM to avoid the InnoDB without WHERE COUNT(*) issue.


Ewen


 Chris


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/mysql?unsub=ewen.fort...@gmail.com



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread ewen fortune
Mmm I just tested this and it does indeed work (although i tested with
slightly less rows :o) )


explain select count(*) , date_format(calldate, '%y-%m-%d') as m from
cdr_warehouse group by m \G
*** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
possible_keys: NULL
  key: calldate_idx
  key_len: 4
  ref: NULL
 rows: 26
Extra: Using index; Using temporary; Using filesort
1 row in set (0.00 sec)

explain select count(*) , date_format(calldate, '%y-%m-%d') as m from
cdr_warehouse group by calldate \G
*** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
possible_keys: NULL
  key: calldate_idx
  key_len: 4
  ref: NULL
 rows: 26
Extra: Using index

Cheers,

Ewen

On Sat, Dec 27, 2008 at 8:04 PM, ewen fortune ewen.fort...@gmail.com wrote:
 Hi,

 On Sat, Dec 27, 2008 at 6:15 PM, Chris Picton ch...@ecntelecoms.com wrote:
 Hi

 I am trying to get to grips with understanding mysql performance.

 I have the following query:

 select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by m;

 This gives me:
 115 rows in set (59.52 sec)


 mysql explain select count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by m\G
 *** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
 possible_keys: NULL
  key: calldate_idx
  key_len: 8
  ref: NULL
 rows: 43708571
Extra: Using index; Using temporary; Using filesort
 1 row in set (0.00 sec)


 mysql show keys from cdr_warehouse \G;
 *** 1. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: uploaded_idx
 Seq_in_index: 1
  Column_name: uploaded
   Collation: A
  Cardinality: 66
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:
 *** 2. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: calldate_idx
 Seq_in_index: 1
  Column_name: calldate
   Collation: A
  Cardinality: 5526774
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:


 How can I improve the running speed of this query?  I am running 5.1.30, but
 don't (yet) want to partition the table (horizontally or vertically).
  Nothing else on the server is touching this table at the moment.  The exact
 date_format is not important, as I may want to group by 5 second intervals,
 or full months.

 Any tips/ideas for me?

 Have you tried doing GROUP BY calldate ?
 select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
 cdr_warehouse group by calldate;

 This may allow you to avoid the filesort by using the index directly
 for ordering.

 Also I presume this is MyISAM to avoid the InnoDB without WHERE COUNT(*) 
 issue.


 Ewen


 Chris


 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/mysql?unsub=ewen.fort...@gmail.com




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org