Re: Performance tips

2008-12-30 Thread Baron Schwartz
Hi Jim,

On Tue, Dec 30, 2008 at 5:54 PM, Jim Lyons  wrote:
> On Sat, Dec 27, 2008 at 12:38 PM, Jake Maul  wrote:
>
>>
>> 3) Obviously it'd probably be faster if you weren't using
>> SQL_NO_CACHE... guessing you just did that to show us what it's like
>> that way?
>>
>>
> Why would SQL_NO_CACHE slow it down?  By not checking the cache or storing

Because then it might be possible to return the query from the cache.

> the resultset into cache it should be quicker, at least a little bit.
> Unless, of course, the query would always return the same result set.  But
> with count(*) in there it might not.  In fact, I believe the count(*) would
> prevent MySQL from caching the query in the first place, like using now() -
> it's a non-deterministic function.

No, it should always return the same thing unless the data in the
table changes, so it's perfectly cacheable.

-- 
Baron Schwartz, Director of Consulting, Percona Inc.
Our Blog: http://www.mysqlperformanceblog.com/
Our Services: http://www.percona.com/services.html

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-30 Thread Jim Lyons
On Sat, Dec 27, 2008 at 12:38 PM, Jake Maul  wrote:

>
> 3) Obviously it'd probably be faster if you weren't using
> SQL_NO_CACHE... guessing you just did that to show us what it's like
> that way?
>
>
Why would SQL_NO_CACHE slow it down?  By not checking the cache or storing
the resultset into cache it should be quicker, at least a little bit.
Unless, of course, the query would always return the same result set.  But
with count(*) in there it might not.  In fact, I believe the count(*) would
prevent MySQL from caching the query in the first place, like using now() -
it's a non-deterministic function.


-- 
Jim Lyons
Web developer / Database administrator
http://www.weblyons.com


Re: Performance tips

2008-12-27 Thread ewen fortune
Mmm I just tested this and it does indeed work (although i tested with
slightly less rows :o) )


explain select count(*) , date_format(calldate, '%y-%m-%d') as m from
cdr_warehouse group by m \G
*** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
possible_keys: NULL
  key: calldate_idx
  key_len: 4
  ref: NULL
 rows: 26
Extra: Using index; Using temporary; Using filesort
1 row in set (0.00 sec)

explain select count(*) , date_format(calldate, '%y-%m-%d') as m from
cdr_warehouse group by calldate \G
*** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
possible_keys: NULL
  key: calldate_idx
  key_len: 4
  ref: NULL
 rows: 26
Extra: Using index

Cheers,

Ewen

On Sat, Dec 27, 2008 at 8:04 PM, ewen fortune  wrote:
> Hi,
>
> On Sat, Dec 27, 2008 at 6:15 PM, Chris Picton  wrote:
>> Hi
>>
>> I am trying to get to grips with understanding mysql performance.
>>
>> I have the following query:
>>
>> select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
>> cdr_warehouse group by m;
>>
>> This gives me:
>> 115 rows in set (59.52 sec)
>>
>>
>> mysql> explain select count(*), date_format(calldate, '%y-%m-%d') as m from
>> cdr_warehouse group by m\G
>> *** 1. row ***
>>   id: 1
>>  select_type: SIMPLE
>>table: cdr_warehouse
>> type: index
>> possible_keys: NULL
>>  key: calldate_idx
>>  key_len: 8
>>  ref: NULL
>> rows: 43708571
>>Extra: Using index; Using temporary; Using filesort
>> 1 row in set (0.00 sec)
>>
>>
>> mysql> show keys from cdr_warehouse \G;
>> *** 1. row ***
>>   Table: cdr_warehouse
>>  Non_unique: 1
>>Key_name: uploaded_idx
>> Seq_in_index: 1
>>  Column_name: uploaded
>>   Collation: A
>>  Cardinality: 66
>>Sub_part: NULL
>>  Packed: NULL
>>Null:
>>  Index_type: BTREE
>> Comment:
>> *** 2. row ***
>>   Table: cdr_warehouse
>>  Non_unique: 1
>>Key_name: calldate_idx
>> Seq_in_index: 1
>>  Column_name: calldate
>>   Collation: A
>>  Cardinality: 5526774
>>Sub_part: NULL
>>  Packed: NULL
>>Null:
>>  Index_type: BTREE
>> Comment:
>>
>>
>> How can I improve the running speed of this query?  I am running 5.1.30, but
>> don't (yet) want to partition the table (horizontally or vertically).
>>  Nothing else on the server is touching this table at the moment.  The exact
>> date_format is not important, as I may want to group by 5 second intervals,
>> or full months.
>>
>> Any tips/ideas for me?
>
> Have you tried doing GROUP BY calldate ?
> select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
> cdr_warehouse group by calldate;
>
> This may allow you to avoid the filesort by using the index directly
> for ordering.
>
> Also I presume this is MyISAM to avoid the InnoDB without WHERE COUNT(*) 
> issue.
>
>
> Ewen
>
>>
>> Chris
>>
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:http://lists.mysql.com/mysql?unsub=ewen.fort...@gmail.com
>>
>>
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread ewen fortune
Hi,

On Sat, Dec 27, 2008 at 6:15 PM, Chris Picton  wrote:
> Hi
>
> I am trying to get to grips with understanding mysql performance.
>
> I have the following query:
>
> select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
> cdr_warehouse group by m;
>
> This gives me:
> 115 rows in set (59.52 sec)
>
>
> mysql> explain select count(*), date_format(calldate, '%y-%m-%d') as m from
> cdr_warehouse group by m\G
> *** 1. row ***
>   id: 1
>  select_type: SIMPLE
>table: cdr_warehouse
> type: index
> possible_keys: NULL
>  key: calldate_idx
>  key_len: 8
>  ref: NULL
> rows: 43708571
>Extra: Using index; Using temporary; Using filesort
> 1 row in set (0.00 sec)
>
>
> mysql> show keys from cdr_warehouse \G;
> *** 1. row ***
>   Table: cdr_warehouse
>  Non_unique: 1
>Key_name: uploaded_idx
> Seq_in_index: 1
>  Column_name: uploaded
>   Collation: A
>  Cardinality: 66
>Sub_part: NULL
>  Packed: NULL
>Null:
>  Index_type: BTREE
> Comment:
> *** 2. row ***
>   Table: cdr_warehouse
>  Non_unique: 1
>Key_name: calldate_idx
> Seq_in_index: 1
>  Column_name: calldate
>   Collation: A
>  Cardinality: 5526774
>Sub_part: NULL
>  Packed: NULL
>Null:
>  Index_type: BTREE
> Comment:
>
>
> How can I improve the running speed of this query?  I am running 5.1.30, but
> don't (yet) want to partition the table (horizontally or vertically).
>  Nothing else on the server is touching this table at the moment.  The exact
> date_format is not important, as I may want to group by 5 second intervals,
> or full months.
>
> Any tips/ideas for me?

Have you tried doing GROUP BY calldate ?
select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
cdr_warehouse group by calldate;

This may allow you to avoid the filesort by using the index directly
for ordering.

Also I presume this is MyISAM to avoid the InnoDB without WHERE COUNT(*) issue.


Ewen

>
> Chris
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/mysql?unsub=ewen.fort...@gmail.com
>
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread Dan Nelson
In the last episode (Dec 27), Chris Picton said:
> I am trying to get to grips with understanding mysql performance.
> 
> I have the following query:
> 
> select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from 
> cdr_warehouse group by m;
> 
> This gives me:
> 115 rows in set (59.52 sec)
> 
> mysql> explain select count(*), date_format(calldate, '%y-%m-%d') as m from 
> cdr_warehouse group by m\G
> *** 1. row ***
> id: 1
>select_type: SIMPLE
>  table: cdr_warehouse
>   type: index
> possible_keys: NULL
>key: calldate_idx
>key_len: 8
>ref: NULL
>   rows: 43708571
>  Extra: Using index; Using temporary; Using filesort
> 1 row in set (0.00 sec)
> 
> How can I improve the running speed of this query?  I am running
> 5.1.30, but don't (yet) want to partition the table (horizontally or
> vertically).  Nothing else on the server is touching this table at
> the moment.  The exact date_format is not important, as I may want to
> group by 5 second intervals, or full months.

With an index scan on a 43-million-record index, you're going to be
CPU-bound instead of I/O-bound on that query, so use the fastest
expression you can in the group-by clause.  Surprisingly, date_format
seems to be pretty good.  For grouping by days, date(calldate) is
around 20% faster, but converting to a unix_timestamp and DIViding (the
only way to get things like 5-second intervals) is 2x slower. 
Converting that datetime field into a timestamp field should speed up
the unix_timestamp function (since timestamps are internally stored as
unix time_t values), but you'll need to disable the auto-updating
features as well:

http://dev.mysql.com/doc/refman/5.1/en/timestamp.html
http://dev.mysql.com/doc/refman/5.1/en/storage-requirements.html

-- 
Dan Nelson
dnel...@allantgroup.com

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Performance tips

2008-12-27 Thread Jake Maul
I few random things come to mind...

1) Try the query with IGNORE INDEX calldate_idx ... I can't see how
this could possibly be faster, but I always like to check anyway. In
your case this should result in a full table scan, given the
information you've given us.

2) If the performance problem comes from the date_format() function,
there might not be much you can do about it, except to call it less
often. Maybe you could add one or more WHERE clauses to restrict the
range of rows you're looking at? (WHERE calldate > "2000-01-01
01:01:01" or something).

3) Obviously it'd probably be faster if you weren't using
SQL_NO_CACHE... guessing you just did that to show us what it's like
that way?

4) I'd check the values of sort_buffer_size and read_rnd_buffer_size.
Perhaps you could benefit from raising them slightly.

As for general performance tips, Here's a couple scripts I like to use
that give some decent tuning advice:

http://www.day32.com/MySQL/
http://wiki.mysqltuner.com/MySQLTuner

Neither of these will be specific to this particular query of course.
There is a query profiler tool in maatkit (http://www.maatkit.org/),
however... might be worth a shot.

Good luck,
Jake

On Sat, Dec 27, 2008 at 10:15 AM, Chris Picton  wrote:
> Hi
>
> I am trying to get to grips with understanding mysql performance.
>
> I have the following query:
>
> select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m from
> cdr_warehouse group by m;
>
> This gives me:
> 115 rows in set (59.52 sec)
>
>
> mysql> explain select count(*), date_format(calldate, '%y-%m-%d') as m from
> cdr_warehouse group by m\G
> *** 1. row ***
>   id: 1
>  select_type: SIMPLE
>table: cdr_warehouse
> type: index
> possible_keys: NULL
>  key: calldate_idx
>  key_len: 8
>  ref: NULL
> rows: 43708571
>Extra: Using index; Using temporary; Using filesort
> 1 row in set (0.00 sec)
>
>
> mysql> show keys from cdr_warehouse \G;
> *** 1. row ***
>   Table: cdr_warehouse
>  Non_unique: 1
>Key_name: uploaded_idx
> Seq_in_index: 1
>  Column_name: uploaded
>   Collation: A
>  Cardinality: 66
>Sub_part: NULL
>  Packed: NULL
>Null:
>  Index_type: BTREE
> Comment:
> *** 2. row ***
>   Table: cdr_warehouse
>  Non_unique: 1
>Key_name: calldate_idx
> Seq_in_index: 1
>  Column_name: calldate
>   Collation: A
>  Cardinality: 5526774
>Sub_part: NULL
>  Packed: NULL
>Null:
>  Index_type: BTREE
> Comment:
>
>
> How can I improve the running speed of this query?  I am running 5.1.30, but
> don't (yet) want to partition the table (horizontally or vertically).
>  Nothing else on the server is touching this table at the moment.  The exact
> date_format is not important, as I may want to group by 5 second intervals,
> or full months.
>
> Any tips/ideas for me?
>
> Chris
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/mysql?unsub=jakem...@gmail.com
>
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Performance tips

2008-12-27 Thread Chris Picton

Hi

I am trying to get to grips with understanding mysql performance.

I have the following query:

select sql_no_cache count(*), date_format(calldate, '%y-%m-%d') as m 
from cdr_warehouse group by m;


This gives me:
115 rows in set (59.52 sec)


mysql> explain select count(*), date_format(calldate, '%y-%m-%d') as m 
from cdr_warehouse group by m\G

*** 1. row ***
   id: 1
  select_type: SIMPLE
table: cdr_warehouse
 type: index
possible_keys: NULL
  key: calldate_idx
  key_len: 8
  ref: NULL
 rows: 43708571
Extra: Using index; Using temporary; Using filesort
1 row in set (0.00 sec)


mysql> show keys from cdr_warehouse \G;
*** 1. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: uploaded_idx
Seq_in_index: 1
 Column_name: uploaded
   Collation: A
 Cardinality: 66
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:
*** 2. row ***
   Table: cdr_warehouse
  Non_unique: 1
Key_name: calldate_idx
Seq_in_index: 1
 Column_name: calldate
   Collation: A
 Cardinality: 5526774
Sub_part: NULL
  Packed: NULL
Null:
  Index_type: BTREE
 Comment:


How can I improve the running speed of this query?  I am running 5.1.30, 
but don't (yet) want to partition the table (horizontally or 
vertically).  Nothing else on the server is touching this table at the 
moment.  The exact date_format is not important, as I may want to group 
by 5 second intervals, or full months.


Any tips/ideas for me?

Chris


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org