SLOW 22million rows, 5 hour query?
I have a table with 22,371,273 rows, current type is MyISAM. I have one query tha took nearly 5 hours to complete. I do not know if this like it should be or not, but I thought I'd ask the list. My gut feeling is the `gropu by` that's slowing it down but nonetheless 5 hours seems excessive. I'm trying to find the source of the limitation and work from there. Mysql: 4.0.18-standard, precompiled 32-bit sparc 2.8 Server: Sun 420 Solaris 2.8 4x450MHZ Ultrasparc-II 4GB Ram Two 50gb mounts, fiber channel scsi to EMC. Brand new systems, totally idle. I think everything relevant is here. The skinny: full table scan on 22 million rows, group and insert into new table. MyISAM and InnoDB appear to give similar results when used as the destination ('new_table'). insert into new_table select month_day, floor(bucket/3) as bucket, date, src, avg(value) as value from source_table group by month_day, bucket, src; Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort Query OK, 11495208 rows affected (4 hours 47 min 21.01 sec) Records: 11495208 Duplicates: 0 Warnings: 0 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SLOW 22million rows, 5 hour query?
insert into new_table select month_day, floor(bucket/3) as bucket, date, src, avg(value) as value from source_table group by month_day, bucket, src; Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort Query OK, 11495208 rows affected (4 hours 47 min 21.01 sec) IMHO two things slow you down: 1) floor(bucket/3) as bucket. This means that MySQL has to calculate this value for each record to be able to use it as a grouping value. If you always use this same expression for grouping, it might be useful to have a column with the precalculated value. 2) there's only one table involved, so only one index will be used. If you had an index that contains month_day, bucket and src it would speed up things perhaps. Furthermore, it depends on the speed of the disks, your configuration, memory configuration and use how fast it will go. But explain indicates that MySQL needs to copy the data into a temporary table and use filesort to order and group things. Maybe it will help to surpress the automatic sorting that is done by MySQL because of the GROUP BY, by adding ORDER BY NULL? Regards, Jigal. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SLOW 22million rows, 5 hour query?
On 23 Apr 2004, at 9:48 pm, [EMAIL PROTECTED] wrote: Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort The filesort is a giveaway. Can you increase the sort buffer size so that the sort can happen in memory rather than having to use a file to sort? Tim -- Dr Tim Cutts Informatics Systems Group Wellcome Trust Sanger Institute Hinxton, Cambridge, CB10 1SA, UK -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: SLOW 22million rows, 5 hour query?
Use insert delayed, and you will cut your time in half. At least with my experience. But also how long does it actually take to run the query itself. Giving a summary explain doesn't help much. You really need a table structure that the select is using and a full explain. Donny -Original Message- From: Tim Cutts [mailto:[EMAIL PROTECTED] Sent: Saturday, April 24, 2004 6:02 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: SLOW 22million rows, 5 hour query? On 23 Apr 2004, at 9:48 pm, [EMAIL PROTECTED] wrote: Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort The filesort is a giveaway. Can you increase the sort buffer size so that the sort can happen in memory rather than having to use a file to sort? Tim -- Dr Tim Cutts Informatics Systems Group Wellcome Trust Sanger Institute Hinxton, Cambridge, CB10 1SA, UK -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SLOW 22million rows, 5 hour query?
I must have done a poor job at explaining this. I already have all those primary keys, etc. If you are required to do a full table scan on the table, as I am in my example, please explain to me how an index even matters in that case. There is _no_ where clause in the query so how will an index help? :) Now I have been benchmarking creating a CKEY combined key column that just basically is a concat(month_day, ',', bucket, ',', src). I made that a char(44) column and make that the primary key. Things seem faster due to use only one row for the primary key instead of 3, as well as a less complex group by. Preliminary results show the CKEY to be 50% faster on the particular query I'm using. 1) Yes, it does have to calculate floor(bucket / 3) but that is fairly in expensive call on a per-row basis, one would think. 2) I already have that index. It's a Primary Key(month_day, src, bucket). I'm still confused on how the index will speed it up on the source table side. We are NOT I/O bound. Looks more like cpu bound to me. Mysql uses 25% cpu on the solaris which is 1 entire cpu on a 4 processor machine. I'll give the order by NULL a shot Cliff Jigal van Hemert [EMAIL PROTECTED] writes: insert into new_table select month_day, floor(bucket/3) as bucket, date, src, avg(value) as value from source_table group by month_day, bucket, src; Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort Query OK, 11495208 rows affected (4 hours 47 min 21.01 sec) IMHO two things slow you down: 1) floor(bucket/3) as bucket. This means that MySQL has to calculate this value for each record to be able to use it as a grouping value. If you always use this same expression for grouping, it might be useful to have a column with the precalculated value. 2) there's only one table involved, so only one index will be used. If you had an index that contains month_day, bucket and src it would speed up things perhaps. Furthermore, it depends on the speed of the disks, your configuration, memory configuration and use how fast it will go. But explain indicates that MySQL needs to copy the data into a temporary table and use filesort to order and group things. Maybe it will help to surpress the automatic sorting that is done by MySQL because of the GROUP BY, by adding ORDER BY NULL? Regards, Jigal. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: SLOW 22million rows, 5 hour query?
http://dev.mysql.com/doc/mysql/en/INSERT_SELECT.html Do not use DELAYED with INSERT ... SELECT. With respect to the table structure...can you explain how when you have to read every single row regardless how the structure (assuming you are going down the path of idexes) affects the query? Cliff Donny Simonton [EMAIL PROTECTED] writes: Use insert delayed, and you will cut your time in half. At least with my experience. But also how long does it actually take to run the query itself. Giving a summary explain doesn't help much. You really need a table structure that the select is using and a full explain. Donny -Original Message- From: Tim Cutts [mailto:[EMAIL PROTECTED] Sent: Saturday, April 24, 2004 6:02 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: SLOW 22million rows, 5 hour query? On 23 Apr 2004, at 9:48 pm, [EMAIL PROTECTED] wrote: Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort The filesort is a giveaway. Can you increase the sort buffer size so that the sort can happen in memory rather than having to use a file to sort? Tim -- Dr Tim Cutts Informatics Systems Group Wellcome Trust Sanger Institute Hinxton, Cambridge, CB10 1SA, UK -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: SLOW 22million rows, 5 hour query?
Cliff, still no explain still not table structure. Until that happens enjoy the 5 hour club. Donny -Original Message- From: Cliff Daniel [mailto:[EMAIL PROTECTED] Sent: Saturday, April 24, 2004 6:41 PM To: Donny Simonton Cc: 'Tim Cutts'; [EMAIL PROTECTED] Subject: Re: SLOW 22million rows, 5 hour query? http://dev.mysql.com/doc/mysql/en/INSERT_SELECT.html Do not use DELAYED with INSERT ... SELECT. With respect to the table structure...can you explain how when you have to read every single row regardless how the structure (assuming you are going down the path of idexes) affects the query? Cliff Donny Simonton [EMAIL PROTECTED] writes: Use insert delayed, and you will cut your time in half. At least with my experience. But also how long does it actually take to run the query itself. Giving a summary explain doesn't help much. You really need a table structure that the select is using and a full explain. Donny -Original Message- From: Tim Cutts [mailto:[EMAIL PROTECTED] Sent: Saturday, April 24, 2004 6:02 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: SLOW 22million rows, 5 hour query? On 23 Apr 2004, at 9:48 pm, [EMAIL PROTECTED] wrote: Relevant `explain` details: Full table scan: 22,371,273 rows, Using temporary; Using filesort The filesort is a giveaway. Can you increase the sort buffer size so that the sort can happen in memory rather than having to use a file to sort? Tim -- Dr Tim Cutts Informatics Systems Group Wellcome Trust Sanger Institute Hinxton, Cambridge, CB10 1SA, UK -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]