Re: How to optimize a slow query?

2009-09-07 Thread Jia Chen

Hi Mike,

Thanks for you help!

Best,
Jia

mos wrote:

Jia,
 The code you sent seems to be able to get the job done.

You could try something simpler by executing 2 sql statements instead 
of using one. Something like:


create table rmpdata1
select ri.*, mv.* from  RItime as ri left join MVtime as mv on 
(ri.code=mv.code and ri.ndate=mv.ndate));


insert into rmpdata1
select ri.*, mv.* from  MVtime as mv left join RItime as ri  on 
(ri.code=mv.code and ri.ndate=mv.ndate)) where ri.code is null and 
ri.date is null;


You will of course have to play with the column list to avoid 
duplicate columns. The Insert statement will insert rows from mv that 
are missing in ri.


Mike

At 01:51 PM 9/6/2009, Jia Chen wrote:

Hi Mike,

Thanks for your detailed answer. Now, I understand what you mean. 
And, yes, I agree with you that keeping all data in one table works 
better for a bunch of 1:1 relationship tables.  Actually, this is 
what I was trying to do with that query.


Since you mention They all had a 1:1 relationship and occasionally 
some of the tables did not have a corresponding row. and From then 
on I've merged all 8 tables into one and if any of the subordinate 
table data isn't available for a row, its columns are set to NULL, I 
do want to ask you about how you set the columns to NULL for rows in 
subordinate table data unavailable because I have similar situation.


If I want to combine two tables into one, I think that a full outer 
join can achieve what you did. However, MySQL does not have full 
outer join. So, I am using


create table rmpdata1
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri left outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate))
union
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri right outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate));

This query takes more than twice as much time as the query in my 
first e-mail. Do you have a better way? Thanks.


Best,
Jia

mos wrote:

Jia,

Yes, it is a 1:1 relationship between table RItime and MVtime. 
However, I don't get your suggestion, I'd recommend joining the 
two tables into 1 table so you don't have to join them in the first 
place.  Could you elaborate that?


Sure but first I have to relate it to my own experience. I had 8 
tables of around 25 million rows each. They all had a 1:1 
relationship and occasionally some of the tables did not have a 
corresponding row. I felt it was better from a design point of view 
to have 8 different tables and do the joins on the tables that I 
needed for each of my particular queries. I'd be joining anywhere 
from 2 to 5 or 6 or even all 8 tables at a time, using a where 
clause to select 15k rows at a time. This is the way to do it from a 
normalized point of view. All of the information is in its 
respective table and only assemble the tables for each particular 
query.


Well, this was slwww! A heck of a lot of work was done to join 
the tables together on a 2 column key (like yours). I also had to 
run maintenance on the tables to see which tables where corrupted or 
were missing rows that should have been there. The tables also 
repeated columns from the other tables like date and product_id that 
is used to help identify each row. Well to make a long story short, 
it was far too much effort to juggle the relationships between all 
of these tables.


Then a colleague made the monumental announcement by saying I've 
never found the need to use more than 1 table when there was a 1:1 
relationship. There is a tremendous speed cost involved in piecing 
the data back together. I put all of the data into 1 table.  So the 
light went on for me. From then on I've merged all 8 tables into one 
and if any of the subordinate table data isn't available for a row, 
its columns are set to NULL, which is the values they would have had 
anyway after a left join.


I am perfectly happy know with one wide table with over 100 columns. 
Everything is in its place and maintenance is a dream. Queries are 
also quite fast because all of the information is under one table 
and not 8. I don't have to worry about optimizing the indexes for 
the table joins because there aren't any joins between these tables 
because it is all in 1 row.


So you really have to ask yourself, why spend  10 minutes each time 
your query is run? Instead you eliminate the query altogether by 
keeping the data of the 2 tables into 1 table in the first place.


Mike



At 09:45 AM 9/6/2009, Jia Chen wrote:

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem 
to be too long. I run the same join by using SQL procedure in a 
statistical software called SAS on a similar machine. It only takes 
1 minute and 3 seconds.


Yes, it is a 1:1 relationship between table RItime and 

Re: How to optimize a slow query?

2009-09-06 Thread Jia Chen

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem to 
be too long. I run the same join by using SQL procedure in a statistical 
software called SAS on a similar machine. It only takes 1 minute and 3 
seconds.


Yes, it is a 1:1 relationship between table RItime and MVtime. However, 
I don't get your suggestion, I'd recommend joining the two tables into 
1 table so you don't have to join them in the first place.  Could you 
elaborate that?


I was using B-tree index. Switching to HASH does help. Now, it takes 4 
min 50.17 sec to run the query. 


I also turn on profiling by using
mysql set profiling = 1;
Query OK, 0 rows affected (0.01 sec)

After the query finishes, I get
mysql show profile;
+--++
| Status   | Duration   |
+--++
| starting |   0.000123 |
| checking permissions |   0.10 |
| Opening tables   |   0.44 |
| System lock  |   0.07 |
| Table lock   |   0.11 |
| init |   0.83 |
| creating table   |   0.003428 |
| After create |   0.000124 |
| System lock  |   0.04 |
| Table lock   |   0.51 |
| optimizing   |   0.07 |
| statistics   |   0.33 |
| preparing|   0.20 |
| executing|   0.04 |
| Sending data | 290.153530 |
| end  |   0.08 |
| end  |   0.04 |
| query end|   0.03 |
| freeing items|   0.10 |
| closing tables   |   0.25 |
| logging slow query   |   0.01 |
| logging slow query   |   0.013429 |
| cleaning up  |   0.04 |
+--++
23 rows in set (0.02 sec)

MySQL spends most of its time sending data. According to 
http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html, 
sending data means that the thread is processing rows for a |SELECT| 
http://dev.mysql.com/doc/refman/5.0/en/select.html statement and also 
is sending data to the client.  Is there more room to optimize this 
query?  Thanks again.


Best,
Jia

mos wrote:
How many rows were added to rmpdata1 table? If it is 13.4 million rows 
then it is going to take several minutes to join this many rows from 
the 2 tables.
Is there a 1:1 relationship between the two tables or a 1:Many? If 
there is a 1:1 then I'd recommend joining the two tables into 1 table 
so you don't have to join them in the first place.
The only other thing I can suggest is to change the type of index on 
the tables being joined to see if that makes a speed difference. For 
example, if you are using BTREE then switch to HASH or vice versa. See 
http://dev.mysql.com/doc/refman/5.0/en/create-index.html for more info.


Mike

At 10:05 AM 9/5/2009, Jia Chen wrote:

Hi there,

One simple query took more than 10 minutes.  Here is how relevant 
rows in the slow query log looks like:


# Time: 090905 10:49:57
# u...@host: root[root] @ localhost []
# Query_time: 649  Lock_time: 0  Rows_sent: 0  Rows_examined: 26758561
use world;
create table rmpdata1 select ri.*,
mv.MV, coalesce(ri.code,mv.code) as ccode,
coalesce(ri.ndate,mv.ndate) as cndate
from RItime as ri left outer join
MVtime as mv
on (ri.code=mv.code and
ri.ndate=mv.ndate);

When I explain only the select clause, I get
+--+---+
| id | select_type | table | type   | possible_keys | key | 
key_len | ref  | rows | Extra |
++-+---++---+-+-+--+--+---+ 


|  1 | SIMPLE  | ri| ALL| NULL  | NULL| NULL
| NULL | 13419851 |   |
|  1 | SIMPLE  | mv| eq_ref | PRIMARY   | PRIMARY | 11
| world.ri.code,world.ri.ndate |1 |   |
++-+---++---+-+-+--+--+---+ 


2 rows in set (0.00 sec)

I use show table status from world; to get information about two 
tables, RItime and MVtime, in the join clause:

  Name: RItime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13419851
Avg_row_length: 31
   Data_length: 427721848
Max_data_length: 281474976710655
  Index_length: 347497472
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 10:17:57
   Update_time: 2009-09-03 12:04:02
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:
*** 2. row ***
  Name: MVtime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13562373
Avg_row_length: 31
   Data_length: 430220056
Max_data_length: 281474976710655
  Index_length: 350996480
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 13:31:33
   Update_time: 2009-09-03 

Re: How to optimize a slow query?

2009-09-06 Thread mos

Jia,

Yes, it is a 1:1 relationship between table RItime and MVtime. However, I 
don't get your suggestion, I'd recommend joining the two tables into 1 
table so you don't have to join them in the first place.  Could you 
elaborate that?


Sure but first I have to relate it to my own experience. I had 8 tables of 
around 25 million rows each. They all had a 1:1 relationship and 
occasionally some of the tables did not have a corresponding row. I felt it 
was better from a design point of view to have 8 different tables and do 
the joins on the tables that I needed for each of my particular queries. 
I'd be joining anywhere from 2 to 5 or 6 or even all 8 tables at a time, 
using a where clause to select 15k rows at a time. This is the way to do it 
from a normalized point of view. All of the information is in its 
respective table and only assemble the tables for each particular query.


Well, this was slwww! A heck of a lot of work was done to join the 
tables together on a 2 column key (like yours). I also had to run 
maintenance on the tables to see which tables where corrupted or were 
missing rows that should have been there. The tables also repeated columns 
from the other tables like date and product_id that is used to help 
identify each row. Well to make a long story short, it was far too much 
effort to juggle the relationships between all of these tables.


Then a colleague made the monumental announcement by saying I've never 
found the need to use more than 1 table when there was a 1:1 relationship. 
There is a tremendous speed cost involved in piecing the data back 
together. I put all of the data into 1 table.  So the light went on for 
me. From then on I've merged all 8 tables into one and if any of the 
subordinate table data isn't available for a row, its columns are set to 
NULL, which is the values they would have had anyway after a left join.


I am perfectly happy know with one wide table with over 100 columns. 
Everything is in its place and maintenance is a dream. Queries are also 
quite fast because all of the information is under one table and not 8. I 
don't have to worry about optimizing the indexes for the table joins 
because there aren't any joins between these tables because it is all in 1 row.


So you really have to ask yourself, why spend  10 minutes each time your 
query is run? Instead you eliminate the query altogether by keeping the 
data of the 2 tables into 1 table in the first place.


Mike



At 09:45 AM 9/6/2009, Jia Chen wrote:

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem to be 
too long. I run the same join by using SQL procedure in a statistical 
software called SAS on a similar machine. It only takes 1 minute and 3 seconds.


Yes, it is a 1:1 relationship between table RItime and MVtime. However, I 
don't get your suggestion, I'd recommend joining the two tables into 1 
table so you don't have to join them in the first place.  Could you 
elaborate that?


I was using B-tree index. Switching to HASH does help. Now, it takes 4 min 
50.17 sec to run the query.

I also turn on profiling by using
mysql set profiling = 1;
Query OK, 0 rows affected (0.01 sec)

After the query finishes, I get
mysql show profile;
+--++
| Status   | Duration   |
+--++
| starting |   0.000123 |
| checking permissions |   0.10 |
| Opening tables   |   0.44 |
| System lock  |   0.07 |
| Table lock   |   0.11 |
| init |   0.83 |
| creating table   |   0.003428 |
| After create |   0.000124 |
| System lock  |   0.04 |
| Table lock   |   0.51 |
| optimizing   |   0.07 |
| statistics   |   0.33 |
| preparing|   0.20 |
| executing|   0.04 |
| Sending data | 290.153530 |
| end  |   0.08 |
| end  |   0.04 |
| query end|   0.03 |
| freeing items|   0.10 |
| closing tables   |   0.25 |
| logging slow query   |   0.01 |
| logging slow query   |   0.013429 |
| cleaning up  |   0.04 |
+--++
23 rows in set (0.02 sec)

MySQL spends most of its time sending data. According to 
http://dev.mysql.com/doc/refman/5.0/en/general-thread-states.html, sending 
data means that the thread is processing rows for a |SELECT| 
http://dev.mysql.com/doc/refman/5.0/en/select.html statement and also is 
sending data to the client.  Is there more room to optimize this 
query?  Thanks again.


Best,
Jia

mos wrote:
How many rows were added to rmpdata1 table? If it is 13.4 million rows 
then it is going to take several minutes to join this many rows from the 
2 tables.
Is there a 1:1 relationship between the two tables or a 1:Many? If there 
is a 1:1 then I'd recommend joining the two tables into 1 table 

Re: How to optimize a slow query?

2009-09-06 Thread Jia Chen

Hi Mike,

Thanks for your detailed answer. Now, I understand what you mean. And, 
yes, I agree with you that keeping all data in one table works better 
for a bunch of 1:1 relationship tables.  Actually, this is what I was 
trying to do with that query.


Since you mention They all had a 1:1 relationship and occasionally some 
of the tables did not have a corresponding row. and From then on I've 
merged all 8 tables into one and if any of the subordinate table data 
isn't available for a row, its columns are set to NULL, I do want to 
ask you about how you set the columns to NULL for rows in subordinate 
table data unavailable because I have similar situation.


If I want to combine two tables into one, I think that a full outer join 
can achieve what you did. However, MySQL does not have full outer join. 
So, I am using


create table rmpdata1
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri left outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate))
union
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri right outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate));

This query takes more than twice as much time as the query in my first 
e-mail. Do you have a better way? Thanks.


Best,
Jia

mos wrote:

Jia,

Yes, it is a 1:1 relationship between table RItime and MVtime. 
However, I don't get your suggestion, I'd recommend joining the two 
tables into 1 table so you don't have to join them in the first 
place.  Could you elaborate that?


Sure but first I have to relate it to my own experience. I had 8 
tables of around 25 million rows each. They all had a 1:1 relationship 
and occasionally some of the tables did not have a corresponding row. 
I felt it was better from a design point of view to have 8 different 
tables and do the joins on the tables that I needed for each of my 
particular queries. I'd be joining anywhere from 2 to 5 or 6 or even 
all 8 tables at a time, using a where clause to select 15k rows at a 
time. This is the way to do it from a normalized point of view. All of 
the information is in its respective table and only assemble the 
tables for each particular query.


Well, this was slwww! A heck of a lot of work was done to join the 
tables together on a 2 column key (like yours). I also had to run 
maintenance on the tables to see which tables where corrupted or were 
missing rows that should have been there. The tables also repeated 
columns from the other tables like date and product_id that is used to 
help identify each row. Well to make a long story short, it was far 
too much effort to juggle the relationships between all of these tables.


Then a colleague made the monumental announcement by saying I've 
never found the need to use more than 1 table when there was a 1:1 
relationship. There is a tremendous speed cost involved in piecing the 
data back together. I put all of the data into 1 table.  So the light 
went on for me. From then on I've merged all 8 tables into one and if 
any of the subordinate table data isn't available for a row, its 
columns are set to NULL, which is the values they would have had 
anyway after a left join.


I am perfectly happy know with one wide table with over 100 columns. 
Everything is in its place and maintenance is a dream. Queries are 
also quite fast because all of the information is under one table and 
not 8. I don't have to worry about optimizing the indexes for the 
table joins because there aren't any joins between these tables 
because it is all in 1 row.


So you really have to ask yourself, why spend  10 minutes each time 
your query is run? Instead you eliminate the query altogether by 
keeping the data of the 2 tables into 1 table in the first place.


Mike



At 09:45 AM 9/6/2009, Jia Chen wrote:

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem 
to be too long. I run the same join by using SQL procedure in a 
statistical software called SAS on a similar machine. It only takes 1 
minute and 3 seconds.


Yes, it is a 1:1 relationship between table RItime and MVtime. 
However, I don't get your suggestion, I'd recommend joining the two 
tables into 1 table so you don't have to join them in the first 
place.  Could you elaborate that?


I was using B-tree index. Switching to HASH does help. Now, it takes 
4 min 50.17 sec to run the query.

I also turn on profiling by using
mysql set profiling = 1;
Query OK, 0 rows affected (0.01 sec)

After the query finishes, I get
mysql show profile;
+--++
| Status   | Duration   |
+--++
| starting |   0.000123 |
| checking permissions |   0.10 |
| Opening tables   |   0.44 |
| System lock  |   0.07 |
| Table lock   |   0.11 

Re: How to optimize a slow query?

2009-09-06 Thread mos

Jia,
 The code you sent seems to be able to get the job done.

You could try something simpler by executing 2 sql statements instead of 
using one. Something like:


create table rmpdata1
select ri.*, mv.* from  RItime as ri left join MVtime as mv on 
(ri.code=mv.code and ri.ndate=mv.ndate));


insert into rmpdata1
select ri.*, mv.* from  MVtime as mv left join RItime as ri  on 
(ri.code=mv.code and ri.ndate=mv.ndate)) where ri.code is null and ri.date 
is null;


You will of course have to play with the column list to avoid duplicate 
columns. The Insert statement will insert rows from mv that are missing in ri.


Mike

At 01:51 PM 9/6/2009, Jia Chen wrote:

Hi Mike,

Thanks for your detailed answer. Now, I understand what you mean. And, 
yes, I agree with you that keeping all data in one table works better for 
a bunch of 1:1 relationship tables.  Actually, this is what I was trying 
to do with that query.


Since you mention They all had a 1:1 relationship and occasionally some 
of the tables did not have a corresponding row. and From then on I've 
merged all 8 tables into one and if any of the subordinate table data 
isn't available for a row, its columns are set to NULL, I do want to ask 
you about how you set the columns to NULL for rows in subordinate table 
data unavailable because I have similar situation.


If I want to combine two tables into one, I think that a full outer join 
can achieve what you did. However, MySQL does not have full outer join. 
So, I am using


create table rmpdata1
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri left outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate))
union
(select ri.*, mv.MV, coalesce(ri.code,mv.code) as ccode,
  coalesce(ri.ndate,mv.ndate) as cndate
from
   RItime as ri right outer join MVtime as mv
on
   (ri.code=mv.code and ri.ndate=mv.ndate));

This query takes more than twice as much time as the query in my first 
e-mail. Do you have a better way? Thanks.


Best,
Jia

mos wrote:

Jia,

Yes, it is a 1:1 relationship between table RItime and MVtime. However, 
I don't get your suggestion, I'd recommend joining the two tables into 
1 table so you don't have to join them in the first place.  Could you 
elaborate that?


Sure but first I have to relate it to my own experience. I had 8 tables 
of around 25 million rows each. They all had a 1:1 relationship and 
occasionally some of the tables did not have a corresponding row. I felt 
it was better from a design point of view to have 8 different tables and 
do the joins on the tables that I needed for each of my particular 
queries. I'd be joining anywhere from 2 to 5 or 6 or even all 8 tables at 
a time, using a where clause to select 15k rows at a time. This is the 
way to do it from a normalized point of view. All of the information is 
in its respective table and only assemble the tables for each particular query.


Well, this was slwww! A heck of a lot of work was done to join the 
tables together on a 2 column key (like yours). I also had to run 
maintenance on the tables to see which tables where corrupted or were 
missing rows that should have been there. The tables also repeated 
columns from the other tables like date and product_id that is used to 
help identify each row. Well to make a long story short, it was far too 
much effort to juggle the relationships between all of these tables.


Then a colleague made the monumental announcement by saying I've never 
found the need to use more than 1 table when there was a 1:1 
relationship. There is a tremendous speed cost involved in piecing the 
data back together. I put all of the data into 1 table.  So the light 
went on for me. From then on I've merged all 8 tables into one and if any 
of the subordinate table data isn't available for a row, its columns are 
set to NULL, which is the values they would have had anyway after a left join.


I am perfectly happy know with one wide table with over 100 columns. 
Everything is in its place and maintenance is a dream. Queries are also 
quite fast because all of the information is under one table and not 8. I 
don't have to worry about optimizing the indexes for the table joins 
because there aren't any joins between these tables because it is all in 1 row.


So you really have to ask yourself, why spend  10 minutes each time your 
query is run? Instead you eliminate the query altogether by keeping the 
data of the 2 tables into 1 table in the first place.


Mike



At 09:45 AM 9/6/2009, Jia Chen wrote:

Thanks for your reply, Mike.

Yes, 13419851 rows were added to rmpdata1. However, 10 minutes seem to 
be too long. I run the same join by using SQL procedure in a statistical 
software called SAS on a similar machine. It only takes 1 minute and 3 seconds.


Yes, it is a 1:1 relationship between table RItime and MVtime. However, 
I don't get your suggestion, I'd recommend 

How to optimize a slow query?

2009-09-05 Thread Jia Chen

Hi there,

One simple query took more than 10 minutes.  Here is how relevant rows 
in the slow query log looks like:


# Time: 090905 10:49:57
# u...@host: root[root] @ localhost []
# Query_time: 649  Lock_time: 0  Rows_sent: 0  Rows_examined: 26758561
use world;
create table rmpdata1 select ri.*,
mv.MV, coalesce(ri.code,mv.code) as ccode,
coalesce(ri.ndate,mv.ndate) as cndate
from RItime as ri left outer join
MVtime as mv
on (ri.code=mv.code and
ri.ndate=mv.ndate);

When I explain only the select clause, I get
+--+---+
| id | select_type | table | type   | possible_keys | key | key_len 
| ref  | rows | Extra |

++-+---++---+-+-+--+--+---+
|  1 | SIMPLE  | ri| ALL| NULL  | NULL| NULL
| NULL | 13419851 |   |
|  1 | SIMPLE  | mv| eq_ref | PRIMARY   | PRIMARY | 11  
| world.ri.code,world.ri.ndate |1 |   |

++-+---++---+-+-+--+--+---+
2 rows in set (0.00 sec)

I use show table status from world; to get information about two 
tables, RItime and MVtime, in the join clause:

  Name: RItime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13419851
Avg_row_length: 31
   Data_length: 427721848
Max_data_length: 281474976710655
  Index_length: 347497472
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 10:17:57
   Update_time: 2009-09-03 12:04:02
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:
*** 2. row ***
  Name: MVtime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13562373
Avg_row_length: 31
   Data_length: 430220056
Max_data_length: 281474976710655
  Index_length: 350996480
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 13:31:33
   Update_time: 2009-09-03 13:43:51
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:

I also describe these two tables:
mysql desc RItime;
+---++--+-++---+
| Field | Type   | Null | Key | Default| Extra |
+---++--+-++---+
| code  | varchar(6) | NO   | PRI ||   |
| ndate | date   | NO   | PRI | -00-00 |   |
| ri| double | YES  | | NULL   |   |
| time  | date   | YES  | | NULL   |   |
| bdate | date   | YES  | | NULL   |   |
+---++--+-++---+
5 rows in set (0.00 sec)

mysql desc MVtime;
+---++--+-++---+
| Field | Type   | Null | Key | Default| Extra |
+---++--+-++---+
| code  | varchar(6) | NO   | PRI ||   |
| ndate | date   | NO   | PRI | -00-00 |   |
| MV| double | YES  | | NULL   |   |
| time  | date   | YES  | | NULL   |   |
| bdate | date   | YES  | | NULL   |   |
+---++--+-++---+
5 rows in set (0.00 sec)

Could you give me some hint on how to improve the speed of this query?  
Thanks.


Best,
Jia




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: How to optimize a slow query?

2009-09-05 Thread mos
How many rows were added to rmpdata1 table? If it is 13.4 million rows then 
it is going to take several minutes to join this many rows from the 2 tables.
Is there a 1:1 relationship between the two tables or a 1:Many? If there is 
a 1:1 then I'd recommend joining the two tables into 1 table so you don't 
have to join them in the first place.
The only other thing I can suggest is to change the type of index on the 
tables being joined to see if that makes a speed difference. For example, 
if you are using BTREE then switch to HASH or vice versa. See 
http://dev.mysql.com/doc/refman/5.0/en/create-index.html for more info.


Mike

At 10:05 AM 9/5/2009, Jia Chen wrote:

Hi there,

One simple query took more than 10 minutes.  Here is how relevant rows in 
the slow query log looks like:


# Time: 090905 10:49:57
# u...@host: root[root] @ localhost []
# Query_time: 649  Lock_time: 0  Rows_sent: 0  Rows_examined: 26758561
use world;
create table rmpdata1 select ri.*,
mv.MV, coalesce(ri.code,mv.code) as ccode,
coalesce(ri.ndate,mv.ndate) as cndate
from RItime as ri left outer join
MVtime as mv
on (ri.code=mv.code and
ri.ndate=mv.ndate);

When I explain only the select clause, I get
+--+---+
| id | select_type | table | type   | possible_keys | key | key_len | 
ref  | rows | Extra |

++-+---++---+-+-+--+--+---+
|  1 | SIMPLE  | ri| ALL| NULL  | NULL| NULL
| NULL | 13419851 |   |
|  1 | SIMPLE  | mv| eq_ref | PRIMARY   | PRIMARY | 11
| world.ri.code,world.ri.ndate |1 |   |
++-+---++---+-+-+--+--+---+
2 rows in set (0.00 sec)

I use show table status from world; to get information about two tables, 
RItime and MVtime, in the join clause:

  Name: RItime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13419851
Avg_row_length: 31
   Data_length: 427721848
Max_data_length: 281474976710655
  Index_length: 347497472
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 10:17:57
   Update_time: 2009-09-03 12:04:02
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:
*** 2. row ***
  Name: MVtime
Engine: MyISAM
   Version: 10
Row_format: Dynamic
  Rows: 13562373
Avg_row_length: 31
   Data_length: 430220056
Max_data_length: 281474976710655
  Index_length: 350996480
 Data_free: 0
Auto_increment: NULL
   Create_time: 2009-09-03 13:31:33
   Update_time: 2009-09-03 13:43:51
Check_time: NULL
 Collation: latin1_swedish_ci
  Checksum: NULL
Create_options:
   Comment:

I also describe these two tables:
mysql desc RItime;
+---++--+-++---+
| Field | Type   | Null | Key | Default| Extra |
+---++--+-++---+
| code  | varchar(6) | NO   | PRI ||   |
| ndate | date   | NO   | PRI | -00-00 |   |
| ri| double | YES  | | NULL   |   |
| time  | date   | YES  | | NULL   |   |
| bdate | date   | YES  | | NULL   |   |
+---++--+-++---+
5 rows in set (0.00 sec)

mysql desc MVtime;
+---++--+-++---+
| Field | Type   | Null | Key | Default| Extra |
+---++--+-++---+
| code  | varchar(6) | NO   | PRI ||   |
| ndate | date   | NO   | PRI | -00-00 |   |
| MV| double | YES  | | NULL   |   |
| time  | date   | YES  | | NULL   |   |
| bdate | date   | YES  | | NULL   |   |
+---++--+-++---+
5 rows in set (0.00 sec)

Could you give me some hint on how to improve the speed of this query?
Thanks.

Best,
Jia




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=mo...@fastmail.fm



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org