memory issue: paging, disk activity, table size - was: optimizing a select statement
Thanks for the information. Before I try to further look into the query, I would like to know if there is some memory issue. Its strange since the speed for executing same query differs. Its a Mac G5 with just 1GB ram. I could see lot of pageouts in the top command. The activity monitor shows that there is little free memory (12 -14M) and ~650M of inactive memory and ~250M of active memory. Read on web that the less free memory is the inactive memory will be used. When I run a query (like the one we had discussed) there is only a slight difference in this memory status, where as the disk activity shows active 'data in', 'read in' . Does this means that its out of physical memory and uses swap file? I am not sure how mysql uses memory. Couple of tables I read in the query is more than 4GB but the query uses only two column of those tables and it will not be more than 1 G for sure. I am not sure if I understood correct or not and would like to hear your suggestion. Is there any relation between the table size (no of rows and file size) and the system memory needed to get better performance? i read on web that its better to have ram more than the largest table size. is this true even if the query uses only few columns of big tables? Thanks for your help Karthik. At 07:44 PM 2/29/2004, you wrote: So if this is your query and based on the explain this is what I would recommend. SELECT Distinct (a.id) From table1 a INNER JOIN table1 b USING ( p_id ) INNER JOIN table2 c USING ( p_id ) INNER JOIN table3 d USING ( out_id ) INNER JOIN table4 e ON ( d.name_id = e.name_id ) INNER JOIN table4 f ON ( e.start_id BETWEEN f.left_id AND f.end_id ) WHERE (f.name_id =45 OR f.name_id =56) AND b.id =275 AND a.id != b.id For some reason you are getting the using temporary on table b, I assume based on the explain that you have an index called (id) that only has the id in it. Here's the first problem. You are limiting b, by id. And joining on p_id. So mysql is trying to use the index with the combination of both of them which is combine I assume. So removing distince really won't help with this one much, since your indexes will always have a little problems. INNER JOIN table4 f ON ( e.start_id BETWEEN f.left_id AND f.end_id ) This is probably the second biggest problem, you will always get a range. And ranges will always be slower. Don't really have a solution without actually touching the data. If you really have more than 50 million records this is really bad. a.id != b.id I would recommend trying to rewrite your query and just focus on table a and f. If you can get rid of them returning the extra 2 and 3 rows, I think that would solve your problem. Donny -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: optimizing a select statement over a database with 50 million recs
here is the updated select statement i came with but still needs improvement SELECT Distinct (a.id) From table1 a INNER JOIN table1 b USING ( p_id ) INNER JOIN table2 c USING ( p_id ) INNER JOIN table3 d USING ( out_id ) INNER JOIN table4 e ON ( d.name_id = e.name_id ) INNER JOIN table4 f ON ( e.start_id BETWEEN f.left_id AND f.end_id ) WHERE (f.name_id =45 OR f.name_id =56) AND b.id =275 AND a.id != b.id This reduced the time from 0.5 secs to 0.3 secs but still i am looking for some major improvemnts. I am using this query in PHP and its been repeated several times to display a list. Any advice on this will be really useful. Is there anything I could do with the mysql and php configuration to increase the performance? The server has just 1GB ram, will adding more memory help? Thanks Karthik. Need help on optimizing the select statement: Table structure table1 -- id p_id table2 -- p_id out_id table3 -- out_id name_id table4 --- name_id (unique) prev_id start_id (unique) end_id (unique) Only table4 has unique fields all other fields are not unique. The following select statement does what is required but the tables are really huge ( 50 million records) so need to be optimized SELECT DISTINCT (a.id) FROM table1 a, table1 b, table2 c, table3 d, table4 e, table4 f WHERE a.id =275 AND a.p_id = b.p_id AND a.id != b.id AND a.p_id = c.p_id AND c.out_id = d.out_id AND d.name_id = e.name_id AND e.start_id = f.start_id AND e.end_id = f.end_id AND e.end_id !=0 AND (f.name_id =45 OR f.name_id =56) GROUP BY b.id The explain for the above statement is a | ref | id,p_id,combine | combine | 4 | const | 5281 | Using where; Using index; Using temporary; Using f... c | eq_ref | PRIMARY,p_id | PRIMARY | 4 | a.p_id | 1 | | | d | ref | name_id,out_id | out_id | 4 | b.out_id | 1 | | | b | ref | p_id | p_id | 4 | a.p_id | 3 | Using where | e | range | PRIMARY,start_id,end_id,combine,name_id | PRIMARY | 4 | NULL | 2 | Using where | f | eq_ref | PRIMARY,start_id,end_id,combine,name_id | PRIMARY | 4 | c.name_id | 1 | Using where | I am sure there should be some better way to do this using Inner join or something similar but I am not sure how. It will be helpful if you could suggest me some improvements for this query. If you need any further explanation please let me know. Thanks for your help Karu. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: optimizing a select statement over a database with 50 million recs
Thanks for looking at my problem. here is the explain for this version of select b | ref | id,p_id,combine | combine | 4 | const | 5281 | Using where; Using index; Using temporary | c | eq_ref | PRIMARY,p_id | PRIMARY | 4 | b.p_id | 1 | | | d | ref | name_id,out_id | out_id | 4 | c.out_id | 1 | | | a | ref | p_id | p_id | 4 | b.p_id | 3 | Using where | e | eq_ref | PRIMARY,start_id,combine,name_id | PRIMARY | 4 | d.name_id | 1 | Distinct | f | range | PRIMARY,name_id | PRIMARY | 4 | NULL | 2 | Using | where; | Distinct | Karthik Can you send us a new explain on this version? Donny -Original Message- From: karthik viswanathan [mailto:[EMAIL PROTECTED] Sent: Sunday, February 29, 2004 4:12 PM To: [EMAIL PROTECTED] Subject: Re: optimizing a select statement over a database with 50 million recs here is the updated select statement i came with but still needs improvement SELECT Distinct (a.id) From table1 a INNER JOIN table1 b USING ( p_id ) INNER JOIN table2 c USING ( p_id ) INNER JOIN table3 d USING ( out_id ) INNER JOIN table4 e ON ( d.name_id = e.name_id ) INNER JOIN table4 f ON ( e.start_id BETWEEN f.left_id AND f.end_id ) WHERE (f.name_id =45 OR f.name_id =56) AND b.id =275 AND a.id != b.id This reduced the time from 0.5 secs to 0.3 secs but still i am looking for some major improvemnts. I am using this query in PHP and its been repeated several times to display a list. Any advice on this will be really useful. Is there anything I could do with the mysql and php configuration to increase the performance? The server has just 1GB ram, will adding more memory help? Thanks Karthik. Need help on optimizing the select statement: Table structure table1 -- id p_id table2 -- p_id out_id table3 -- out_id name_id table4 --- name_id (unique) prev_id start_id (unique) end_id (unique) Only table4 has unique fields all other fields are not unique. The following select statement does what is required but the tables are really huge ( 50 million records) so need to be optimized SELECT DISTINCT (a.id) FROM table1 a, table1 b, table2 c, table3 d, table4 e, table4 f WHERE a.id =275 AND a.p_id = b.p_id AND a.id != b.id AND a.p_id = c.p_id AND c.out_id = d.out_id AND d.name_id = e.name_id AND e.start_id = f.start_id AND e.end_id = f.end_id AND e.end_id !=0 AND (f.name_id =45 OR f.name_id =56) GROUP BY b.id The explain for the above statement is a | ref | id,p_id,combine | combine | 4 | const | 5281 | Using where; Using index; Using temporary; Using f... c | eq_ref | PRIMARY,p_id | PRIMARY | 4 | a.p_id | 1 | | | d | ref | name_id,out_id | out_id | 4 | b.out_id | 1 | | | b | ref | p_id | p_id | 4 | a.p_id | 3 | Using where | e | range | PRIMARY,start_id,end_id,combine,name_id | PRIMARY | 4 | NULL | 2 | Using where | f | eq_ref | PRIMARY,start_id,end_id,combine,name_id | PRIMARY | 4 | c.name_id | 1 | Using where | I am sure there should be some better way to do this using Inner join or something similar but I am not sure how. It will be helpful if you could suggest me some improvements for this query. If you need any further explanation please let me know. Thanks for your help Karu. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
RE: optimizing a select statement over a database with 50 million recs
Here is the command line explain, earlier i had used phpAdmin +-++-+--+-+--- --+--+---+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-++-+--+-+--- --+--+---+ | b | ref| id,p_id,combine | combine | 4 | const | 5281 | Using where; Using index; Using temporary | | c | eq_ref | PRIMARY,p_id| PRIMARY | 4 | b.p_id |1 | | | d | ref| name_id,out_id | out_id | 4 | c.out_id|1 | | | a | ref| p_id| p_id | 4 | b.p_id |3 | Using where | | e | eq_ref | PRIMARY,start_id,combine,taxId | PRIMARY | 4 | d.name_id |1 | Distinct | | f | range | PRIMARY,name_id | PRIMARY | 4 | NULL|2 | Using where; Distinct | +-++-+--+-+--- --+--+---+ But looking at your explain with is hard to understand as I mentioned above, it look like your rows are 5281 * 1 * 1 * 4 * 2 or 42248 rows it has to go through. I think u r right Also see if you can change WHERE (f.name_id =45 OR f.name_id =56) to use IN. It has proven to be much faster for me. But that won't speed it up .3 seconds. i tried that i dint see any difference. If I take out the distinct explain doesnt show the Using temporary but I dint see any difference in the time also. Please let me know if you need any further information Thanks Karthik -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
optimizing a select statement over a database with 50 million recs
Need help on optimizing the select statement: Table structure table1 -- id p_id table2 -- p_id out_id table3 -- out_id name_id table4 --- name_id (unique) prev_id start_id (unique) end_id (unique) Only table4 has unique fields all other fields are not unique. The following select statement does what is required but the tables are really huge ( 50 million records) so need to be optimized SELECT DISTINCT (a.id) FROM table1 a, table1 b, table2 c, table3 d, table4 e, table4 f WHERE a.id =275 AND a.p_id = b.p_id AND a.id != b.id AND a.p_id = c.p_id AND c.out_id = d.out_id AND d.name_id = e.name_id AND e.start_id = f.start_id AND e.end_id = f.end_id AND e.end_id !=0 AND (f.name_id =45 OR f.name_id =56) GROUP BY b.id The explain for the above statement is a | ref | id,p_id,combine | combine | 4 | const | 5281 | Using where; Using index; Using temporary; Using f... c | eq_ref | PRIMARY,p_id | PRIMARY | 4 | a.p_id | 1 | | | d | ref | name_id,out_id | out_id | 4 | b.out_id | 1 | | | b | ref | p_id | p_id | 4 | a.p_id | 3 | Using where | e | range | PRIMARY,start_id,end_id,combine,name_id | PRIMARY | 4 | NULL | 2 | Using where | f | eq_ref | PRIMARY,start_id,end_id,combine,name_id | PRIMARY | 4 | c.name_id | 1 | Using where | I am sure there should be some better way to do this using Inner join or something similar but I am not sure how. It will be helpful if you could suggest me some improvements for this query. If you need any further explanation please let me know. Thanks for your help Karu. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]