Re: Help with slow query
Shawn, Thanks for the great help! It still is not working. I did an EXPLAIN on this query with your amended split out join statements and got this: ++-+---+---+---++-+--++-+ | id | select_type | table | type | possible_keys | key| key_len | ref | rows | Extra | ++-+---+---+---++-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296148 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262462 | | | 1 | SIMPLE | t | ALL | NULL | NULL | NULL| NULL | 311152 | | | 1 | SIMPLE | c | ref | IdApptType| IdApptType | 51 | func | 1 | | | 1 | SIMPLE | af| ALL | NULL | NULL | NULL| NULL | 5680 | | ++-+---+---+---++-+--++-+ What I'm not catching is why it says there is no key it can use for the patient table; here is a portion of the show create: PRIMARY KEY (`zzk`), KEY `IdPatient` (`IdPatient`), KEY `SSN` (`SSN`), KEY `IdLastword` (`IdLastword`), KEY `DOB` (`DateOfBirth`), KEY `NameFirst` (`NameFirst`), KEY `NameLast` (`NameLast`) So, the IdPatient is at least a POSSIBLE key, right? On Mar 9, 2011, at 8:34 PM, Shawn Green (MySQL) wrote: Hi Jim, On 3/9/2011 17:57, Jim McNeely wrote: I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI, a.IdProcedure, a.ProcName, CAST(CONCAT(a.ApptDate, , a.ApptTimeOut) AS CHAR) ApptDateTime, a.ApptLenMin Duration, a.Room Primary_Resource_ID, 'R' as Primary_Resource_Type, t.Note, c.IdPractAffil DoctorID, af.PractName DoctorName, a.ICD9code ICD9, '??' Diagnosis_free_test from Appt_ a LEFT JOIN (patient_ p, today_ t, Copy_ c, Affil_ af) ON (a.IdPatient = p.IdPatient AND a.IdPatientDate = t.IdPatientDate AND CONCAT(a.IdAppt, '0') = c.IdApptType AND a.IdPriCarePhy = af.IdAffil) WHERE a.ApptDate= '2009-03-01'; p.IdPatient, t.IdPatientDate, c.IdApptType, and af.IdAffil are all indexed. Also I selectively took out join parameters until there was nothing but a join on the patient table, and it was still slow, but when I took that out, the query was extremely fast. What might I be doing wrong? Thanks, Jim McNeely The performance problem is with your Cartesian product. I think you meant to write: from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient LEFT JOIN today_ t ON a.IdPatientDate = t.IdPatientDate LEFT JOIN Copy_ c ON CONCAT(a.IdAppt, '0') = c.IdApptType LEFT JOIN Affil_ af ON a.IdPriCarePhy = af.IdAffil As of 5.0.12, the comma operator for table joins was demoted in the 'order of precedence' for query execution. That means that MySQL became more complaint with the SQL standard but it also means that using a comma-join instead of an explicit ANSI join can result in a Cartesian product more frequently. Try my style and compare how it works. If both styles are similarly slow, collect the EXPLAIN plan for this query and share with the list. Yours, -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=j...@newcenturydata.com
Re: Help with slow query
Rhino, Thanks for the help and time! Actually, I thought the same thing, but what's weird is that is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one the query runs virtually instantaneously. for some reason it will use the index in that case and it works. If I take out everything like this: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate = '2009-03-01'; It is still utterly slow. EXPLAIN looks like this: ++-+---+---+---+--+-+--++-+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | ++-+---+---+---+--+-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296166 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262465 | | ++-+---+---+---+--+-+--++-+ But, very good try. I thought this might be it as well. Thanks, Jim McNeely On Mar 10, 2011, at 9:05 AM, Rhino wrote: What I'm about to say may be completely out to lunch so don't be afraid to dismiss it. I'm more a DB2 guy than a MySQL guy and I'm getting rusty on both but I've always been struck by how similar the two dtabases are. Therefore, I want to offer an insight on why this query would not perform terribly well in DB2. I simply don't know if it is applicable to MySQL. In DB2, using functions on predicates (conditions in a WHERE clause), prevents DB2 from using an index to satisfy that predicate. (Or at least it used to: I'm not certain if that has been remedied in recent versions of the DB2 optimizer.) Therefore, the CONCAT() function in the line AND CONCAT(a.IdAppt, '0') = c.IdApptType would ensure that no index on the IdAppt column would be used to find the rows of the table that satisfied that condition. My suggestion is that you try rewriting that condition to avoid using CONCAT() - or any other function - and see if that helps the performance of your query. That would require modifying your data to append a zero to the end of the existing date in IdApptType column, which may or may not be a reasonable thing to do. You'll have to decide about that. Again, I could be all wet here so don't have me killed if I'm wrong about this :-) I'm just trying to help ;-) -- Rhino On 2011-03-10 11:38, Jim McNeely wrote: Shawn, Thanks for the great help! It still is not working. I did an EXPLAIN on this query with your amended split out join statements and got this: ++-+---+---+---++-+--++-+ | id | select_type | table | type | possible_keys | key| key_len | ref | rows | Extra | ++-+---+---+---++-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296148 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262462 | | | 1 | SIMPLE | t | ALL | NULL | NULL | NULL| NULL | 311152 | | | 1 | SIMPLE | c | ref | IdApptType| IdApptType | 51 | func | 1 | | | 1 | SIMPLE | af| ALL | NULL | NULL | NULL| NULL | 5680 | | ++-+---+---+---++-+--++-+ What I'm not catching is why it says there is no key it can use for the patient table; here is a portion of the show create: PRIMARY KEY (`zzk`), KEY `IdPatient` (`IdPatient`), KEY `SSN` (`SSN`), KEY `IdLastword` (`IdLastword`), KEY `DOB` (`DateOfBirth`), KEY `NameFirst` (`NameFirst`), KEY `NameLast` (`NameLast`) So, the IdPatient is at least a POSSIBLE key, right? On Mar 9, 2011, at 8:34 PM, Shawn Green (MySQL) wrote: Hi Jim, On 3/9/2011 17:57, Jim McNeely wrote: I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI, a.IdProcedure, a.ProcName, CAST(CONCAT(a.ApptDate, , a.ApptTimeOut) AS CHAR) ApptDateTime, a.ApptLenMin Duration, a.Room Primary_Resource_ID, 'R' as Primary_Resource_Type, t.Note, c.IdPractAffil DoctorID, af.PractName DoctorName, a.ICD9code ICD9, '??' Diagnosis_free_test from Appt_ a LEFT JOIN (patient_ p, today_ t, Copy_ c, Affil_ af) ON (a.IdPatient = p.IdPatient AND a.IdPatientDate = t.IdPatientDate AND CONCAT(a.IdAppt, '0') = c.IdApptType AND
Re: Help with slow query
If the optimizer chooses the wrong index, you can tell it what index to use. SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a force index(id_patient) LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate = '2009-03-01'; See http://dev.mysql.com/doc/refman/5.1/en/index-hints.html Mike At 11:32 AM 3/10/2011, Jim McNeely wrote: Rhino, Thanks for the help and time! Actually, I thought the same thing, but what's weird is that is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one the query runs virtually instantaneously. for some reason it will use the index in that case and it works. If I take out everything like this: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate = '2009-03-01'; It is still utterly slow. EXPLAIN looks like this: ++-+---+---+---+--+-+--++-+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | ++-+---+---+---+--+-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296166 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262465 | | ++-+---+---+---+--+-+--++-+ But, very good try. I thought this might be it as well. Thanks, Jim McNeely On Mar 10, 2011, at 9:05 AM, Rhino wrote: What I'm about to say may be completely out to lunch so don't be afraid to dismiss it. I'm more a DB2 guy than a MySQL guy and I'm getting rusty on both but I've always been struck by how similar the two dtabases are. Therefore, I want to offer an insight on why this query would not perform terribly well in DB2. I simply don't know if it is applicable to MySQL. In DB2, using functions on predicates (conditions in a WHERE clause), prevents DB2 from using an index to satisfy that predicate. (Or at least it used to: I'm not certain if that has been remedied in recent versions of the DB2 optimizer.) Therefore, the CONCAT() function in the line AND CONCAT(a.IdAppt, '0') = c.IdApptType would ensure that no index on the IdAppt column would be used to find the rows of the table that satisfied that condition. My suggestion is that you try rewriting that condition to avoid using CONCAT() - or any other function - and see if that helps the performance of your query. That would require modifying your data to append a zero to the end of the existing date in IdApptType column, which may or may not be a reasonable thing to do. You'll have to decide about that. Again, I could be all wet here so don't have me killed if I'm wrong about this :-) I'm just trying to help ;-) -- Rhino On 2011-03-10 11:38, Jim McNeely wrote: Shawn, Thanks for the great help! It still is not working. I did an EXPLAIN on this query with your amended split out join statements and got this: ++-+---+---+---++-+--++-+ | id | select_type | table | type | possible_keys | key| key_len | ref | rows | Extra | ++-+---+---+---++-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296148 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262462 | | | 1 | SIMPLE | t | ALL | NULL | NULL | NULL| NULL | 311152 | | | 1 | SIMPLE | c | ref | IdApptType| IdApptType | 51 | func | 1 | | | 1 | SIMPLE | af| ALL | NULL | NULL | NULL| NULL | 5680 | | ++-+---+---+---++-+--++-+ What I'm not catching is why it says there is no key it can use for the patient table; here is a portion of the show create: PRIMARY KEY (`zzk`), KEY `IdPatient` (`IdPatient`), KEY `SSN` (`SSN`), KEY `IdLastword` (`IdLastword`), KEY `DOB` (`DateOfBirth`), KEY `NameFirst` (`NameFirst`), KEY `NameLast` (`NameLast`) So, the IdPatient is at least a POSSIBLE key, right? On Mar 9, 2011, at 8:34 PM, Shawn Green (MySQL) wrote: Hi Jim, On 3/9/2011 17:57, Jim McNeely wrote: I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI, a.IdProcedure, a.ProcName, CAST(CONCAT(a.ApptDate, , a.ApptTimeOut) AS CHAR) ApptDateTime, a.ApptLenMin Duration,
Re: Help with slow query
On 3/10/2011 12:32, Jim McNeely wrote: Rhino, Thanks for the help and time! Actually, I thought the same thing, but what's weird is that is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one the query runs virtually instantaneously. for some reason it will use the index in that case and it works. If I take out everything like this: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate= '2009-03-01'; It is still utterly slow. EXPLAIN looks like this: ++-+---+---+---+--+-+--++-+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | ++-+---+---+---+--+-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296166 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262465 | | ++-+---+---+---+--+-+--++-+ But, very good try. I thought this might be it as well. ... snip ... According to this report, there are no indexes on the `patient_` table that include the column `IdPatient` as the first column. Fix that and this query should be much faster. -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Help with slow query
Shawn, This is the first thing that I though as well, but here is a portion from the show create table for patient_: PRIMARY KEY (`zzk`), KEY `IdPatient` (`IdPatient`), KEY `SSN` (`SSN`), KEY `IdLastword` (`IdLastword`), KEY `DOB` (`DateOfBirth`), KEY `NameFirst` (`NameFirst`), KEY `NameLast` (`NameLast`) This extremely simple join is still massively slow. Jim On Mar 10, 2011, at 10:00 AM, Shawn Green (MySQL) wrote: On 3/10/2011 12:32, Jim McNeely wrote: Rhino, Thanks for the help and time! Actually, I thought the same thing, but what's weird is that is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one the query runs virtually instantaneously. for some reason it will use the index in that case and it works. If I take out everything like this: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate= '2009-03-01'; It is still utterly slow. EXPLAIN looks like this: ++-+---+---+---+--+-+--++-+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | ++-+---+---+---+--+-+--++-+ | 1 | SIMPLE | a | range | apptdate | apptdate | 4 | NULL | 296166 | Using where | | 1 | SIMPLE | p | ALL | NULL | NULL | NULL| NULL | 262465 | | ++-+---+---+---+--+-+--++-+ But, very good try. I thought this might be it as well. ... snip ... According to this report, there are no indexes on the `patient_` table that include the column `IdPatient` as the first column. Fix that and this query should be much faster. -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=j...@newcenturydata.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Help with slow query
On 3/10/2011 13:12, Jim McNeely wrote: Shawn, This is the first thing that I though as well, but here is a portion from the show create table for patient_: PRIMARY KEY (`zzk`), KEY `IdPatient` (`IdPatient`), KEY `SSN` (`SSN`), KEY `IdLastword` (`IdLastword`), KEY `DOB` (`DateOfBirth`), KEY `NameFirst` (`NameFirst`), KEY `NameLast` (`NameLast`) This extremely simple join is still massively slow. Jim On Mar 10, 2011, at 10:00 AM, Shawn Green (MySQL) wrote: On 3/10/2011 12:32, Jim McNeely wrote: Rhino, Thanks for the help and time! Actually, I thought the same thing, but what's weird is that is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one the query runs virtually instantaneously. for some reason it will use the index in that case and it works. If I take out everything like this: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate= '2009-03-01'; 1) Verify that the indexes on `patient_` haven't been disabled SHOW INDEXES FROM `patient_`; http://dev.mysql.com/doc/refman/5.5/en/show-index.html 2) Verify that the data types of `Appt_`.`IdPatient` and `patient_`.`IdPatient` are not incompatible. (for example: one is varchar, the other int) Thanks, -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Help with slow query
On 3/10/11 10:46 AM, Shawn Green (MySQL) wrote: On 3/10/2011 12:32, Jim McNeely wrote: Rhino, Thanks for the help and time! Actually, I thought the same thing, but what's weird is that is the only thing that doesn't slow it down. If I take out all of the join clauses EXCEPT that one the query runs virtually instantaneously. for some reason it will use the index in that case and it works. If I take out everything like this: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient WHERE a.ApptDate= '2009-03-01'; 1) Verify that the indexes on `patient_` haven't been disabled SHOW INDEXES FROM `patient_`; http://dev.mysql.com/doc/refman/5.5/en/show-index.html 2) Verify that the data types of `Appt_`.`IdPatient` and `patient_`.`IdPatient` are not incompatible. (for example: one is varchar, the other int) This last one can be HUGE. I tracked a big performance issue to this exact problem - the columns used in the join had the same name, but different data types. Correcting to be the same type (both ints) made a terrific performance increase. -- Andy Wallace iHOUSEweb, Inc. awall...@ihouseweb.com (866) 645-7700 ext 219 -- There are two ways to build software: Make it so simple that there are obviously no bugs, or make it so complex that there are no obvious bugs. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Help with slow query
I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI, a.IdProcedure, a.ProcName, CAST(CONCAT(a.ApptDate, , a.ApptTimeOut) AS CHAR) ApptDateTime, a.ApptLenMin Duration, a.Room Primary_Resource_ID, 'R' as Primary_Resource_Type, t.Note, c.IdPractAffil DoctorID, af.PractName DoctorName, a.ICD9code ICD9, '??' Diagnosis_free_test from Appt_ a LEFT JOIN (patient_ p, today_ t, Copy_ c, Affil_ af) ON (a.IdPatient = p.IdPatient AND a.IdPatientDate = t.IdPatientDate AND CONCAT(a.IdAppt, '0') = c.IdApptType AND a.IdPriCarePhy = af.IdAffil) WHERE a.ApptDate = '2009-03-01'; p.IdPatient, t.IdPatientDate, c.IdApptType, and af.IdAffil are all indexed. Also I selectively took out join parameters until there was nothing but a join on the patient table, and it was still slow, but when I took that out, the query was extremely fast. What might I be doing wrong? Thanks, Jim McNeely -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: Help with slow query
Hi Jim, On 3/9/2011 17:57, Jim McNeely wrote: I am trying to set up an export query which is executing very slowly, and I was hoping I could get some help. Here is the query: SELECT a.IdAppt, a.IdPatient, p.NameLast, p.NameFirst, p.NameMI, a.IdProcedure, a.ProcName, CAST(CONCAT(a.ApptDate, , a.ApptTimeOut) AS CHAR) ApptDateTime, a.ApptLenMin Duration, a.Room Primary_Resource_ID, 'R' as Primary_Resource_Type, t.Note, c.IdPractAffil DoctorID, af.PractName DoctorName, a.ICD9code ICD9, '??' Diagnosis_free_test from Appt_ a LEFT JOIN (patient_ p, today_ t, Copy_ c, Affil_ af) ON (a.IdPatient = p.IdPatient AND a.IdPatientDate = t.IdPatientDate AND CONCAT(a.IdAppt, '0') = c.IdApptType AND a.IdPriCarePhy = af.IdAffil) WHERE a.ApptDate= '2009-03-01'; p.IdPatient, t.IdPatientDate, c.IdApptType, and af.IdAffil are all indexed. Also I selectively took out join parameters until there was nothing but a join on the patient table, and it was still slow, but when I took that out, the query was extremely fast. What might I be doing wrong? Thanks, Jim McNeely The performance problem is with your Cartesian product. I think you meant to write: from Appt_ a LEFT JOIN patient_ p ON a.IdPatient = p.IdPatient LEFT JOIN today_ t ON a.IdPatientDate = t.IdPatientDate LEFT JOIN Copy_ c ON CONCAT(a.IdAppt, '0') = c.IdApptType LEFT JOIN Affil_ af ON a.IdPriCarePhy = af.IdAffil As of 5.0.12, the comma operator for table joins was demoted in the 'order of precedence' for query execution. That means that MySQL became more complaint with the SQL standard but it also means that using a comma-join instead of an explicit ANSI join can result in a Cartesian product more frequently. Try my style and compare how it works. If both styles are similarly slow, collect the EXPLAIN plan for this query and share with the list. Yours, -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. - Hardware and Software, Engineered to Work Together. Office: Blountville, TN -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org
Re: help with slow query
I know it's bad form to reply to yourself but I just found a major mental mistake in my response. See embedded: [EMAIL PROTECTED] wrote on 08/12/2005 12:18:21 AM: Sebastian [EMAIL PROTECTED] wrote on 08/11/2005 01:19:30 PM: well i managed to solve the problem myself, and im no sql genius... i thought i had an index on maps_rating.map which i didn't.. adding an index on it improved the query. i think that is about all the improvement i can get.. but if there is still room for more speed i'd like to know.. Sebastian wrote: Jigal van Hemert wrote: Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query sorry for the lack of info. there are a couple of indexes on maps table, but i am not using them unless i use a where clause. not sure if the format is going to appear correctly on mailing list email, here is the info: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/ /NULL/ 626 Using temporary; Using filesort 1 SIMPLE maps_rating ALL /NULL/ /NULL/ /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref PRIMARY PRIMARY 4 site.maps.userid 1 CREATE TABLE `maps` ( `id` int(10) unsigned NOT NULL auto_increment, `mip` smallint(6) NOT NULL default '0', `map` varchar(50) NOT NULL default '', `userid` int(10) unsigned NOT NULL default '0', `filename` varchar(50) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), KEY `mip` (`mip`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=740; CREATE TABLE `maps_rating` ( `id` int(10) unsigned NOT NULL auto_increment, `map` int(10) NOT NULL default '0', `rating` smallint(6) NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `ipaddress` varchar(15) NOT NULL default '', `dateline` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884; the query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage This is not to solve your problem (you already did that) but to respond to your request for possible additional optimizations. One of the basic principles I try to use when optimizing is to JOIN as little data as possible, even if it takes more than one step. What your original query does is to JOIN three tables then GROUP BY on the resulting combinations of records. If you eliminated all of the duplication from your secondary tables (maps_rating and user) by performing your AVG and COUNT in separate steps, you reduce the amount of data you need to reprocess through the GROUP BY by an order of magnitude. Less data = less time. Here is how I would approach your problem. I would create a temp table (or a static table if you run this often enough) that contains whatever statistics you want (your COUNTs, AVGs, etc.) then join that to the `maps` table to fill in the rest of the columns you wanted in your report. The whole query would resemble something like this: CREATE TEMPORARY TABLE tmpRatings(KEY(map)) SELECT map , avg(rating) as rating , count(id) as votes FROM maps_rating GROUP BY map; This query is wrong, I cut and pasted but forgot to edit...(I must have been WAY too tired to be online) SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN user ON (user.userid = maps.userid) LEFT JOIN tmpRatings ON tmpRatings.map = maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage; What I meant to give was this: SELECT maps.* , tr.rating , tr.votes , user.username FROM maps LEFT JOIN user ON (user.userid = maps.userid) LEFT JOIN tmpRatings ON tmpRatings.map = maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage; Assuming an average of 10 ratings per map, you save at least 10x the processing time in your final query as compared to your original. We did add a little processing to create the statistics table, however the additional overhead is not nearly as much as we saved so the net gain will still be quite noticeable. Please give it a shot and let me know how it compares to your original. Thanks. Shawn Green Database Administrator Unimin Corporation - Spruce Pine Again,
Re: help with slow query
no one has any info to help me out? all i need to know if there is a way to speed up the query or will i have to live with it. this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage structure: maps - rows: 700 +--++-+---+++ | id | mip | map | userid | author | filename | +--++-+---+++ maps_rating - rows: 2,000 +--+-+--++---+---+ | id | map | rating | userid | ipaddress | dateline | +--+-+--++---+---+ user - rows: 10,000 +--+---+ | userid | username | +--+---+ -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.338 / Virus Database: 267.10.5/67 - Release Date: 8/9/2005 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query This way the list members can make better suggestions. Regards, Jigal. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
I think it's not fair to expect the list to reply with the kind of information you have provided. Are these tables indexed et all? What indices are you using? Have you tried to see what explain tell you about the plan the optimizer will use to execute the query? Rest assured, you surely stand a better chance of reply if you provide information on the above set of questions. Cheers Manoj - Original Message - From: Sebastian [EMAIL PROTECTED] To: mysql@lists.mysql.com Sent: Thursday, August 11, 2005 3:52 PM Subject: Re: help with slow query no one has any info to help me out? all i need to know if there is a way to speed up the query or will i have to live with it. this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage structure: maps - rows: 700 +--++-+---+++ | id | mip | map | userid | author | filename | +--++-+---+++ maps_rating - rows: 2,000 +--+-+--++---+---+ | id | map | rating | userid | ipaddress | dateline | +--+-+--++---+---+ user - rows: 10,000 +--+---+ | userid | username | +--+---+ -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.338 / Virus Database: 267.10.5/67 - Release Date: 8/9/2005 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
Hello. I've created tables similar to your and the query runs fast enough on my test data (maybe I have good indexes). Please, provide the EXPLAIN output for your query and exact definitions of your tables (use SHOW CREATE TABLE). Sebastian [EMAIL PROTECTED] wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage structure: maps - rows: 700 +--++-+---+++ | id | mip | map | userid | author | filename | +--++-+---+++ maps_rating - rows: 2,000 +--+-+--++---+---+ | id | map | rating | userid | ipaddress | dateline | +--+-+--++---+---+ user - rows: 10,000 +--+---+ | userid | username | +--+---+ -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
Jigal van Hemert wrote: Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query sorry for the lack of info. there are a couple of indexes on maps table, but i am not using them unless i use a where clause. not sure if the format is going to appear correctly on mailing list email, here is the info: id select_type table typepossible_keys key key_len ref rowsExtra 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/ /NULL/ 626 Using temporary; Using filesort 1 SIMPLE maps_rating ALL /NULL/ /NULL/ /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref PRIMARY PRIMARY 4 site.maps.userid 1 CREATE TABLE `maps` ( `id` int(10) unsigned NOT NULL auto_increment, `mip` smallint(6) NOT NULL default '0', `map` varchar(50) NOT NULL default '', `userid` int(10) unsigned NOT NULL default '0', `filename` varchar(50) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), KEY `mip` (`mip`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=740; CREATE TABLE `maps_rating` ( `id` int(10) unsigned NOT NULL auto_increment, `map` int(10) NOT NULL default '0', `rating` smallint(6) NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `ipaddress` varchar(15) NOT NULL default '', `dateline` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884; the query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.338 / Virus Database: 267.10.6/69 - Release Date: 8/11/2005 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
well i managed to solve the problem myself, and im no sql genius... i thought i had an index on maps_rating.map which i didn't.. adding an index on it improved the query. i think that is about all the improvement i can get.. but if there is still room for more speed i'd like to know.. Sebastian wrote: Jigal van Hemert wrote: Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query sorry for the lack of info. there are a couple of indexes on maps table, but i am not using them unless i use a where clause. not sure if the format is going to appear correctly on mailing list email, here is the info: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/ /NULL/ 626 Using temporary; Using filesort 1 SIMPLE maps_rating ALL /NULL/ /NULL/ /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref PRIMARY PRIMARY 4 site.maps.userid 1 CREATE TABLE `maps` ( `id` int(10) unsigned NOT NULL auto_increment, `mip` smallint(6) NOT NULL default '0', `map` varchar(50) NOT NULL default '', `userid` int(10) unsigned NOT NULL default '0', `filename` varchar(50) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), KEY `mip` (`mip`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=740; CREATE TABLE `maps_rating` ( `id` int(10) unsigned NOT NULL auto_increment, `map` int(10) NOT NULL default '0', `rating` smallint(6) NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `ipaddress` varchar(15) NOT NULL default '', `dateline` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884; the query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.338 / Virus Database: 267.10.6/69 - Release Date: 8/11/2005 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
Sebastian wrote: well i managed to solve the problem myself, and im no sql genius... i thought i had an index on maps_rating.map which i didn't.. adding an index on it improved the query. i think that is about all the improvement i can get.. but if there is still room for more speed i'd like to know.. Sebastian wrote: Jigal van Hemert wrote: Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query sorry for the lack of info. there are a couple of indexes on maps table, but i am not using them unless i use a where clause. not sure if the format is going to appear correctly on mailing list email, here is the info: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/ /NULL/ 626 Using temporary; Using filesort 1 SIMPLE maps_rating ALL /NULL/ /NULL/ /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref PRIMARY PRIMARY 4 site.maps.userid 1 CREATE TABLE `maps` ( `id` int(10) unsigned NOT NULL auto_increment, `mip` smallint(6) NOT NULL default '0', `map` varchar(50) NOT NULL default '', `userid` int(10) unsigned NOT NULL default '0', `filename` varchar(50) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), KEY `mip` (`mip`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=740; CREATE TABLE `maps_rating` ( `id` int(10) unsigned NOT NULL auto_increment, `map` int(10) NOT NULL default '0', `rating` smallint(6) NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `ipaddress` varchar(15) NOT NULL default '', `dateline` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884; the query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage It seems to me that an index on maps_rating.rating, maps.userid, user.userid might help. Also making the query ALTER TABLE maps ORDER BY maps.dateline DESC once a day or more, would help the ordering. I may be saying too include much indexes (probably the first), but it may not make bad at all (probably updates/inserts would be slower). Making some tests might help to see what is the best. -- Nuno Pereira -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
Hello. i thought i had an index on maps_rating.map which i didn't.. adding an index on it improved the query. Have a look here: http://dev.mysql.com/doc/mysql/en/order-by-optimization.html http://dev.mysql.com/doc/mysql/en/group-by-optimization.html Sebastian [EMAIL PROTECTED] wrote: well i managed to solve the problem myself, and im no sql genius... i thought i had an index on maps_rating.map which i didn't.. adding an index on it improved the query. i think that is about all the improvement i can get.. but if there is still room for more speed i'd like to know.. Sebastian wrote: Jigal van Hemert wrote: Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query sorry for the lack of info. there are a couple of indexes on maps table, but i am not using them unless i use a where clause. not sure if the format is going to appear correctly on mailing list email, here is the info: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/ /NULL/ 626 Using temporary; Using filesort 1 SIMPLE maps_rating ALL /NULL/ /NULL/ /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref PRIMARY PRIMARY 4 site.maps.userid 1 CREATE TABLE `maps` ( `id` int(10) unsigned NOT NULL auto_increment, `mip` smallint(6) NOT NULL default '0', `map` varchar(50) NOT NULL default '', `userid` int(10) unsigned NOT NULL default '0', `filename` varchar(50) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), KEY `mip` (`mip`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=740; CREATE TABLE `maps_rating` ( `id` int(10) unsigned NOT NULL auto_increment, `map` int(10) NOT NULL default '0', `rating` smallint(6) NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `ipaddress` varchar(15) NOT NULL default '', `dateline` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884; the query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage -- For technical support contracts, goto https://order.mysql.com/?ref=ensita This email is sponsored by Ensita.NET http://www.ensita.net/ __ ___ ___ __ / |/ /_ __/ __/ __ \/ /Gleb Paharenko / /|_/ / // /\ \/ /_/ / /__ [EMAIL PROTECTED] /_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET ___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: help with slow query
Sebastian [EMAIL PROTECTED] wrote on 08/11/2005 01:19:30 PM: well i managed to solve the problem myself, and im no sql genius... i thought i had an index on maps_rating.map which i didn't.. adding an index on it improved the query. i think that is about all the improvement i can get.. but if there is still room for more speed i'd like to know.. Sebastian wrote: Jigal van Hemert wrote: Sebastian wrote: this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? You may want to include: - table definitions (output of SHOW CREATE TABLE table) - output of EXPLAIN query sorry for the lack of info. there are a couple of indexes on maps table, but i am not using them unless i use a where clause. not sure if the format is going to appear correctly on mailing list email, here is the info: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/ /NULL/ 626 Using temporary; Using filesort 1 SIMPLE maps_rating ALL /NULL/ /NULL/ /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref PRIMARY PRIMARY 4 site.maps.userid 1 CREATE TABLE `maps` ( `id` int(10) unsigned NOT NULL auto_increment, `mip` smallint(6) NOT NULL default '0', `map` varchar(50) NOT NULL default '', `userid` int(10) unsigned NOT NULL default '0', `filename` varchar(50) NOT NULL default '', `date` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`), KEY `mip` (`mip`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=740; CREATE TABLE `maps_rating` ( `id` int(10) unsigned NOT NULL auto_increment, `map` int(10) NOT NULL default '0', `rating` smallint(6) NOT NULL default '0', `userid` int(10) unsigned NOT NULL default '0', `ipaddress` varchar(15) NOT NULL default '', `dateline` int(10) unsigned NOT NULL default '0', PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884; the query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage This is not to solve your problem (you already did that) but to respond to your request for possible additional optimizations. One of the basic principles I try to use when optimizing is to JOIN as little data as possible, even if it takes more than one step. What your original query does is to JOIN three tables then GROUP BY on the resulting combinations of records. If you eliminated all of the duplication from your secondary tables (maps_rating and user) by performing your AVG and COUNT in separate steps, you reduce the amount of data you need to reprocess through the GROUP BY by an order of magnitude. Less data = less time. Here is how I would approach your problem. I would create a temp table (or a static table if you run this often enough) that contains whatever statistics you want (your COUNTs, AVGs, etc.) then join that to the `maps` table to fill in the rest of the columns you wanted in your report. The whole query would resemble something like this: CREATE TEMPORARY TABLE tmpRatings(KEY(map)) SELECT map , avg(rating) as rating , count(id) as votes FROM maps_rating GROUP BY map; SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN user ON (user.userid = maps.userid) LEFT JOIN tmpRatings ON tmpRatings.map = maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage; Assuming an average of 10 ratings per map, you save at least 10x the processing time in your final query as compared to your original. We did add a little processing to create the statistics table, however the additional overhead is not nearly as much as we saved so the net gain will still be quite noticeable. Please give it a shot and let me know how it compares to your original. Thanks. Shawn Green Database Administrator Unimin Corporation - Spruce Pine
help with slow query
this query runs slow because AVG and COUNT on maps_rating table i think. can anything be done to improve? query: SELECT maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS votes, user.username FROM maps LEFT JOIN maps_rating ON (maps.id = maps_rating.map) LEFT JOIN user ON (user.userid = maps.userid) GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage structure: maps - rows: 700 +--++-+---+++ | id | mip | map | userid | author | filename | +--++-+---+++ maps_rating - rows: 2,000 +--+-+--++---+---+ | id | map | rating | userid | ipaddress | dateline | +--+-+--++---+---+ user - rows: 10,000 +--+---+ | userid | username | +--+---+ -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.338 / Virus Database: 267.10.5/67 - Release Date: 8/9/2005 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Help with SLOW query
Help: (and apologies if this is posted to the wrong list..)(pls let me know where to post if so.. Thx ;-) I have *inherited* an App that uses PHP / MySQL. THe internal search function within the application that I am supporting uses the following DB Table structure and runs the Query below to return a set of results ordered by the REGEXP match results (The score). The Query looks in each field, for a REGEXP match, if the REGEXP returns true, the score is incremented by 10 * number of matches (I think) I cannot use FULLTEXT indexes. What I would like to do is refactor the query to improve the speed. Can someone please advise as to why my queries ALWAYS use filesort, where and temporary (as found out by EXPLAIN...) Can someone recommend a new query to deliver the same results?? Thanks in advance ## START TABLE STRUCTURE ## CREATE TABLE article_related_communities ( comm_id int(6) default NULL, article_id varchar(12) default NULL, KEY article_id (article_id), KEY comm_id (comm_id) ) TYPE=MyISAM; CREATE TABLE articles ( id int(4) NOT NULL auto_increment, c_type int(4) NOT NULL default '1', author int(6) NOT NULL default '0', comm_id int(6) default '0', comm_type int(6) default '0', keywords longtext NOT NULL, title varchar(128) NOT NULL default '', synopsis text NOT NULL, release_date varchar(14) default NULL, expiry_date varchar(14) default NULL, start_date varchar(14) NOT NULL default '', closing_date varchar(14) NOT NULL default '', location varchar(255) NOT NULL default '', cost varchar(255) NOT NULL default '', times text NOT NULL, overnight_details text NOT NULL, remuneration varchar(255) NOT NULL default '', body text NOT NULL, status int(6) default NULL, expired int(2) NOT NULL default '0', deleted int(2) NOT NULL default '0', ctime varchar(14) NOT NULL default '', mtime varchar(14) NOT NULL default '', mod_author int(6) default '0', event_type int(2) default '0', cs_org int(8) NOT NULL default '0', expiry_reason varchar(128) NOT NULL default '', accumulated_rating int(6) NOT NULL default '0', num_ratings int(6) NOT NULL default '0', average_rating float NOT NULL default '0', event_duration varchar(128) NOT NULL default '0', event_organiser varchar(128) NOT NULL default '0', event_organiser_email varchar(128) NOT NULL default '0', PRIMARY KEY (id), KEY c_type (c_type), KEY status (status), KEY comm_id (comm_id), KEY author (author), KEY expired (expired), KEY deleted (deleted), KEY expiry_date (expiry_date), KEY release_date (release_date), FULLTEXT KEY idx_tit_syn_body (title,synopsis,body) ) TYPE=MyISAM; ## END TABLE STRUCTURE ## ## START QUERY ## SELECT distinct articles.* , ( ( (title REGEXP [[::]]MArket[[::]])*10+(body REGEXP [[::]]MArket[[::]])*10+(synopsis REGEXP [[::]]MArket[[::]])*10+(closing_date REGEXP [[::]]MArket[[::]])*10+(location REGEXP [[::]]MArket[[::]])*10+(cost REGEXP [[::]]MArket[[::]])*10+(times REGEXP [[::]]MArket[[::]])*10+(overnight_details REGEXP [[::]]MArket[[::]])*10+(remuneration REGEXP [[::]]MArket[[::]])*10+(keywords REGEXP [[::]]MArket[[::]])*10 ) + ( (title REGEXP [[::]]Segmentation[[::]])*10+(body REGEXP [[::]]Segmentation[[::]])*10+(synopsis REGEXP [[::]]Segmentation[[::]])*10+(closing_date REGEXP [[::]]Segmentation[[::]])*10+(location REGEXP [[::]]Segmentation[[::]])*10+(cost REGEXP [[::]]Segmentation[[::]])*10+(times REGEXP [[::]]Segmentation[[::]])*10+(overnight_details REGEXP [[::]]Segmentation[[::]])*10+(remuneration REGEXP [[::]]Segmentation[[::]])*10+(keywords REGEXP [[::]]Segmentation[[::]])*10 ) ) AS score FROM articles LEFT JOIN article_related_communities on articles.id = article_related_communities.article_id WHERE deleted=0 AND ( ( (title REGEXP [[::]]MArket[[::]])*10+(body REGEXP [[::]]MArket[[::]])*10+(synopsis REGEXP [[::]]MArket[[::]])*10+(closing_date REGEXP [[::]]MArket[[::]])*10+(location REGEXP [[::]]MArket[[::]])*10+(cost REGEXP [[::]]MArket[[::]])*10+(times REGEXP [[::]]MArket[[::]])*10+(overnight_details REGEXP [[::]]MArket[[::]])*10+(remuneration REGEXP [[::]]MArket[[::]])*10+(keywords REGEXP [[::]]MArket[[::]])*10 ) + ( (title REGEXP [[::]]Segmentation[[::]])*10+(body REGEXP [[::]]Segmentation[[::]])*10+(synopsis REGEXP [[::]]Segmentation[[::]])*10+(closing_date REGEXP [[::]]Segmentation[[::]])*10+(location REGEXP [[::]]Segmentation[[::]])*10+(cost REGEXP [[::]]Segmentation[[::]])*10+(times REGEXP [[::]]Segmentation[[::]])*10+(overnight_details REGEXP [[::]]Segmentation[[::]])*10+(remuneration REGEXP [[::]]Segmentation[[::]])*10+(keywords REGEXP [[::]]Segmentation[[::]])*10 ) )0 AND expiry_date '20041018' AND (status='2' OR (status='1' AND (author = 161 OR (articles.comm_id IN ('') AND release_date='20041018' AND (articles.comm_id IN ('559','560','9','286','425','484','103','324','323','18','1','1','2','4','5') OR article_related_communities.comm_id IN
Re: Help with SLOW query
You are always getting filesort, where and temporary because the optimizer cannot use an index. Both your WHERE clause and your ORDER BY clause use computed values. None of those values exist in an index because you calculate them for every query. What I did below is not a refactoring, just a different layout of your query. I did it this way to highlight the repetitiveness of parts of this query. (I hope that auto-wrapping doesn't destroy this too much): SELECT distinct articles.* , (( (title REGEXP [[::]]MArket[[::]])*10 +(body REGEXP [[::]]MArket[[::]])*10 +(synopsis REGEXP [[::]]MArket[[::]])*10 +(closing_date REGEXP [[::]]MArket[[::]])*10 +(location REGEXP [[::]]MArket[[::]])*10 +(cost REGEXP [[::]]MArket[[::]])*10) +(times REGEXP [[::]]MArket[[::]])*10 +(overnight_details REGEXP [[::]]MArket[[::]])*10 +(remuneration REGEXP [[::]]MArket[[::]])*10 +(keywords REGEXP [[::]]MArket[[::]])*10 ) +( (title REGEXP [[::]]Segmentation[[::]])*10 +(body REGEXP [[::]]Segmentation[[::]])*10 +(synopsis REGEXP [[::]]Segmentation[[::]])*10 +(closing_date REGEXP [[::]]Segmentation[[::]])*10 +(location REGEXP [[::]]Segmentation[[::]])*10 +(cost REGEXP [[::]]Segmentation[[::]])*10 +(times REGEXP [[::]]Segmentation[[::]])*10 +(overnight_details REGEXP [[::]]Segmentation[[::]])*10 +(remuneration REGEXP [[::]]Segmentation[[::]])*10 +(keywords REGEXP [[::]]Segmentation[[::]])*10 )) AS score FROM articles LEFT JOIN article_related_communities on articles.id = article_related_communities.article_id WHERE deleted=0 AND (( (title REGEXP [[::]]MArket[[::]])*10 +(body REGEXP [[::]]MArket[[::]])*10 +(synopsis REGEXP [[::]]MArket[[::]])*10 +(closing_date REGEXP [[::]]MArket[[::]])*10 +(location REGEXP [[::]]MArket[[::]])*10 +(cost REGEXP [[::]]MArket[[::]])*10 +(times REGEXP [[::]]MArket[[::]])*10 +(overnight_details REGEXP [[::]]MArket[[::]])*10 +(remuneration REGEXP [[::]]MArket[[::]])*10 +(keywords REGEXP [[::]]MArket[[::]])*10 ) +( (title REGEXP [[::]]Segmentation[[::]])*10 +(body REGEXP [[::]]Segmentation[[::]])*10 +(synopsis REGEXP [[::]]Segmentation[[::]])*10 +(closing_date REGEXP [[::]]Segmentation[[::]])*10 +(location REGEXP [[::]]Segmentation[[::]])*10 +(cost REGEXP [[::]]Segmentation[[::]])*10 +(times REGEXP [[::]]Segmentation[[::]])*10 +(overnight_details REGEXP [[::]]Segmentation[[::]])*10 +(remuneration REGEXP [[::]]Segmentation[[::]])*10 +(keywords REGEXP [[::]]Segmentation[[::]])*10 ))0 AND expiry_date '20041018' AND (status='2' OR (status='1' AND (author = 161 OR (articles.comm_id IN ('')) ) ) ) AND release_date='20041018' AND (articles.comm_id IN ('559','560','9','286','425','484','103','324','323','18','1','1','2','4','5') OR article_related_communities.comm_id IN ('559','560','9','286','425','484','103','324','323','18','1','1','2','4','5') ) ORDER BY score DESC LIMIT 0,10; It would be very nice if the optimizer detected the fact that part of your WHERE clause and your score column are the same formula, that way it would only need to compute that value once. I suspect that it does but I still suggest that you drop the calculation in your where clause and move your check to a HAVING clause. If you leave that calculation in the WHERE clause you will force a full table scan. You lose nothing with this change and you may gain in performance both because we might avoid a full table scan and just in case the optimizer didn't detect the duplication and factor out that calculation, you won't compute that value twice. It should also makes your code easier to maintain and your query simpler to parse. Here is what your new bottom half of your query would look like. WHERE deleted=0 AND expiry_date '20041018' AND ( status='2' OR ( status='1' AND (author = 161 OR (articles.comm_id IN ('')) ) ) ) AND release_date='20041018' AND ( articles.comm_id IN ('559','560','9','286','425','484','103','324','323','18','1','1','2','4','5') OR article_related_communities.comm_id IN ('559','560','9','286','425','484','103','324','323','18','1','1','2','4','5') ) HAVING score 0