I know it's bad form to reply to yourself but I just found a major mental 
mistake in my response. See embedded:

[EMAIL PROTECTED] wrote on 08/12/2005 12:18:21 AM:

> Sebastian <[EMAIL PROTECTED]> wrote on 08/11/2005 01:19:30 
PM:
> 
> > well i managed to solve the problem myself, and im no sql genius...
> > i thought i had an index on maps_rating.map which i didn't.. adding an 

> > index on it improved the query.
> > 
> > i think that is about all the improvement i can get.. but if there is 
> > still room for more speed i'd like to know..
> > 
> > Sebastian wrote:
> > 
> > > Jigal van Hemert wrote:
> > >
> > >> Sebastian wrote:
> > >>
> > >>> this query runs slow because AVG and COUNT on maps_rating table i 
> > >>> think.
> > >>> can anything be done to improve?
> > >>
> > >>
> > >>
> > >> You may want to include:
> > >> - table definitions (output of SHOW CREATE TABLE <table>)
> > >> - output of EXPLAIN <query>
> > >
> > >
> > > sorry for the lack of info.
> > > there are a couple of indexes on maps table, but i am not using them 

> > > unless i use a where clause.
> > >
> > > not sure if the format is going to appear correctly on mailing list 
> > > email, here is the info:
> > >
> > > id     select_type     table     type     possible_keys     key 
> > > key_len     ref     rows     Extra
> > > 1     SIMPLE     maps     ALL     /NULL/     /NULL/     /NULL/ 
> > > /NULL/     626     Using temporary; Using filesort
> > > 1     SIMPLE     maps_rating     ALL     /NULL/     /NULL/ 
> > > /NULL/     /NULL/     1839      1     SIMPLE     user     eq_ref 
> > > PRIMARY     PRIMARY     4     site.maps.userid     1 
> > >
> > > CREATE TABLE `maps` (
> > >  `id` int(10) unsigned NOT NULL auto_increment,
> > >  `mip` smallint(6) NOT NULL default '0',
> > >  `map` varchar(50) NOT NULL default '',
> > >  `userid` int(10) unsigned NOT NULL default '0',
> > >  `filename` varchar(50) NOT NULL default '',
> > >  `date` int(10) unsigned NOT NULL default '0',
> > >  PRIMARY KEY  (`id`),
> > >  KEY `mip` (`mip`)
> > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 
AUTO_INCREMENT=740;
> > >
> > > CREATE TABLE `maps_rating` (
> > >  `id` int(10) unsigned NOT NULL auto_increment,
> > >  `map` int(10) NOT NULL default '0',
> > >  `rating` smallint(6) NOT NULL default '0',
> > >  `userid` int(10) unsigned NOT NULL default '0',
> > >  `ipaddress` varchar(15) NOT NULL default '',
> > >  `dateline` int(10) unsigned NOT NULL default '0',
> > >  PRIMARY KEY  (`id`)
> > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884;
> > >
> > > the query:
> > >
> > > SELECT
> > > maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS 
> > > votes, user.username
> > > FROM maps
> > > LEFT JOIN maps_rating ON (maps.id = maps_rating.map)
> > > LEFT JOIN user ON (user.userid = maps.userid)
> > > GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage
> > >
> > 
> 
> This is not to solve your problem (you already did that) but to respond 
to 
> your request for possible additional optimizations.
> 
> One of the basic principles I try to use when optimizing is to JOIN as 
> little data as possible, even if it takes more than one step. What your 
> original query does is to JOIN three tables then GROUP BY on the 
resulting 
> combinations of records. If you eliminated all of the duplication from 
> your secondary tables (maps_rating and user) by performing your AVG and 
> COUNT in separate steps, you reduce the amount of data you need to 
> reprocess through the GROUP BY by an order of magnitude. Less data = 
less 
> time.
> 
> Here is how I would approach your problem. I would create a temp table 
(or 
> a static table if you run this often enough) that contains whatever 
> statistics you want (your COUNTs, AVGs, etc.) then join that to the 
`maps` 
> table to fill in the rest of the columns you wanted in your report. The 
> whole query would resemble something like this:
> 
> CREATE TEMPORARY TABLE tmpRatings(KEY(map))
> SELECT map
>         , avg(rating) as rating
>         , count(id) as votes
> FROM maps_rating
> GROUP BY map;


This query is wrong, I cut and pasted but forgot to edit...(I must have 
been WAY too tired to be online)
> 
> SELECT
> maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS
> votes, user.username
> FROM maps
> LEFT JOIN user ON (user.userid = maps.userid)
> LEFT JOIN tmpRatings ON tmpRatings.map = maps.id
> ORDER BY maps.dateline DESC 
> LIMIT $start, $perpage;
> 

What I meant to give was this:

SELECT
        maps.*
        , tr.rating
        , tr.votes
        , user.username
FROM maps
LEFT JOIN user ON (user.userid = maps.userid)
LEFT JOIN tmpRatings ON tmpRatings.map = maps.id
ORDER BY maps.dateline DESC 
LIMIT $start, $perpage;

> Assuming an average of 10 ratings per map, you save at least 10x the 
> processing time in your final query as compared to your original. We did 

> add a little processing to create the statistics table, however the 
> additional overhead is not nearly as much as we saved so the net gain 
will 
> still be quite noticeable.
> 
> Please give it a shot and let me know how it compares to your original. 
> Thanks.
> 
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine
>

Again, Sorry if I created any confusion!! -- Shawn

Reply via email to