I know it's bad form to reply to yourself but I just found a major mental
mistake in my response. See embedded:
[EMAIL PROTECTED] wrote on 08/12/2005 12:18:21 AM:
> Sebastian <[EMAIL PROTECTED]> wrote on 08/11/2005 01:19:30
PM:
>
> > well i managed to solve the problem myself, and im no sql genius...
> > i thought i had an index on maps_rating.map which i didn't.. adding an
> > index on it improved the query.
> >
> > i think that is about all the improvement i can get.. but if there is
> > still room for more speed i'd like to know..
> >
> > Sebastian wrote:
> >
> > > Jigal van Hemert wrote:
> > >
> > >> Sebastian wrote:
> > >>
> > >>> this query runs slow because AVG and COUNT on maps_rating table i
> > >>> think.
> > >>> can anything be done to improve?
> > >>
> > >>
> > >>
> > >> You may want to include:
> > >> - table definitions (output of SHOW CREATE TABLE <table>)
> > >> - output of EXPLAIN <query>
> > >
> > >
> > > sorry for the lack of info.
> > > there are a couple of indexes on maps table, but i am not using them
> > > unless i use a where clause.
> > >
> > > not sure if the format is going to appear correctly on mailing list
> > > email, here is the info:
> > >
> > > id select_type table type possible_keys key
> > > key_len ref rows Extra
> > > 1 SIMPLE maps ALL /NULL/ /NULL/ /NULL/
> > > /NULL/ 626 Using temporary; Using filesort
> > > 1 SIMPLE maps_rating ALL /NULL/ /NULL/
> > > /NULL/ /NULL/ 1839 1 SIMPLE user eq_ref
> > > PRIMARY PRIMARY 4 site.maps.userid 1
> > >
> > > CREATE TABLE `maps` (
> > > `id` int(10) unsigned NOT NULL auto_increment,
> > > `mip` smallint(6) NOT NULL default '0',
> > > `map` varchar(50) NOT NULL default '',
> > > `userid` int(10) unsigned NOT NULL default '0',
> > > `filename` varchar(50) NOT NULL default '',
> > > `date` int(10) unsigned NOT NULL default '0',
> > > PRIMARY KEY (`id`),
> > > KEY `mip` (`mip`)
> > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0
AUTO_INCREMENT=740;
> > >
> > > CREATE TABLE `maps_rating` (
> > > `id` int(10) unsigned NOT NULL auto_increment,
> > > `map` int(10) NOT NULL default '0',
> > > `rating` smallint(6) NOT NULL default '0',
> > > `userid` int(10) unsigned NOT NULL default '0',
> > > `ipaddress` varchar(15) NOT NULL default '',
> > > `dateline` int(10) unsigned NOT NULL default '0',
> > > PRIMARY KEY (`id`)
> > > ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1884;
> > >
> > > the query:
> > >
> > > SELECT
> > > maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS
> > > votes, user.username
> > > FROM maps
> > > LEFT JOIN maps_rating ON (maps.id = maps_rating.map)
> > > LEFT JOIN user ON (user.userid = maps.userid)
> > > GROUP BY maps.id ORDER BY maps.dateline DESC LIMIT $start, $perpage
> > >
> >
>
> This is not to solve your problem (you already did that) but to respond
to
> your request for possible additional optimizations.
>
> One of the basic principles I try to use when optimizing is to JOIN as
> little data as possible, even if it takes more than one step. What your
> original query does is to JOIN three tables then GROUP BY on the
resulting
> combinations of records. If you eliminated all of the duplication from
> your secondary tables (maps_rating and user) by performing your AVG and
> COUNT in separate steps, you reduce the amount of data you need to
> reprocess through the GROUP BY by an order of magnitude. Less data =
less
> time.
>
> Here is how I would approach your problem. I would create a temp table
(or
> a static table if you run this often enough) that contains whatever
> statistics you want (your COUNTs, AVGs, etc.) then join that to the
`maps`
> table to fill in the rest of the columns you wanted in your report. The
> whole query would resemble something like this:
>
> CREATE TEMPORARY TABLE tmpRatings(KEY(map))
> SELECT map
> , avg(rating) as rating
> , count(id) as votes
> FROM maps_rating
> GROUP BY map;
This query is wrong, I cut and pasted but forgot to edit...(I must have
been WAY too tired to be online)
>
> SELECT
> maps.*, AVG(maps_rating.rating) AS rating, COUNT(maps_rating.id) AS
> votes, user.username
> FROM maps
> LEFT JOIN user ON (user.userid = maps.userid)
> LEFT JOIN tmpRatings ON tmpRatings.map = maps.id
> ORDER BY maps.dateline DESC
> LIMIT $start, $perpage;
>
What I meant to give was this:
SELECT
maps.*
, tr.rating
, tr.votes
, user.username
FROM maps
LEFT JOIN user ON (user.userid = maps.userid)
LEFT JOIN tmpRatings ON tmpRatings.map = maps.id
ORDER BY maps.dateline DESC
LIMIT $start, $perpage;
> Assuming an average of 10 ratings per map, you save at least 10x the
> processing time in your final query as compared to your original. We did
> add a little processing to create the statistics table, however the
> additional overhead is not nearly as much as we saved so the net gain
will
> still be quite noticeable.
>
> Please give it a shot and let me know how it compares to your original.
> Thanks.
>
> Shawn Green
> Database Administrator
> Unimin Corporation - Spruce Pine
>
Again, Sorry if I created any confusion!! -- Shawn