"Octavian Rasnita" <[EMAIL PROTECTED]> wrote on 07/19/2005 02:45:58 AM:
> From: <[EMAIL PROTECTED]>
> Subject: Re: more queries vs a bigger one
>
>
> > Hello,
> > approx. how long does it take your "big" query to run as it is now?
Are
> > these queries appending a table? or are they buiding a result (from a
> > "chain" of queries)? Have you tried separating them out? Any
difference?
> > -sam
> >
> >
>
> That query takes more than 2 minutes.
> I have a table with less than 90.000 records, and this is the bigger
table
> from the database, so the query is very slow, because as you may see,
the
> tables are not very big.
>
> But that query might return more than 9.000 records, even though I limit
it
> to first 30.
>
> I will split it into more smaller queries and I will report if it works
> faster.
>
> The query is something like the example below, but I don't have it here
> right now to send it to the list, but I will send it soon:
>
> select a.pre_title, a.title, a.post_title, substring(a.body, 1,
> n.preview_size) as preview, n.title as publication, a.id_category,
cs.label,
> count(aco.hash_articles) as comments, count(act.hash_articles) as
counter
> from articles a
> left join newspapers n on(a.id_newspapers=n.id)
> left join sections s on(a.id_sections=s.id and s.id_newspapers=n.id)
> left join sections_categories cs on(a.id_categories=cs.id)
> inner join articles_counters act on(a.body_hash=act.hash_articles)
> inner join articles_comments aco on(a.body_hash=aco.hash_articles)
> where a.id_category=20
> and a.date between '2005-01-01' and '2005-12-31'
> group by a.body_hash order by rand();
>
> The table newspapers has only 20 records. The table sections has under
300
> records, but the tables articles_counter and articles_comments might
have
> many records... hundread of thousands.... millions.
>
> Thanks.
>
> Teddy
>
This is your original query (above) unwrapped and tabified:
select a.pre_title
, a.title
, a.post_title
, substring(a.body, 1, n.preview_size) as preview
, n.title as publication
, a.id_category
, cs.label
, count(aco.hash_articles) as comments
, count(act.hash_articles) as counter
from articles a
left join newspapers n
on(a.id_newspapers=n.id)
left join sections s
on(a.id_sections=s.id
and s.id_newspapers=n.id)
left join sections_categories cs
on(a.id_categories=cs.id)
inner join articles_counters act
on(a.body_hash=act.hash_articles)
inner join articles_comments aco
on(a.body_hash=aco.hash_articles)
where a.id_category=20
and a.date between '2005-01-01' and '2005-12-31'
group by a.body_hash
order by rand();
I see ONE GLARING problem with this query right away:If you weren't using
MySQL, this would be an illegal query.
Your GROUP BY clause does not contain enough columns and the column it
*does* contain doesn't appear in your SELECT clause. MySQL has a way of
making queries like this "work" but your results are not guaranteed to be
deterministic.
Another issue that is killing your speed is your ORDER BY RAND();
I believe it would be much faster to break this into 3 smaller queries:
first to pick your articles, second to get your statistics, the third will
combine the first two with some additional information to provide your
finished report. I am assuming that `articles`.`body_hash` has an index on
it and is unique. I assume it is unique and indexed because you are using
it as a foreign key to the `articles_counters` and `articles_comments`
tables. If you do not have an index on `articles`.`body_hash`, add one
then try your query again. If it is still too slow, try this query:
/* start query */
CREATE TEMPORARY TABLE tmpArtHash (key(body_hash, rndval))
SELECT body_hash, rand() as rndval
FROM articles
WHERE id_category=20
AND `date` between '2005-01-01' and '2005-12-31';
CREATE TEMPORARY TABLE tmpArtCounters (key(body_hash))
SELECT th.body_hash
, count(aco.hash_articles) as comments
, count(act.hash_articles) as counter
FROM tmpArtHash th
LEFT JOIN articles_counters act
on th.body_hash=act.hash_articles
LEFT JOIN articles_comments aco
on th.body_hash=aco.hash_articles
GROUP BY th.body_hash
SELECT a.pre_title
, a.title
, a.post_title
, substring(a.body, 1, n.preview_size) as preview
, n.title as publication
, a.id_category
, cs.label
, tc.comments
, tc.counter
FROM tmpArtHash th
INNER JOIN articles a
on a.body_hash = th.body_hash
left join newspapers n
on a.id_newspapers=n.id
left join sections s
on a.id_sections=s.id
and s.id_newspapers=n.id
left join sections_categories cs
on a.id_categories=cs.id
LEFT JOIN tmpArtCounters tc
ON tc.body_hash = th.body_hash
order by th.rndval;
DROP TEMPORARY TABLE tmpArtHash, tmpArtCounter;
/* end query */
Let us know how fast it works. It should take less than 3 seconds or so
(even for 100,000 records or more).