Re: [sqlite] Order of columns in group by statement affects query performance

Hick Gunter Thu, 25 Apr 2013 01:29:46 -0700

AFAIK SQLite treats GROUP BY the same way as ORDER BY (taken from hints in the 
virtual table description).

If you have an index that covers the GROUP BY clause in field order, then 
aggregate functions need store only the current value; if not, then you need an 
ephemeral table to hold the aggregate values.

In more detail:

IF you have an index that covers the GROUP BY clause in field order, then 
retrieving rows in index order guarantees that all rows belonging to the same 
group will be retrieved in one block AND makes the output rows come out sorted 
too. This allows SQLite to keep current aggregate values in registers.

IF you do not have an index that covers the GROUP BY clause, then rows will be 
retrieved in some deterministic order other than by group. You need to retrieve 
and update the current aggregate values for the group each row is in. This 
forces SQLite to keep current aggregate values in an ephemeral table.

IF you have an index that covers the GROUP BY clause in any other order, then 
you still have the guarantee that all rows belonging to the same group will be 
retrieved together, but the result rows will be ordered in index order and not 
GROUP BY order. This would probably require code paths presumably shared by 
GROUP BY and ORDER BY processing to be split.

-----Ursprüngliche Nachricht-----
Von: James K. Lowden [mailto:jklow...@schemamania.org]
Gesendet: Donnerstag, 25. April 2013 01:55
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Order of columns in group by statement affects query 
performance

On Wed, 24 Apr 2013 17:46:00 +0100
Simon Slavin <slav...@bigfraud.org> wrote:

> On 24 Apr 2013, at 5:14pm, Igor Tandetnik <i...@tandetnik.org> wrote:
> > Note though that the query doesn't have an ORDER BY clause. It
> > doesn't request rows in any particular order. SQLite could, in
> > principle, reorder columns in GROUP BY to take advantage of the
> > index. I suppose the optimizer just happens to miss this particular
> > opportunity.
>
> But the GROUP BY clause has an order:
>
> >> Query 1:  SELECT A,B,count(*) from tableTest group by A,B Query 2:
> >> SELECT A,B,count(*) from tableTest group by B,A

The order in which the columns appear syntactically in the GROUP BY clause is 
meaningless in SQL.

Igor is correct that the query processor could use any index beginning with B,A 
or A,B, should it so choose.

--jkl
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

--------------------------------------------------------------------------
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Order of columns in group by statement affects query performance

Reply via email to