Hi,

I am trying to find the best way to write a query that has two levels of group 
by where the outer group by columns are a subset of the inner group by columns. 
In my example below I want to do an aggregation grouping by per, prod, and mar, 
then I want aggregate the results of this aggregation, grouping by just prod, 
and per. From the results of explain query plan, I can see a B-Tree is not used 
if I only do the first group by - this is mentioned in the query optimisation 
page. However, a B-Tree is used for the second group by when both group bys are 
present, and I don't understand why, as I think the rows from the subquery come 
out in an order already suitable for the second group by?

My actual data is a bit more complex, but I am seeing a 10x-20x speed 
difference between the query with the single group by and the query with both. 
If the B-Tree is necessary, it would be good to be able to understand why, and 
if not, it would be great if there were some way to communicate this to the 
query planner.

This below was run on a freshly downloaded and compiled 3.28.

Thanks,
Kev

---

QUERY PLAN
`--SCAN TABLE data USING INDEX sqlite_autoindex_data_1
QUERY PLAN
|--CO-ROUTINE 1
|  `--SCAN TABLE data USING INDEX sqlite_autoindex_data_1
|--SCAN SUBQUERY 1
`--USE TEMP B-TREE FOR GROUP BY

------------

create table data(
  prod integer not null,
  per integer not null,
  mar integer not null,
  off integer not null,
  val real not null,
  primary key(prod, per, mar, off)
);

explain query plan select
  prod,
  per,
  mar,
  sum(val) as val
from
  data
group by
  prod,
  per,
  mar
;

explain query plan select
  prod,
  per,
  min(val)
from
  (select
    prod,
    per,
    mar,
    sum(val) as val
  from
    data
  group by
    prod,
    per,
    mar)
group by
  prod,
  per
;


   
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to