On Sun, Jan 31, 2016 at 2:54 PM, Yannick Duch?ne <yannick_duchene at yahoo.fr>
wrote:

> Another mystery to me. Given this test table:
>
>     CREATE TABLE t (a TEXT, b TEXT, c INTEGER, PRIMARY KEY (a, b, c))
>
> ??this query:
>
>     SELECT Sum(c) FROM t GROUP BY a
>
> ??executes faster than any of these two:
>
>     SELECT Sum(c) FROM t GROUP BY b
>     SELECT Sum(c) FROM t GROUP BY c
>
> ? which executes in about the same time together, proportionally to the
> number of returned rows. With `GROUP BY a`, execution times seems to be
> about half than with the two formers. Adding or not adding a `WITHOUT
> ROWID` gives the same. I give the number of rows, to show if the first one
> is faster than the second one, that's not because it would returns less
> rows, on the opposite, it returns a bit more then with grouping by `b`:
>
>  * Grouping by `a` results into 1360 rows in about 40ms +/-3;
>  * Grouping by `b` results into 1170 rows in about 65ms +/-5;
>  * Grouping by `c` results into 3154 rows in about 90ms +/-4.
>

Run "explain query plan ..." [1] to see which plan SQLite chooses for your
different queries.

As [2] says "indexes are only useful if there are WHERE-clause constraints
on the left-most columns of the index",
with "left-most" being the keyword here. `a` is your left-most column, so
it works best with it.

But grouping by `b` may also benefits from the index, thanks to the
"skip-scan" optimization. (see [2] again).

if there are few distinct `a` values, the benefit of skip-scan improves.

The NGQP [3] is more sensitive than the old planner to good statistics, to
find the optimum plan.
So do compare you plans and performance before and after running ANALYZE on
your table(s).

[1] https://www.sqlite.org/eqp.html
[2] https://www.sqlite.org/optoverview.html#skipscan
[3] https://www.sqlite.org/queryplanner-ng.html

Reply via email to