On Sun, Jan 31, 2016 at 2:54 PM, Yannick Duch?ne <yannick_duchene at yahoo.fr> wrote:
> Another mystery to me. Given this test table: > > CREATE TABLE t (a TEXT, b TEXT, c INTEGER, PRIMARY KEY (a, b, c)) > > ??this query: > > SELECT Sum(c) FROM t GROUP BY a > > ??executes faster than any of these two: > > SELECT Sum(c) FROM t GROUP BY b > SELECT Sum(c) FROM t GROUP BY c > > ? which executes in about the same time together, proportionally to the > number of returned rows. With `GROUP BY a`, execution times seems to be > about half than with the two formers. Adding or not adding a `WITHOUT > ROWID` gives the same. I give the number of rows, to show if the first one > is faster than the second one, that's not because it would returns less > rows, on the opposite, it returns a bit more then with grouping by `b`: > > * Grouping by `a` results into 1360 rows in about 40ms +/-3; > * Grouping by `b` results into 1170 rows in about 65ms +/-5; > * Grouping by `c` results into 3154 rows in about 90ms +/-4. > Run "explain query plan ..." [1] to see which plan SQLite chooses for your different queries. As [2] says "indexes are only useful if there are WHERE-clause constraints on the left-most columns of the index", with "left-most" being the keyword here. `a` is your left-most column, so it works best with it. But grouping by `b` may also benefits from the index, thanks to the "skip-scan" optimization. (see [2] again). if there are few distinct `a` values, the benefit of skip-scan improves. The NGQP [3] is more sensitive than the old planner to good statistics, to find the optimum plan. So do compare you plans and performance before and after running ANALYZE on your table(s). [1] https://www.sqlite.org/eqp.html [2] https://www.sqlite.org/optoverview.html#skipscan [3] https://www.sqlite.org/queryplanner-ng.html