On Sat, Oct 18, 2014 at 5:10 AM, Feng Tian <feng...@gmail.com> wrote:

> Hi,
>
> Consider the following queries.
>
> create table t(i int, j int, k int, t text);
> insert into t select i, i % 100, i % 1000, 'AABBCCDD' || i from
> generate_series(1, 1000000) i;
>
> ftian=# explain select count(distinct j) from t group by t, i;
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  GroupAggregate  (cost=158029.84..178029.84 rows=1000000 width=22)
>    ->  Sort  (cost=158029.84..160529.84 rows=1000000 width=22)
>          Sort Key: t, i
>          ->  Seq Scan on t  (cost=0.00..17352.00 rows=1000000 width=22)
> (4 rows)
>
>
> The query,
> select count(distinct j) from t group by t, i;
>
> runs for 35 seconds.  However, if I change the query to,
> select count(distinct j) from t group by i, t;  -- note the switched ordering
> select count(distinct j) from t group by decode(t, 'escape'), i;  -- convert t to bytea
>
> Run times are just about 5 and 6.5 seconds.  The reason is clear: comparing
> strings under a collation is slow, which is well understood by pg hackers.
> However, here, the sort order is forced by the planner, not the user.
> The planner could do the following optimizations:
>
> 1. for the sort generated for the sorted aggregate, the planner can switch
> the column ordering and put the int before the string,
> 2. for the sort generated for the sorted aggregate, use a bytea comparison
> instead of a string comparison.
>
> These would bring a big improvement to this common kind of query.  Is this
> something reasonable?
>
>
It seems like it might be worth looking into, but I think it's likely more
complex than simply reordering the GROUP BY columns narrowest-first.

If the query was:

select count(distinct j) from t group by t, i order by t, i;

then if that was rewritten to group by i, t, the planner would need to
perform an additional sort on t, i to get the correct order for the final
result. That may or may not be faster overall; it would depend on how many
groups there were to sort. Though I guess you could possibly just skip the
optimisation if the next node up didn't need any specific ordering.
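To illustrate, the rewrite would be equivalent to something like the
following (purely hypothetical, spelled out as SQL a user could type rather
than anything the planner does today):

-- original: the sort for the grouping also satisfies the ORDER BY
select count(distinct j) from t group by t, i order by t, i;

-- rewritten: a cheaper sort feeds the aggregate, but the grouped result
-- then needs an extra sort on (t, i) to produce the final output order
select c
from (select t, i, count(distinct j) as c from t group by i, t) s
order by t, i;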

I also wonder whether taking only the column's width into account is quite
enough. For example, if the 'i' column only had 1 distinct value, then
performing the group by on 'i' first wouldn't help at all. Keep in mind
that the columns could be much closer in width than in your example, e.g.
int and bigint. Perhaps you'd need some heuristics that look at both the
number of distinct values and the width, and form some sort of weighted
estimate about which column to put first.
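The planner would already have both numbers after an ANALYZE, so the
heuristic could be fed from the existing statistics; for this example,
something like:

select attname, n_distinct, avg_width
from pg_stats
where tablename = 't' and attname in ('i', 't');

would show the estimated number of distinct values and the average width
for each grouping column.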

Regards

David Rowley
