Andrew Gierth <and...@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <t...@sss.pgh.pa.us> writes:
>  Tom> I've committed this after significant editorialization --- most
>  Tom> notably, I pushed control of the sort step into the aggregate
>  Tom> support functions.

> Initial tests suggest that your version is ~40% slower than ours on
> some workloads.

I poked at this a bit with perf and oprofile, and concluded that probably
the difference comes from ordered_set_startup() repeating lookups for each
group that could be done just once per query.  I'm not sure I believe the
40% figure; on this particular test case, oprofile breaks down the costs
of ordered_set_startup() like this:

  29        0.0756  postgres                 advance_transition_function
  38307    99.9244  postgres                 ordered_set_transition
1445      1.0808  postgres                 ordered_set_startup
  31418    79.4990  postgres                 tuplesort_begin_datum
  4056     10.2632  postgres                 get_typlenbyvalalign
  1445      3.6564  postgres                 ordered_set_startup [self]
  734       1.8573  postgres                 MemoryContextAllocZero
  510       1.2905  postgres                 RegisterExprContextCallback
  283       0.7161  postgres                 exprType
  247       0.6250  postgres                 get_sortgroupclause_tle
  235       0.5946  postgres                 exprCollation
  92        0.2328  postgres                 ReleaseSysCache
  78        0.1974  postgres                 SearchSysCache
  71        0.1797  postgres                 AggGetAggref
  63        0.1594  postgres                 AggCheckCallContext
  58        0.1468  postgres                 AllocSetAlloc
  46        0.1164  postgres                 PrepareSortSupportFromOrderingOp
  43        0.1088  postgres                 AggGetPerAggEContext
  40        0.1012  postgres                 get_typlenbyval
  39        0.0987  postgres                 palloc0
  35        0.0886  postgres                 MemoryContextAlloc
  17        0.0430  postgres                 get_sortgroupref_tle
  10        0.0253  postgres                 tuplesort_begin_common

The tuplesort_begin_datum calls are needed regardless --- your version
was just doing them inside nodeAgg rather than externally.  However,
get_typlenbyvalalign and some of the other calls here are to fetch
information that doesn't change across groups; probably we could arrange
to cache that info instead of recomputing it each time.  Still, it doesn't
look like that could save more than 20% of ordered_set_startup, which
itself is still only a few percent of the total query time.

Looking at this example makes me wonder if it wouldn't be worthwhile to
provide a way to reset and re-use a tuplesort object, instead of redoing
all the lookup work involved.  Or maybe just find a way to cache the
catalog lookups that are happening inside tuplesort_begin_datum, which are
about 50% of that function's cost it looks like.  We're paying this same
kind of price for repeated tuplesort setup in the existing nodeAgg code,
if we have an aggregate with ORDER BY or DISTINCT in a grouped query with
many groups.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to