Re: [HACKERS] using custom scan nodes to prototype parallel sequential scan

Kouhei Kaigai Thu, 13 Nov 2014 23:26:02 -0800

> On Fri, Nov 14, 2014 at 2:12 PM, Kouhei Kaigai <[email protected]> wrote:
> 
> 
>       >       On 12 November 2014 07:54, David Rowley
> <[email protected]>
>       > Take int4_avg_accum() for example it does:
>       >
>       >
>       > transdata->count++;
>       > transdata->sum += newval;
>       >
>       > The merge function would need to perform something like:
>       >
> 
>       > transdata->count += transdata2merge.count; sum +=
> transdata2merge.sum;
>       >
>       > Then the final function could be called on the merged aggregate
> state.
>       >
>       More simplify, we can describe parallel aware aggregate function.
>       Please assume AVG(X) function that takes nrows and sum of X. Its
> transition
>       function performs as like you described above, then final function
> works as
>       usual.
> 
>       The job of parallel seq scan needs to do is:
>         1. replace AVG(X) by AVG(nrows, sum(X)
>         2. generate count(*) on the partial relation being grouped.
>         3. generate sum(X) on the partial relation being grouped.
> 
>       It looks like the following query:
>         SELECT AVG(nrows, sum_X) FROM (
>           SELECT count(*) nrows, sum(X) sum_X FROM tbl WHERE blkno
> between 0 and 999 GROUP BY cat
>           UNION
>           SELECT count(*) nrows, sum(X) sum_X FROM tbl WHERE blkno
> between 1000 and 1999 GROUP BY cat
>           UNION
>           SELECT count(*) nrows, sum(X) sum_X FROM tbl WHERE blkno
> between 2000 and 2999 GROUP BY cat
>           UNION
>           SELECT count(*) nrows, sum(X) sum_X FROM tbl WHERE blkno
> between 3000 and 3999 GROUP BY cat
>         );
> 
> 
> 
> 
> Well, this would require giving the planner some kind of knowledge of what
> AVG() is. It currently knows nothing about that. It currently calls the
> transition function for each row, and the final function at the end and
> does not care or need to care about what either of those functions actually
> does. The transformation above looks like it would need to care and the
> logic to add that would be way more complex than aggregate merge functions.
> 
It may make sense to have an extra catalog to indicate how usual aggregate
function can be broken down (or unavailable).
Like, AVG(X) = {COUNT(X) + SUM(X)} x N-partitions


> Likely for most aggregates, like count, sum, max, min, bit_and and bit_or
> the merge function would be the same as the transition function, as the
> state type is just the same as the input type. It would only be aggregates
> like avg(), stddev*(), bool_and() and bool_or() that would need a new merge
> function made... These would be no more complex than the transition
> functions... Which are just a few lines of code anyway.
> 
> We'd simply just not run parallel query if any aggregates used in the query
> didn't have a merge function.
> 
> When I mentioned this, I didn't mean to appear to be placing a road block.I
> was just bringing to the table the information that COUNT(*) + COUNT(*)
> works ok for merging COUNT(*)'s "sub totals", but AVG(n) + AVG(n) does not.
> 
>  Merge functions should be a simple patch to write. If I thought there was
> going to be a use for them in this release, I'd put my hand up to put a
> patch together.
> 
Things I'm uncertain is, how caller of aggregate function distinguish a context
to call usual translation function, or new merge function.

Let's back an example of:
  SELECT cat, COUNT(*), AVG(X) FROM t GROUP BY cat;

Its plan tree is probably as follows:
  HashAggregate
   Group Key: cat
   ->  Custom Scan (Parallel Scan)
         # of workers: 4

The caller of translation/merge/final function is HashAggregate node.
On the other hand, it has to know whether the underlying plan returns every
records of underlying table or sub-total by parallel scan.
Please correct me, if my assumption is wrong.

Once HashAggregate can know that sub-plan returns sub-total of the relation,
it can chose merge function instead of translation function.
However, what I want to clarify is how to inform HashAggregate node its sub-
plan intends to return sub-total, instead of individual rows.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <[email protected]>

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] using custom scan nodes to prototype parallel sequential scan

Reply via email to