Re: [HACKERS] Partition-wise aggregation/grouping

Jeevan Chalke Wed, 01 Nov 2017 05:56:53 -0700

On Sat, Oct 28, 2017 at 3:07 PM, Robert Haas <robertmh...@gmail.com> wrote:


> On Fri, Oct 27, 2017 at 1:01 PM, Jeevan Chalke
> <jeevan.cha...@enterprisedb.com> wrote:
> > 1. Added separate patch for costing Append node as discussed up-front in
> the
> > patch-set.
> > 2. Since we now cost Append node, we don't need
> > partition_wise_agg_cost_factor
> > GUC. So removed that. The remaining patch hence merged into main
> > implementation
> > patch.
> > 3. Updated rows in test-cases so that we will get partition-wise plans.
>
> With 0006 applied, cost_merge_append() is now a little bit confused:
>
>     /*
>      * Also charge a small amount (arbitrarily set equal to operator cost)
> per
>      * extracted tuple.  We don't charge cpu_tuple_cost because a
> MergeAppend
>      * node doesn't do qual-checking or projection, so it has less overhead
>      * than most plan nodes.
>      */
>     run_cost += cpu_operator_cost * tuples;
>
>     /* Add MergeAppend node overhead like we do it for the Append node */
>     run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;
>
> The first comment says that we don't add cpu_tuple_cost, and the
> second one then adds half of it anyway.
>

Yep.
But as David reported earlier, if we remove the first part i.e. adding
cpu_operator_cost per tuple, Merge Append will be preferred over an Append
node unlike before. And thus, I thought of better having both, but no so
sure. Should we remove that part altogether, or add both in a single
statement with updated comments?


> I think it's fine to have a #define for DEFAULT_APPEND_COST_FACTOR,
> because as you say it's used twice, but I don't think that should be
> exposed in cost.h; I'd make it private to costsize.c and rename it to
> something like APPEND_CPU_COST_MULTIPLIER.  The word DEFAULT, in
> particular, seems useless to me, since there's no provision for it to
> be overridden by a different value.
>

Agree. Will make that change.


>
> What testing, if any, can we think about doing with this plan to make
> sure it doesn't regress things?  For example, if we do a TPC-H run
> with partitioned tables and partition-wise join enabled, will any
> plans change with this patch?


I have tried doing this on my local developer machine. For 1GB database
size (tpc-h scale factor 1), I see no plan change with and without this
patch.

I have tried with scale factor 10, but query is not executing well due to
space and memory constraints. Can someone try out that?


>   Do they get faster or not?  Anyone have
> other ideas for what to test?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Re: [HACKERS] Partition-wise aggregation/grouping

Reply via email to