Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Mar 30, 2018 at 1:41 AM, Robert Haas wrote:
> On Thu, Mar 29, 2018 at 12:55 AM, Amit Kapila wrote:
>> I think that is within the acceptable range. I am seeing a few regression
>> failures with the patch series. The order of the targetlist seems to have
>> changed for Remote SQL. Kindly find the failure report attached. I have
>> requested my colleague Ashutosh Sharma to cross-verify this, and he is
>> seeing the same failures.
>
> Oops. Those just require an expected output change.
>
>> It seems UPPERREL_TLIST is redundant in the patch now. I think we can
>> remove it unless you have something else in mind.
>
> Yes.
>
>> I think the handling of partitioned rels looks okay, but we might want
>> to check its overhead once, unless you are sure that this shouldn't be a
>> problem. If you think we should check it once, is it possible to do that
>> as a separate patch, as it doesn't look to be directly linked to the main
>> patch? It can be treated as an optimization for partitionwise aggregates.
>> I think we can treat it along with the main patch as well, but it might
>> be somewhat simpler to verify if we do it separately.
>
> I don't think it should be a problem, although you're welcome to test
> it if you're concerned about it. I think it would probably be
> penny-wise and pound-foolish to worry about the overhead of
> eliminating the Result nodes, which can occur not only with
> partition-wise aggregate but also with partition-wise join and, I
> think, really any case where the top scan/join plan would be an Append
> node. We're talking about a very small amount of additional planning
> time to potentially get a better plan.
>
> I've committed all of these now.

Cool, I have closed the corresponding CF entry.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Mar 29, 2018 at 12:55 AM, Amit Kapila wrote:
> I think that is within the acceptable range. I am seeing a few regression
> failures with the patch series. The order of the targetlist seems to have
> changed for Remote SQL. Kindly find the failure report attached. I have
> requested my colleague Ashutosh Sharma to cross-verify this, and he is
> seeing the same failures.

Oops. Those just require an expected output change.

> It seems UPPERREL_TLIST is redundant in the patch now. I think we can
> remove it unless you have something else in mind.

Yes.

> I think the handling of partitioned rels looks okay, but we might want
> to check its overhead once, unless you are sure that this shouldn't be a
> problem. If you think we should check it once, is it possible to do that
> as a separate patch, as it doesn't look to be directly linked to the main
> patch? It can be treated as an optimization for partitionwise aggregates.
> I think we can treat it along with the main patch as well, but it might
> be somewhat simpler to verify if we do it separately.

I don't think it should be a problem, although you're welcome to test it if you're concerned about it. I think it would probably be penny-wise and pound-foolish to worry about the overhead of eliminating the Result nodes, which can occur not only with partition-wise aggregate but also with partition-wise join and, I think, really any case where the top scan/join plan would be an Append node. We're talking about a very small amount of additional planning time to potentially get a better plan.

I've committed all of these now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Mar 29, 2018 at 2:31 AM, Robert Haas wrote:
> On Wed, Mar 28, 2018 at 3:06 AM, Amit Kapila wrote:
>> The above block takes 43700.0289 ms on HEAD and 45025.3779 ms with the
>> patch, which is approximately a 3% regression.
>
> Thanks for the analysis -- the observation that this seemed to affect
> cases where CP_LABEL_TLIST gets passed to create_projection_plan
> allowed me to recognize that I was doing an unnecessary copyObject()
> call. Removing that seems to have reduced this regression below 1% in
> my testing.

I think that is within the acceptable range. I am seeing a few regression failures with the patch series. The order of the targetlist seems to have changed for Remote SQL. Kindly find the failure report attached. I have requested my colleague Ashutosh Sharma to cross-verify this, and he is seeing the same failures.

A few comments/suggestions:

1.
 typedef enum UpperRelationKind
 {
 	UPPERREL_SETOP,				/* result of UNION/INTERSECT/EXCEPT, if any */
+	UPPERREL_TLIST,				/* result of projecting final scan/join rel */
 	UPPERREL_PARTIAL_GROUP_AGG, /* result of partial grouping/aggregation,
 								 * if any */
 	UPPERREL_GROUP_AGG,			/* result of grouping/aggregation, if any */
...
...
 	/*
 	 * Save the various upper-rel PathTargets we just computed into
@@ -2003,6 +2004,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	root->upper_targets[UPPERREL_FINAL] = final_target;
 	root->upper_targets[UPPERREL_WINDOW] = sort_input_target;
 	root->upper_targets[UPPERREL_GROUP_AGG] = grouping_target;
+	root->upper_targets[UPPERREL_TLIST] = scanjoin_target;

It seems UPPERREL_TLIST is redundant in the patch now. I think we can remove it unless you have something else in mind.

2.
+	/*
+	 * If the relation is partitioned, recursively apply the same changes to
+	 * all partitions and generate new Append paths. Since Append is not
+	 * projection-capable, that might save a separate Result node, and it
+	 * also is important for partitionwise aggregate.
+	 */
+	if (rel->part_scheme && rel->part_rels)
+	{

I think the handling of partitioned rels looks okay, but we might want to check its overhead once, unless you are sure that this shouldn't be a problem. If you think we should check it once, is it possible to do that as a separate patch, as it doesn't look to be directly linked to the main patch? It can be treated as an optimization for partitionwise aggregates. I think we can treat it along with the main patch as well, but it might be somewhat simpler to verify if we do it separately.

> Also, keep in mind that we're talking about extremely small amounts of
> time here. On a trivial query that you're not even executing, you're
> seeing a difference of (45025.3779 - 43700.0289)/1000000 = 0.00132 ms
> per execution. Sure, it's still 3%, but it's 3% of the time in an
> artificial case where you don't actually run the query. In real life,
> nobody runs EXPLAIN in a tight loop a million times without ever
> running the query, because that's not a useful thing to do.

Agreed, but this was to ensure that we do not do this optimization at the cost of adding significant cycles to the planning time.

> The overhead on a realistic test case will be smaller. Furthermore, at
> least in my testing, there are now cases where this is faster than
> master. Now, I welcome further ideas for optimization, but a patch
> that makes some cases slightly slower while making others slightly
> faster, and also improving the quality of plans in some cases, is not
> to my mind an unreasonable thing.

Agreed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

regression.diffs
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Mar 28, 2018 at 3:06 AM, Amit Kapila wrote:
> Good idea; such an optimization will ensure that the cases reported
> above will not regress. However, isn't it somewhat defeating the idea
> you are using in the patch, which is to avoid modifying the path in
> place?

Sure, but you can't have everything. I don't think modifying the sortgroupref data in place is really quite the same thing as changing the pathtarget in place; the sortgroupref stuff is some extra information about the targets being computed, not really a change in targets per se. But in any case, if you want to eliminate extra work, then we've got to eliminate it...

> In any case, I think one will still see a regression in cases
> where this optimization doesn't apply. For example:
>
> DO $$
> DECLARE count integer;
> BEGIN
> For count In 1..1000000 Loop
> Execute 'explain Select sum(thousand) from tenk1 group by ten';
> END LOOP;
> END;
> $$;
>
> The above block takes 43700.0289 ms on HEAD and 45025.3779 ms with the
> patch, which is approximately a 3% regression.

Thanks for the analysis -- the observation that this seemed to affect cases where CP_LABEL_TLIST gets passed to create_projection_plan allowed me to recognize that I was doing an unnecessary copyObject() call. Removing that seems to have reduced this regression below 1% in my testing.

Also, keep in mind that we're talking about extremely small amounts of time here. On a trivial query that you're not even executing, you're seeing a difference of (45025.3779 - 43700.0289)/1000000 = 0.00132 ms per execution. Sure, it's still 3%, but it's 3% of the time in an artificial case where you don't actually run the query. In real life, nobody runs EXPLAIN in a tight loop a million times without ever running the query, because that's not a useful thing to do. The overhead on a realistic test case will be smaller. Furthermore, at least in my testing, there are now cases where this is faster than master.

Now, I welcome further ideas for optimization, but a patch that makes some cases slightly slower while making others slightly faster, and also improving the quality of plans in some cases, is not to my mind an unreasonable thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

0003-Rewrite-the-code-that-applies-scan-join-targets-to-p.patch
Description: Binary data

0002-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Description: Binary data

0001-Teach-create_projection_plan-to-omit-projection-wher.patch
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Mar 27, 2018 at 10:59 PM, Robert Haas wrote:
> On Tue, Mar 27, 2018 at 7:42 AM, Robert Haas wrote:
>> On Tue, Mar 27, 2018 at 1:45 AM, Amit Kapila wrote:
>>> If we don't want to go with the upperrel logic, then maybe we should
>>> consider just merging some of the other changes from my previous patch
>>> into the 0003* patch you have posted, and then see if it gets rid of
>>> all the cases where we are seeing a regression with this new approach.
>>
>> Which changes are you talking about?
>
> I realized that this version could be optimized in the case where the
> scanjoin_target and the topmost scan/join rel have the same
> expressions in the target list.

Good idea; such an optimization will ensure that the cases reported above will not regress. However, isn't it somewhat defeating the idea you are using in the patch, which is to avoid modifying the path in place? In any case, I think one will still see a regression in cases where this optimization doesn't apply. For example:

DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain Select sum(thousand) from tenk1 group by ten';
END LOOP;
END;
$$;

The above block takes 43700.0289 ms on HEAD and 45025.3779 ms with the patch, which is approximately a 3% regression. In this case the regression is smaller than in the previously reported cases, but I guess we might see a bigger regression if the number of columns is larger, or with a different set of queries. I think the cases which can slow down are those where we need to use a physical tlist in create_projection_plan and the caller has requested CP_LABEL_TLIST. I have checked the regression tests, and there seem to be more cases which will be impacted. Another such example from the regression tests is:

select count(*) >= 0 as ok from pg_available_extension_versions;

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Mar 27, 2018 at 7:42 AM, Robert Haas wrote:
> On Tue, Mar 27, 2018 at 1:45 AM, Amit Kapila wrote:
>> If we don't want to go with the upperrel logic, then maybe we should
>> consider just merging some of the other changes from my previous patch
>> into the 0003* patch you have posted, and then see if it gets rid of
>> all the cases where we are seeing a regression with this new approach.
>
> Which changes are you talking about?

I realized that this version could be optimized in the case where the scanjoin_target and the topmost scan/join rel have the same expressions in the target list. Here's a revised patch series that does that. For me, this is faster than master on your last test case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

0004-Rewrite-the-code-that-applies-scan-join-targets-to-p.patch
Description: Binary data

0003-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Description: Binary data

0002-CP_IGNORE_TLIST.patch
Description: Binary data

0001-Teach-create_projection_plan-to-omit-projection-wher.patch
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Mar 27, 2018 at 1:45 AM, Amit Kapila wrote:
> If we don't want to go with the upperrel logic, then maybe we should
> consider just merging some of the other changes from my previous patch
> into the 0003* patch you have posted, and then see if it gets rid of
> all the cases where we are seeing a regression with this new approach.

Which changes are you talking about?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Mar 27, 2018 at 3:08 AM, Robert Haas wrote:
> On Sat, Mar 24, 2018 at 9:40 AM, Amit Kapila wrote:
>> For me, it is equivalent to master. The average of ten runs on master
>> is 20664.3683, and with all the patches applied it is 20590.4734. I
>> think there is some run-to-run variation, but more or less there is no
>> visible degradation. I think we have found the root cause and
>> eliminated it. OTOH, I have found another case where the new patch
>> series seems to degrade.
>
> All right, I have scaled my ambitions back further. Here is a revised
> and slimmed-down version of the patch series.

It still didn't help much. I am seeing a similar regression in one of the tests [1] posted previously.

> If we forget about "Remove explicit path construction logic in
> create_ordered_paths" for now, then we don't really need a new
> upperrel. So this patch just modifies the toplevel scan/join rel in
> place, which should avoid a bunch of overhead in add_path() and other
> places, while hopefully still fixing the originally-reported problem.

If we don't want to go with the upperrel logic, then maybe we should consider just merging some of the other changes from my previous patch into the 0003* patch you have posted, and then see if it gets rid of all the cases where we are seeing a regression with this new approach. I think that with this approach you want to target the problem of partitionwise aggregate, but maybe we can deal with that in a separate patch.

[1] - Test case
--
DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain Select count(ten) from tenk1';
END LOOP;
END;
$$;

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Sat, Mar 24, 2018 at 9:40 AM, Amit Kapila wrote:
> For me, it is equivalent to master. The average of ten runs on master
> is 20664.3683, and with all the patches applied it is 20590.4734. I
> think there is some run-to-run variation, but more or less there is no
> visible degradation. I think we have found the root cause and
> eliminated it. OTOH, I have found another case where the new patch
> series seems to degrade.

All right, I have scaled my ambitions back further. Here is a revised and slimmed-down version of the patch series. If we forget about "Remove explicit path construction logic in create_ordered_paths" for now, then we don't really need a new upperrel. So this patch just modifies the toplevel scan/join rel in place, which should avoid a bunch of overhead in add_path() and other places, while hopefully still fixing the originally-reported problem. I haven't tested this beyond verifying that it passes the regression tests, but I've run out of time for today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

0004-Rewrite-the-code-that-applies-scan-join-targets-to-p.patch
Description: Binary data

0003-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Description: Binary data

0002-CP_IGNORE_TLIST.patch
Description: Binary data

0001-Teach-create_projection_plan-to-omit-projection-wher.patch
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Sat, Mar 24, 2018 at 8:41 AM, Robert Haas wrote:
> On Fri, Mar 23, 2018 at 12:12 AM, Amit Kapila wrote:
>> Yeah, sometimes that kind of stuff changes performance
>> characteristics, but I think what is going on here is that
>> create_projection_plan is causing the lower node to build a physical
>> tlist, which takes some additional time. I have tried the below change
>> on top of the patch series, and it brings back the performance for me.
>
> I tried another approach inspired by this, which is to skip building
> the child scan tlist altogether if it will just be replaced. See 0006.
> In testing here, that seems to be a bit better than your proposal, but
> I wonder what your results will be.
> ..
> It looks in my testing like this still underperforms master on your
> test case. Do you get the same result?

For me, it is equivalent to master. The average of ten runs on master is 20664.3683, and with all the patches applied it is 20590.4734. I think there is some run-to-run variation, but more or less there is no visible degradation. I think we have found the root cause and eliminated it. OTOH, I have found another case where the new patch series seems to degrade.

Test case
--
DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain Select count(ten) from tenk1';
END LOOP;
END;
$$;

The average of ten runs on master is 31593.9533, and with all the patches applied it is 34008.7341, so the patch takes approximately 7.6% more time. I think this patch series is doing something costly in a common code path. I am also worried that the new code proposed by you in the 0003* patch might degrade planner performance for partitioned rels, though I have not tested it yet. It is difficult to say without testing, but before going there, I think we should first investigate what is happening in the non-partitioned case.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Mar 23, 2018 at 12:12 AM, Amit Kapila wrote:
> Yeah, sometimes that kind of stuff changes performance characteristics,
> but I think what is going on here is that create_projection_plan is
> causing the lower node to build a physical tlist, which takes some
> additional time. I have tried the below change on top of the patch
> series, and it brings back the performance for me.

I tried another approach inspired by this, which is to skip building the child scan tlist altogether if it will just be replaced. See 0006. In testing here, that seems to be a bit better than your proposal, but I wonder what your results will be.

> Before returning subplan, don't we need to copy the cost estimates
> from best_path, as is done in the same function a few lines later?

The new 0006 takes care of this, too. Really, the new 0006 should be merged into 0001, but I kept it separate for now. So, rebased:

0001 - More or less as before.

0002 - More or less as before.

0003 - Rewritten in the wake of partitionwise aggregate, as the tlist_rel must be partitioned in order for partitionwise aggregate to work. Quite pleasingly, this eliminates a bunch of Result nodes from the partitionwise join test results. Overall, I find this quite a bit cleaner than the present code (leaving performance aside for the moment). In the process of writing this I realized that the partitionwise aggregate code doesn't look like it handles SRFs properly, so if this doesn't get committed we'll have to fix that problem some other way.

0004 - A little different than before as a result of the extensive changes in 0003.

0005 - Also different, and revealing another defect in partitionwise aggregate, as noted in the commit message.

0006 - Introduce CP_IGNORE_TLIST; an optimization of 0001.

0007 - Use a NULL relids set for the toplevel tlist upper rel. This seems to be slightly faster than the other way. This is an optimization of 0003.

It looks in my testing like this still underperforms master on your test case. Do you get the same result? Any other ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

0007-Use-NULL-relids-set.patch
Description: Binary data

0006-CP_IGNORE_TLIST.patch
Description: Binary data

0005-Remove-explicit-path-construction-logic-in-create_or.patch
Description: Binary data

0004-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Description: Binary data

0003-Add-new-upper-rel-to-represent-projecting-toplevel-s.patch
Description: Binary data

0002-Adjust-use_physical_tlist-to-work-on-upper-rels.patch
Description: Binary data

0001-Teach-create_projection_plan-to-omit-projection-wher.patch
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Mar 20, 2018 at 1:23 AM, Robert Haas wrote:
> On Sat, Mar 17, 2018 at 1:16 AM, Amit Kapila wrote:
>> Test-1
>> --
>> DO $$
>> DECLARE count integer;
>> BEGIN
>> For count In 1..1000000 Loop
>> Execute 'explain Select ten from tenk1';
>> END LOOP;
>> END;
>> $$;
>>
>> In the above block, I am explaining a simple statement which will have
>> just one path, so there will be one additional path projection and
>> removal cycle for this statement. I executed the above block in psql
>> with the \timing option on; the average timing for ten runs on HEAD is
>> 21292.388 ms, with patches (0001.* ~ 0003.*) it is 22405.2466 ms, and
>> with patches (0001.* ~ 0005.*) it is 22537.1362 ms. These results
>> indicate approximately a 5~6% increase in planning time.
>
> Ugh. I'm able to reproduce this, more or less -- with master, this
> test took 42089.484 ms, 41935.849 ms, 42519.336 ms on my laptop, but
> with 0001-0003 applied, 43925.959 ms, 43619.004 ms, 43648.426 ms.
> However, I have a feeling there's more going on here, because the
> following patch on top of 0001-0003 made the time go back down to
> 42353.548 ms, 41797.757 ms, 41891.194 ms.
> ..
> It seems pretty obvious that creating an extra projection path that is
> just thrown away can't "really" be making this faster, so there's
> evidently some other effect here involving how the code is laid out,
> or CPU cache effects, or, uh, something.

Yeah, sometimes that kind of stuff changes performance characteristics, but I think what is going on here is that create_projection_plan is causing the lower node to build a physical tlist, which takes some additional time. I have tried the below change on top of the patch series, and it brings back the performance for me.

@@ -1580,7 +1580,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
 	List	   *tlist;

 	/* Since we intend to project, we don't need to constrain child tlist */
-	subplan = create_plan_recurse(root, best_path->subpath, 0);
+	subplan = create_plan_recurse(root, best_path->subpath, flags);

Another point I have noticed in the 0001-Teach-create_projection_plan-to-omit-projection-wher patch:

-create_projection_plan(PlannerInfo *root, ProjectionPath *best_path)
+create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
 {
..
+	else if ((flags & CP_LABEL_TLIST) != 0)
+	{
+		tlist = copyObject(subplan->targetlist);
+		apply_pathtarget_labeling_to_tlist(tlist, best_path->path.pathtarget);
+	}
+	else
+		return subplan;
..
 }

Before returning subplan, don't we need to copy the cost estimates from best_path, as is done in the same function a few lines later?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Sat, Mar 17, 2018 at 1:16 AM, Amit Kapila wrote:
> Test-1
> --
> DO $$
> DECLARE count integer;
> BEGIN
> For count In 1..1000000 Loop
> Execute 'explain Select ten from tenk1';
> END LOOP;
> END;
> $$;
>
> In the above block, I am explaining a simple statement which will have
> just one path, so there will be one additional path projection and
> removal cycle for this statement. I executed the above block in psql
> with the \timing option on; the average timing for ten runs on HEAD is
> 21292.388 ms, with patches (0001.* ~ 0003.*) it is 22405.2466 ms, and
> with patches (0001.* ~ 0005.*) it is 22537.1362 ms. These results
> indicate approximately a 5~6% increase in planning time.

Ugh. I'm able to reproduce this, more or less -- with master, this test took 42089.484 ms, 41935.849 ms, 42519.336 ms on my laptop, but with 0001-0003 applied, 43925.959 ms, 43619.004 ms, 43648.426 ms. However, I have a feeling there's more going on here, because the following patch on top of 0001-0003 made the time go back down to 42353.548 ms, 41797.757 ms, 41891.194 ms.

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index bf0b3e75ea..0542b3e40b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1947,12 +1947,19 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	{
 		Path	   *subpath = (Path *) lfirst(lc);
 		Path	   *path;
+		Path	   *path2;

 		Assert(subpath->param_info == NULL);
-		path = (Path *) create_projection_path(root, tlist_rel,
-											   subpath, scanjoin_target);
-		add_path(tlist_rel, path);
+		path2 = (Path *) create_projection_path(root, tlist_rel,
+												subpath, scanjoin_target);
+		path = (Path *) apply_projection_to_path(root, tlist_rel,
+												 subpath, scanjoin_target);
+		if (path == path2)
+			elog(ERROR, "oops");
+		lfirst(lc) = path;
 	}
+	tlist_rel->pathlist = current_rel->pathlist;
+	current_rel->pathlist = NIL;

 	/*
 	 * If we can produce partial paths for the tlist rel, for possible use

It seems pretty obvious that creating an extra projection path that is just thrown away can't "really" be making this faster, so there's evidently some other effect here involving how the code is laid out, or CPU cache effects, or, uh, something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Mar 16, 2018 at 3:36 PM, Amit Kapila wrote:
> On Wed, Mar 14, 2018 at 12:01 AM, Robert Haas wrote:
>> 0003 introduces a new upper relation to represent the result of
>> applying the scan/join target to the topmost scan/join relation. I'll
>> explain further down why this seems to be needed. Since each path
>> must belong to only one relation, this involves using
>> create_projection_path() for the non-partial pathlist as we already do
>> for the partial pathlist, rather than apply_projection_to_path().
>> This is probably marginally more expensive, but I'm hoping it doesn't
>> matter. (However, I have not tested.)
>
> I think in the patch series this is the questionable patch, wherein it
> will always add an additional projection path (whether or not it is
> required) to all paths (partial and non-partial) for the scanjoin rel
> and then later remove it (if not required) in create_projection_plan.
> As you are saying, I also think it might not matter much in the grand
> scheme of things, and if required we can test it as well.

I have done some tests to see the impact of this patch on planning time. I took some simple statements and measured the time they took in planning.

Test-1
--
DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain Select ten from tenk1';
END LOOP;
END;
$$;

In the above block, I am explaining a simple statement which will have just one path, so there will be one additional path projection and removal cycle for this statement. I executed the above block in psql with the \timing option on; the average timing for ten runs on HEAD is 21292.388 ms, with patches (0001.* ~ 0003.*) it is 22405.2466 ms, and with patches (0001.* ~ 0005.*) it is 22537.1362 ms. These results indicate approximately a 5~6% increase in planning time.

Test-2
--
DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain select hundred,ten from tenk1 order by hundred';
END LOOP;
END;
$$;

In the above block, I am explaining a statement which will have two paths, so there will be two additional path projections and one removal cycle for one of the selected paths. The average timing for ten runs on HEAD is 32869.8343 ms, with patches (0001.* ~ 0003.*) it is 34068.0608 ms, and with patches (0001.* ~ 0005.*) it is 34097.4899 ms. These results indicate approximately a 3~4% increase in optimizer time. Ideally, this test should have shown more impact, as we are adding an additional projection path for two paths, but I think because the overall planning time is higher, the impact of the additional work is less visible.

I have done these tests on a CentOS VM, so there is some variation in the test results. Please find attached the detailed results of all the tests. I have not changed any configuration for these tests. I think before reaching any conclusion, it would be better if someone repeats these tests and sees whether they make a similar observation. The reason for doing the tests separately for the first three patches (0001.* ~ 0003.*) is to see the impact of the changes without any change related to parallelism.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Test-1
-
DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain Select ten from tenk1';
END LOOP;
END;
$$;

Head
----
Time: 21228.829 ms (00:21.229)
Time: 21353.244 ms (00:21.353)
Time: 21758.958 ms (00:21.759)
Time: 20999.708 ms (00:21.000)
Time: 21804.451 ms (00:21.804)
Time: 20851.715 ms (00:20.852)
Time: 21514.198 ms (00:21.514)
Time: 20677.121 ms (00:20.677)
Time: 21685.576 ms (00:21.686)
Time: 21050.080 ms (00:21.050)

Patches (upto 0003)
-------------------
Time: 23425.923 ms (00:23.426)
Time: 21609.624 ms (00:21.610)
Time: 21865.652 ms (00:21.866)
Time: 22006.988 ms (00:22.007)
Time: 22427.953 ms (00:22.428)
Time: 23195.211 ms (00:23.195)
Time: 22692.508 ms (00:22.693)
Time: 21757.227 ms (00:21.757)
Time: 22574.941 ms (00:22.575)
Time: 22496.439 ms (00:22.496)
Time: 21648.699 ms (00:21.649)

Patches (upto 0005)
-------------------
Time: 23003.406 ms (00:23.003)
Time: 22624.973 ms (00:22.625)
Time: 22426.394 ms (00:22.426)
Time: 22700.590 ms (00:22.701)
Time: 22452.472 ms (00:22.452)
Time: 23048.367 ms (00:23.048)
Time: 21845.491 ms (00:21.845)
Time: 22832.643 ms (00:22.833)
Time: 21995.769 ms (00:21.996)
Time: 22441.257 ms (00:22.441)

Test-2
------
DO $$
DECLARE count integer;
BEGIN
For count In 1..1000000 Loop
Execute 'explain select hundred,ten from tenk1 order by hundred';
END LOOP;
END;
$$;

Head
----
Time: 32229.360 ms (00:32.229)
Time: 33310.348 ms (00:33.310)
Time: 32712.002 ms (00:32.712)
Time: 32630.473 ms (00:32.630)
Time: 33097.111 ms (00:33.097)
Time: 32487.799 ms (00:32.488)
Time: 33195.718 ms (00:33.196)
Time: 33166.606 ms (00:33.167)
Time: 32533.331 ms (00:32.533)
Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Mar 16, 2018 at 6:06 AM, Amit Kapila wrote:
> I think in the patch series this is the questionable patch, wherein it
> will always add an additional projection path (whether or not it is
> required) to all paths (partial and non-partial) for the scanjoin rel
> and then later remove it (if not required) in create_projection_plan.
> As you are saying, I also think it might not matter much in the grand
> scheme of things, and if required we can test it as well; however, I
> think it is better if some other people can also share their opinion
> on this matter.
>
> Tom, do you have anything to say?

I forgot to include part of the explanation in my previous email. The reason it has to work this way is that, of course, you can't include the same path in the pathlist of relation B as you put into the pathlist of relation A; if you do, then you will be in trouble if a later addition to the pathlist of relation B kicks that path out, because it will get pfree'd, leaving a garbage pointer in the list for A, which you may subsequently reference.

At one point, I tried to solve the problem by sorting the cheapest partial path from the scan/join rel and putting the resulting path back into the same pathlist, but that also fails: in some cases, the new path dominates all of the existing paths because it is better-sorted and only very slightly more expensive. So, when you call add_partial_path() for the sort path, all of the previous paths -- including the one the sort path is pointing to as its subpath! -- get pfree'd. So without another relation, and the projection paths, I couldn't get to a point where I didn't have to modify paths after creating them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Mar 14, 2018 at 12:01 AM, Robert Haas wrote: > > 0003 introduces a new upper relation to represent the result of > applying the scan/join target to the topmost scan/join relation. I'll > explain further down why this seems to be needed. Since each path > must belong to only one relation, this involves using > create_projection_path() for the non-partial pathlist as we already do > for the partial pathlist, rather than apply_projection_to_path(). > This is probably marginally more expensive but I'm hoping it doesn't > matter. (However, I have not tested.) > I think in the patch series this is the questionable patch wherein it will always add an additional projection path (whether or not it is required) to all Paths (partial and non-partial) for the scan/join rel and then later remove it (if not required) in create_projection_plan. As you are saying, I also think it might not matter much in the grand scheme of things and if required we can test it as well; however, I think it is better if some other people can also share their opinion on this matter. Tom, do you have anything to say? Each non-partial path in the > topmost scan/join rel produces a corresponding path in the new upper > rel. The same is done for partial paths if the scan/join target is > parallel-safe; otherwise we can't. > > 0004 causes the main part of the planner to skip calling > generate_gather_paths() from the topmost scan/join rel. This logic is > mostly stolen from your patch. If the scan/join target is NOT > parallel-safe, we perform generate_gather_paths() on the topmost > scan/join rel. If the scan/join target IS parallel-safe, we do it on > the post-projection rel introduced by 0003 instead. This is the patch > that actually fixes Jeff's original complaint. > Looks good, I feel you can include the test from my patch as well. > 0005 goes a bit further and removes a bunch of logic from > create_ordered_paths(). 
The removed logic tries to satisfy the > query's ordering requirement via cheapest_partial_path + Sort + Gather > Merge. Instead, it adds an additional path to the new upper rel added > in 0003. This again avoids a call to apply_projection_to_path() which > could cause projection to be pushed down after costing has already > been fixed. Therefore, it gains the same advantages as 0004 for > queries that would use this sort of plan. > After this patch, part of the sort-related work will be done in create_ordered_paths and the rest in grouping_planner, which looks a bit odd; otherwise, I don't see any problem with it. > Currently, this loses the > ability to set limit_tuples for the Sort path; that doesn't look too > hard to fix but I haven't done it. If we decide to proceed with this > overall approach I'll see about getting it sorted out. > Okay, that makes sense. > Unfortunately, when I initially tried this approach, a number of > things broke due to the fact that create_projection_path() was not > exactly equivalent to apply_projection_to_path(). This initially > surprised me, because create_projection_plan() contains logic to > eliminate the Result node that is very similar to the logic in > apply_projection_to_path(). If apply_projection_to_path() sees that the > subordinate node is projection-capable, it applies the revised target > list to the child; if not, it adds a Result node. > create_projection_plan() does the same thing. However, > create_projection_plan() can lose the physical tlist optimization for > the subordinate node; it forces an exact projection even if the parent > node doesn't require this. 0001 fixes this, so that we get the same > plans with create_projection_path() that we would with > apply_projection_to_path(). I think this patch is a good idea > regardless of what we decide to do about the rest of this, because it > can potentially avoid losing the physical-tlist optimization in any > place where create_projection_path() is used. 
> > It turns out that 0001 doesn't manage to preserve the physical-tlist > optimization when the projection path is attached to an upper > relation. 0002 fixes this. > I have done some basic verification of 0001 and 0002, and will do further review/tests if I don't see any objection from anyone else about the overall approach. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Feb 14, 2018 at 5:37 AM, Amit Kapila wrote: > Your concern is valid, but doesn't the same problem exist in the other > approach as well, because in that also we can call > adjust_paths_for_srfs after generating the gather path, which means that we > might lose some opportunity to reduce the relative cost of parallel > paths due to tlists having SRFs. Also, a similar problem can happen > in create_ordered_paths for the cases as described in the example > above. You're right. I think cleaning all of this up for v11 is too much to consider, but please tell me your opinion of the attached patch set. Here, instead of ripping the problematic logic out of apply_projection_to_path, what I've done is just removed several of the callers of apply_projection_to_path. I think that the work of removing other callers of that function could be postponed to a future release, but we'd still get some benefit now, and this shows the direction I have in mind. I'm going to explain what the patches do one by one, but out of order, because I backed into the need for the earlier patches as a result of troubleshooting the later ones in the series. Note that the patches need to be applied in order even though I'm explaining them out of order. 0003 introduces a new upper relation to represent the result of applying the scan/join target to the topmost scan/join relation. I'll explain further down why this seems to be needed. Since each path must belong to only one relation, this involves using create_projection_path() for the non-partial pathlist as we already do for the partial pathlist, rather than apply_projection_to_path(). This is probably marginally more expensive but I'm hoping it doesn't matter. (However, I have not tested.) Each non-partial path in the topmost scan/join rel produces a corresponding path in the new upper rel. The same is done for partial paths if the scan/join target is parallel-safe; otherwise we can't. 
0004 causes the main part of the planner to skip calling generate_gather_paths() from the topmost scan/join rel. This logic is mostly stolen from your patch. If the scan/join target is NOT parallel-safe, we perform generate_gather_paths() on the topmost scan/join rel. If the scan/join target IS parallel-safe, we do it on the post-projection rel introduced by 0003 instead. This is the patch that actually fixes Jeff's original complaint. 0005 goes a bit further and removes a bunch of logic from create_ordered_paths(). The removed logic tries to satisfy the query's ordering requirement via cheapest_partial_path + Sort + Gather Merge. Instead, it adds an additional path to the new upper rel added in 0003. This again avoids a call to apply_projection_to_path() which could cause projection to be pushed down after costing has already been fixed. Therefore, it gains the same advantages as 0004 for queries that would use this sort of plan. Currently, this loses the ability to set limit_tuples for the Sort path; that doesn't look too hard to fix but I haven't done it. If we decide to proceed with this overall approach I'll see about getting it sorted out. Unfortunately, when I initially tried this approach, a number of things broke due to the fact that create_projection_path() was not exactly equivalent to apply_projection_to_path(). This initially surprised me, because create_projection_plan() contains logic to eliminate the Result node that is very similar to the logic in apply_projection_to_path(). If apply_projection_to_path() sees that the subordinate node is projection-capable, it applies the revised target list to the child; if not, it adds a Result node. create_projection_plan() does the same thing. However, create_projection_plan() can lose the physical tlist optimization for the subordinate node; it forces an exact projection even if the parent node doesn't require this. 
0001 fixes this, so that we get the same plans with create_projection_path() that we would with apply_projection_to_path(). I think this patch is a good idea regardless of what we decide to do about the rest of this, because it can potentially avoid losing the physical-tlist optimization in any place where create_projection_path() is used. It turns out that 0001 doesn't manage to preserve the physical-tlist optimization when the projection path is attached to an upper relation. 0002 fixes this. If we decide to go forward with this approach, it may make sense to merge some of these when actually committing, but I found it useful to separate them for development and testing purposes, and for clarity about what was getting changed at each stage. Please let me know your thoughts. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company 0005-Remove-explicit-path-construction-logic-in-create_or.patch Description: Binary data 0004-Postpone-generate_gather_paths-for-topmost-scan-join.patch Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Sun, Mar 11, 2018 at 5:49 PM, Amit Kapila wrote: > >> +/* Add projection step if needed */ >> +if (target && simple_gather_path->pathtarget != target) >> >> If the target was copied someplace, this test will fail. Probably we want to >> check the contents of the PathTarget structure? Right now copy_pathtarget() is >> not >> called from many places and all those places modify the copied target. So >> this >> isn't a problem. But we can't guarantee that in the future. Similar comment for >> gather_merge path creation. >> > > I am not sure if there is any use of copying the path target unless > you want to modify it, but anyway we use a check similar to what is > used in the patch in multiple places in the code. See > create_ordered_paths. So, we need to change all those places first if > we sense any such danger. Even if the test fails and we add a projection path here, while creating the plan we avoid adding a Result node when the projection target and underlying plan's target look the same (create_projection_plan()), so this works. An advantage with this simple check (although it miscosts the projection) is that we don't do an expensive target equality check for every path. The expensive check happens only on the chosen path. > >> >> deallocate tenk1_count; >> +explain (costs off) select ten, costly_func(ten) from tenk1; >> >> verbose output will show that the parallel seq scan's targetlist has >> costly_func() in it. Isn't that what we want to test here? >> > > Not exactly, we want to just test whether the parallel plan is > selected when the costly function is used in the target list. A parallel plan may be selected whether or not a costly function exists in the targetlist, if the underlying scan is optimal with a parallel scan. AFAIU, this patch is about pushing down the costly functions into the parallel scan's targetlist. I think that can be verified only by looking at the targetlist of the parallel seq scan plan. The solution here addresses only the parallel scan requirement. 
In the future, if we implement a solution which also addresses the requirements of FDWs and custom plans (i.e. the ability to handle targetlists in an FDW or custom plan), the changes made here will need to be reverted. That would be a painful exercise. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat wrote: > > After recent commits, the patch doesn't get applied cleanly, so I rebased it and along the way addressed the comments raised by you. > Here are some comments on the patch. > > +/* > + * Except for the topmost scan/join rel, consider gathering > + * partial paths. We'll do the same for the topmost > scan/join > This function only works on join relations. Mentioning scan rel is confusing. > Okay, removed the 'scan' word from the comment. > index 6e842f9..5206da7 100644 > --- a/src/backend/optimizer/path/allpaths.c > +++ b/src/backend/optimizer/path/allpaths.c > @@ -481,14 +481,21 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, > } > > + * > + * Also, if this is the topmost scan/join rel (that is, the only > baserel), > + * we postpone this until the final scan/join targetlist is available (see > > Mentioning join rel here is confusing since we deal with base relations here. > Okay, removed the 'join' word from the comment. > +bms_membership(root->all_baserels) != BMS_SINGLETON) > > set_tablesample_rel_pathlist() is also using this method to decide whether > there are any joins in the query. Maybe macro-ize this and use that macro at > these two places? > Maybe, but I am not sure if it improves the readability. I am open to changing it if somebody else also feels it is better to macro-ize this usage. > - * for the specified relation. (Otherwise, add_partial_path might delete a > + * for the specified relation. (Otherwise, add_partial_path might delete a > > Unrelated change? > Oops, removed. > +/* Add projection step if needed */ > +if (target && simple_gather_path->pathtarget != target) > > If the target was copied someplace, this test will fail. Probably we want to > check the contents of the PathTarget structure? Right now copy_pathtarget() is > not > called from many places and all those places modify the copied target. So this > isn't a problem. But we can't guarantee that in the future. 
Similar comment for > gather_merge path creation. > I am not sure if there is any use of copying the path target unless you want to modify it, but anyway we use a check similar to what is used in the patch in multiple places in the code. See create_ordered_paths. So, we need to change all those places first if we sense any such danger. > +simple_gather_path = apply_projection_to_path(root, > + rel, > + simple_gather_path, > + target); > + > > Why don't we incorporate those changes in create_gather_path() by passing it > the projection target instead of rel->reltarget? Similar comment for > gather_merge path creation. > Nothing important, just for the sake of code consistency; some other places in the code use it this way. See create_ordered_paths. Also, I am not sure if there is any advantage of doing it inside create_gather_path. > +/* > + * Except for the topmost scan/join rel, consider gathering > + * partial paths. We'll do the same for the topmost scan/join > rel > > Mentioning scan rel is confusing here. > Okay, changed. > > deallocate tenk1_count; > +explain (costs off) select ten, costly_func(ten) from tenk1; > > verbose output will show that the parallel seq scan's targetlist has > costly_func() in it. Isn't that what we want to test here? > Not exactly, we want to just test whether the parallel plan is selected when the costly function is used in the target list. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com parallel_paths_include_tlist_cost_v9.patch Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Mon, Feb 19, 2018 at 9:56 AM, Ashutosh Bapat wrote: > On Mon, Feb 19, 2018 at 9:35 AM, Ashutosh Bapat > wrote: >> On Sat, Feb 17, 2018 at 8:17 AM, Amit Kapila wrote: >>> On Fri, Feb 16, 2018 at 9:29 AM, Ashutosh Bapat >>> wrote: On Thu, Feb 15, 2018 at 7:47 PM, Amit Kapila wrote: > On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat > wrote: >> I happened to look at the patch for something else. But here are some >> comments. If any of those have been already discussed, please feel >> free to ignore. I have gone through the thread cursorily, so I might >> have missed something. >> >> In grouping_planner() we call query_planner() first which builds the >> join tree and creates paths, then calculate the target for scan/join >> rel which is applied on the topmost scan join rel. I am wondering >> whether we can reverse this order to calculate the target list of >> scan/join relation and pass it to standard_join_search() (or the hook) >> through query_planner(). >> > > I think the reason for doing it in that order is that we can't compute > target's width till after query_planner(). See the below note in > code: > > /* > * Convert the query's result tlist into PathTarget format. > * > * Note: it's desirable to not do this till after query_planner(), > * because the target width estimates can use per-Var width numbers > * that were obtained within query_planner(). > */ > final_target = create_pathtarget(root, tlist); > > Now, I think we can try to juggle the code in a way that the width can > be computed later, but the rest can be done earlier. /* Convenience macro to get a PathTarget with valid cost/width fields */ #define create_pathtarget(root, tlist) \ set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist)) create_pathtarget already works that way. We will need to split it. Create the Pathtarget without widths. 
Apply width estimates once we know the width of Vars somewhere here in query_planner() /* * We should now have size estimates for every actual table involved in * the query, and we also know which if any have been deleted from the * query by join removal; so we can compute total_table_pages. * * Note that appendrels are not double-counted here, even though we don't * bother to distinguish RelOptInfos for appendrel parents, because the * parents will still have size zero. * The next step is building the join tree. Set the pathtarget before that. > However, I think > that will be a somewhat major change I agree. > still won't address all kinds of > cases (like for ordered paths) unless we can try to get all kinds of > targets pushed down in the call stack. I didn't understand that. >>> >>> The places where we use a target different than the target which is >>> pushed via query planner can cause a similar problem (ex. see the >>> usage of adjust_paths_for_srfs) because the cost of that target >>> wouldn't be taken into consideration for Gather's costing. >>> >> >> Right. Right now apply_projection_to_path() or adjust_paths_for_srfs() >> do not take into consideration the type of path whose cost is being >> adjusted for the new targetlist. That works for most of the paths but >> not all the paths like custom, FDW or parallel paths. The idea I am >> proposing is to compute final targetlist before query planner so that >> it's available when we create paths for the topmost scan/join >> relation. That way, any path creation logic can then take advantage of >> this list to compute costs, not just parallel paths. > > In fact, we should do this not just for scan/join relations, but for all > the upper relations as well. Upper relations too create parallel > paths, whose costs need to be adjusted after their targetlists are > updated by adjust_paths_for_srfs(). Similar adjustments are needed for > any FDWs, custom paths which cost targetlists differently. 
> I think any such change in the planner can be quite tricky and can lead to a lot of work. I am not denying that it is possible to think along the lines you are suggesting, but OTOH, I don't see it as a realistic approach for this patch, where we can deal with the majority of cases with a much smaller patch. In the future, if you or someone else comes up with a patch along those lines for some other purpose (considering it is feasible to do so, which I am not completely sure of), then we can adjust things for parallel paths as well. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Mon, Feb 19, 2018 at 9:35 AM, Ashutosh Bapat wrote: > On Sat, Feb 17, 2018 at 8:17 AM, Amit Kapila wrote: >> On Fri, Feb 16, 2018 at 9:29 AM, Ashutosh Bapat >> wrote: >>> On Thu, Feb 15, 2018 at 7:47 PM, Amit Kapila >>> wrote: On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat wrote: > I happened to look at the patch for something else. But here are some > comments. If any of those have been already discussed, please feel > free to ignore. I have gone through the thread cursorily, so I might > have missed something. > > In grouping_planner() we call query_planner() first which builds the > join tree and creates paths, then calculate the target for scan/join > rel which is applied on the topmost scan join rel. I am wondering > whether we can reverse this order to calculate the target list of > scan/join relation and pass it to standard_join_search() (or the hook) > through query_planner(). > I think the reason for doing it in that order is that we can't compute target's width till after query_planner(). See the below note in code: /* * Convert the query's result tlist into PathTarget format. * * Note: it's desirable to not do this till after query_planner(), * because the target width estimates can use per-Var width numbers * that were obtained within query_planner(). */ final_target = create_pathtarget(root, tlist); Now, I think we can try to juggle the code in a way that the width can be computed later, but the rest can be done earlier. >>> >>> /* Convenience macro to get a PathTarget with valid cost/width fields */ >>> #define create_pathtarget(root, tlist) \ >>> set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist)) >>> create_pathtarget already works that way. We will need to split it. >>> >>> Create the Pathtarget without widths. 
Apply width estimates once we >>> know the width of Vars somewhere here in query_planner() >>> /* >>> * We should now have size estimates for every actual table involved in >>> * the query, and we also know which if any have been deleted from the >>> * query by join removal; so we can compute total_table_pages. >>> * >>> * Note that appendrels are not double-counted here, even though we >>> don't >>> * bother to distinguish RelOptInfos for appendrel parents, because the >>> * parents will still have size zero. >>> * >>> The next step is building the join tree. Set the pathtarget before that. >>> However, I think that will be a somewhat major change >>> >>> I agree. >>> still won't address all kinds of cases (like for ordered paths) unless we can try to get all kinds of targets pushed down in the call stack. >>> >>> I didn't understand that. >>> >> >> The places where we use a target different than the target which is >> pushed via query planner can cause a similar problem (ex. see the >> usage of adjust_paths_for_srfs) because the cost of that target >> wouldn't be taken into consideration for Gather's costing. >> > > Right. Right now apply_projection_to_path() or adjust_paths_for_srfs() > do not take into consideration the type of path whose cost is being > adjusted for the new targetlist. That works for most of the paths but > not all the paths like custom, FDW or parallel paths. The idea I am > proposing is to compute final targetlist before query planner so that > it's available when we create paths for the topmost scan/join > relation. That way, any path creation logic can then take advantage of > this list to compute costs, not just parallel paths. In fact, we should do this not just for scan/join relations, but for all the upper relations as well. Upper relations too create parallel paths, whose costs need to be adjusted after their targetlists are updated by adjust_paths_for_srfs(). 
Similar adjustments are needed for any FDWs or custom paths which cost targetlists differently. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Sat, Feb 17, 2018 at 8:17 AM, Amit Kapila wrote: > On Fri, Feb 16, 2018 at 9:29 AM, Ashutosh Bapat > wrote: >> On Thu, Feb 15, 2018 at 7:47 PM, Amit Kapila wrote: >>> On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat >>> wrote: I happened to look at the patch for something else. But here are some comments. If any of those have been already discussed, please feel free to ignore. I have gone through the thread cursorily, so I might have missed something. In grouping_planner() we call query_planner() first which builds the join tree and creates paths, then calculate the target for scan/join rel which is applied on the topmost scan join rel. I am wondering whether we can reverse this order to calculate the target list of scan/join relation and pass it to standard_join_search() (or the hook) through query_planner(). >>> >>> I think the reason for doing it in that order is that we can't compute >>> target's width till after query_planner(). See the below note in >>> code: >>> >>> /* >>> * Convert the query's result tlist into PathTarget format. >>> * >>> * Note: it's desirable to not do this till after query_planner(), >>> * because the target width estimates can use per-Var width numbers >>> * that were obtained within query_planner(). >>> */ >>> final_target = create_pathtarget(root, tlist); >>> >>> Now, I think we can try to juggle the code in a way that the width can >>> be computed later, but the rest can be done earlier. >> >> /* Convenience macro to get a PathTarget with valid cost/width fields */ >> #define create_pathtarget(root, tlist) \ >> set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist)) >> create_pathtarget already works that way. We will need to split it. >> >> Create the Pathtarget without widths. 
Apply width estimates once we >> know the width of Vars somewhere here in query_planner() >> /* >> * We should now have size estimates for every actual table involved in >> * the query, and we also know which if any have been deleted from the >> * query by join removal; so we can compute total_table_pages. >> * >> * Note that appendrels are not double-counted here, even though we don't >> * bother to distinguish RelOptInfos for appendrel parents, because the >> * parents will still have size zero. >> * >> The next step is building the join tree. Set the pathtarget before that. >> >>> However, I think >>> that will be a somewhat major change >> >> I agree. >> >>> still won't address all kinds of >>> cases (like for ordered paths) unless we can try to get all kinds of >>> targets pushed down in the call stack. >> >> I didn't understand that. >> > > The places where we use a target different than the target which is > pushed via query planner can cause a similar problem (ex. see the > usage of adjust_paths_for_srfs) because the cost of that target > wouldn't be taken into consideration for Gather's costing. > Right. Right now apply_projection_to_path() or adjust_paths_for_srfs() do not take into consideration the type of path whose cost is being adjusted for the new targetlist. That works for most of the paths but not all the paths like custom, FDW or parallel paths. The idea I am proposing is to compute final targetlist before query planner so that it's available when we create paths for the topmost scan/join relation. That way, any path creation logic can then take advantage of this list to compute costs, not just parallel paths. -- Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Feb 16, 2018 at 9:29 AM, Ashutosh Bapat wrote: > On Thu, Feb 15, 2018 at 7:47 PM, Amit Kapila wrote: >> On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat >> wrote: >>> I happened to look at the patch for something else. But here are some >>> comments. If any of those have been already discussed, please feel >>> free to ignore. I have gone through the thread cursorily, so I might >>> have missed something. >>> >>> In grouping_planner() we call query_planner() first which builds the >>> join tree and creates paths, then calculate the target for scan/join >>> rel which is applied on the topmost scan join rel. I am wondering >>> whether we can reverse this order to calculate the target list of >>> scan/join relation and pass it to standard_join_search() (or the hook) >>> through query_planner(). >>> >> >> I think the reason for doing it in that order is that we can't compute >> target's width till after query_planner(). See the below note in >> code: >> >> /* >> * Convert the query's result tlist into PathTarget format. >> * >> * Note: it's desirable to not do this till after query_planner(), >> * because the target width estimates can use per-Var width numbers >> * that were obtained within query_planner(). >> */ >> final_target = create_pathtarget(root, tlist); >> >> Now, I think we can try to juggle the code in a way that the width can >> be computed later, but the rest can be done earlier. > > /* Convenience macro to get a PathTarget with valid cost/width fields */ > #define create_pathtarget(root, tlist) \ > set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist)) > create_pathtarget already works that way. We will need to split it. > > Create the Pathtarget without widths. 
Apply width estimates once we > know the width of Vars somewhere here in query_planner() > /* > * We should now have size estimates for every actual table involved in > * the query, and we also know which if any have been deleted from the > * query by join removal; so we can compute total_table_pages. > * > * Note that appendrels are not double-counted here, even though we don't > * bother to distinguish RelOptInfos for appendrel parents, because the > * parents will still have size zero. > * > The next step is building the join tree. Set the pathtarget before that. > >> However, I think >> that will be a somewhat major change > > I agree. > >> still won't address all kinds of >> cases (like for ordered paths) unless we can try to get all kinds of >> targets pushed down in the call stack. > > I didn't understand that. > The places where we use a target different than the target which is pushed via query planner can cause a similar problem (ex. see the usage of adjust_paths_for_srfs) because the cost of that target wouldn't be taken into consideration for Gather's costing. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Feb 15, 2018 at 7:47 PM, Amit Kapila wrote: > On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat > wrote: >> I happened to look at the patch for something else. But here are some >> comments. If any of those have been already discussed, please feel >> free to ignore. I have gone through the thread cursorily, so I might >> have missed something. >> >> In grouping_planner() we call query_planner() first which builds the >> join tree and creates paths, then calculate the target for scan/join >> rel which is applied on the topmost scan join rel. I am wondering >> whether we can reverse this order to calculate the target list of >> scan/join relation and pass it to standard_join_search() (or the hook) >> through query_planner(). >> > > I think the reason for doing it in that order is that we can't compute > target's width till after query_planner(). See the below note in > code: > > /* > * Convert the query's result tlist into PathTarget format. > * > * Note: it's desirable to not do this till after query_planner(), > * because the target width estimates can use per-Var width numbers > * that were obtained within query_planner(). > */ > final_target = create_pathtarget(root, tlist); > > Now, I think we can try to juggle the code in a way that the width can > be computed later, but the rest can be done earlier. /* Convenience macro to get a PathTarget with valid cost/width fields */ #define create_pathtarget(root, tlist) \ set_pathtarget_cost_width(root, make_pathtarget_from_tlist(tlist)) create_pathtarget already works that way. We will need to split it. Create the Pathtarget without widths. Apply width estimates once we know the width of Vars somewhere here in query_planner() /* * We should now have size estimates for every actual table involved in * the query, and we also know which if any have been deleted from the * query by join removal; so we can compute total_table_pages. 
* * Note that appendrels are not double-counted here, even though we don't * bother to distinguish RelOptInfos for appendrel parents, because the * parents will still have size zero. * The next step is building the join tree. Set the pathtarget before that. > However, I think > that will be somewhat major change I agree. > still won't address all kind of > cases (like for ordered paths) unless we can try to get all kind of > targets pushed down in the call stack. I didn't understand that. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
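Ashutosh's suggestion could look roughly like the following (a sketch only: make_pathtarget_from_tlist() and set_pathtarget_cost_width() are the existing halves of the create_pathtarget() macro quoted above, but the placement shown here is hypothetical and untested):

/* Sketch: split create_pathtarget() so widths are filled in later.
 * In grouping_planner(), before query_planner() is called: */
final_target = make_pathtarget_from_tlist(tlist);   /* no cost/width yet */

/* ...then inside query_planner(), once per-Var width estimates exist
 * (around the total_table_pages computation quoted above): */
set_pathtarget_cost_width(root, final_target);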
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Feb 15, 2018 at 4:18 PM, Ashutosh Bapat wrote:
> I happened to look at the patch for something else. But here are some
> comments. If any of those have been already discussed, please feel
> free to ignore. I have gone through the thread cursorily, so I might
> have missed something.
>
> In grouping_planner() we call query_planner() first which builds the
> join tree and creates paths, then calculate the target for scan/join
> rel which is applied on the topmost scan join rel. I am wondering
> whether we can reverse this order to calculate the target list of
> scan/join relation and pass it to standard_join_search() (or the hook)
> through query_planner().
>

I think the reason for doing it in that order is that we can't compute the target's width till after query_planner(). See the below note in the code:

/*
 * Convert the query's result tlist into PathTarget format.
 *
 * Note: it's desirable to not do this till after query_planner(),
 * because the target width estimates can use per-Var width numbers
 * that were obtained within query_planner().
 */
final_target = create_pathtarget(root, tlist);

Now, I think we can try to juggle the code in a way that the width can be computed later, but the rest can be done earlier. However, I think that will be a somewhat major change and still won't address all kinds of cases (like for ordered paths) unless we can try to get all kinds of targets pushed down in the call stack. Also, I am not sure if that is the only reason or there are some other assumptions about this calling order as well.

>
> Here are some comments on the patch.
>

Thanks for looking into the patch. As of now, we are evaluating the right approach for this patch, so let's wait for Robert's reply. After we agree on the approach, I will address your comments.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
I happened to look at the patch for something else. But here are some comments. If any of those have been already discussed, please feel free to ignore. I have gone through the thread cursorily, so I might have missed something.

In grouping_planner() we call query_planner() first, which builds the join tree and creates paths, and then calculates the target for the scan/join rel, which is applied on the topmost scan/join rel. I am wondering whether we can reverse this order: calculate the target list of the scan/join relation and pass it to standard_join_search() (or the hook) through query_planner(). standard_join_search() would then set this as the reltarget of the topmost relation, and every path created for it would have that target, applying projection if needed. This way we avoid calling generate_gather_paths() in two places. Right now generate_gather_paths() seems to be the only thing benefitting from this, but what about FDWs and custom paths whose costs may change when the targetlist changes? For custom paths I am considering GPU optimization paths. Also, this might address Tom's worry, "But having to do that in a normal code path suggests that something's not right about the design ... "

Here are some comments on the patch.

+/*
+ * Except for the topmost scan/join rel, consider gathering
+ * partial paths. We'll do the same for the topmost scan/join

This function only works on join relations. Mentioning scan rel is confusing.

index 6e842f9..5206da7 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -481,14 +481,21 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 }
+ *
+ * Also, if this is the topmost scan/join rel (that is, the only baserel),
+ * we postpone this until the final scan/join targelist is available (see

Mentioning join rel here is confusing since we deal with base relations here.
+bms_membership(root->all_baserels) != BMS_SINGLETON)

set_tablesample_rel_pathlist() is also using this method to decide whether there are any joins in the query. Maybe macro-ize this and use that macro at these two places?

- * for the specified relation. (Otherwise, add_partial_path might delete a
+ * for the specified relation. (Otherwise, add_partial_path might delete a

Unrelated change?

+/* Add projection step if needed */
+if (target && simple_gather_path->pathtarget != target)

If the target was copied someplace, this test will fail. Probably we want to check the contents of the PathTarget structure? Right now copy_pathtarget() is not called from many places, and all those places modify the copied target, so this isn't a problem. But we can't guarantee that in the future. Similar comment for gather_merge path creation.

+simple_gather_path = apply_projection_to_path(root,
+                                              rel,
+                                              simple_gather_path,
+                                              target);
+

Why don't we incorporate those changes in create_gather_path() by passing it the projection target instead of rel->reltarget? Similar comment for gather_merge path creation.

+/*
+ * Except for the topmost scan/join rel, consider gathering
+ * partial paths. We'll do the same for the topmost scan/join rel

Mentioning scan rel is confusing here.

deallocate tenk1_count;
+explain (costs off) select ten, costly_func(ten) from tenk1;

Verbose output will show that the parallel seq scan's targetlist has costly_func() in it. Isn't that what we want to test here?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
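The macro being suggested here might be as small as the following (the name is illustrative, not from the actual patch):

/* Hypothetical helper for the "is this the only baserel?" test that is
 * duplicated in set_rel_pathlist() and set_tablesample_rel_pathlist(). */
#define IS_SINGLE_REL_QUERY(root) \
    (bms_membership((root)->all_baserels) == BMS_SINGLETON)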
Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Feb 2, 2018 at 12:15 AM, Robert Haas wrote:
> On Wed, Jan 31, 2018 at 11:48 PM, Amit Kapila wrote:
>
> The other cases aren't so clear. In the case of the first call within
> create_ordered_paths, there's no loss in the !is_sorted case because
> apply_projection_to_path will be getting called on a Sort path. But
> when is_sorted is true, the current code can push a target list into a
> Gather or Gather Merge that was created with some other target list,
> and with the patch it can't. I'm not quite sure what sort of query
> would trigger that problem, but it seems like there's something to
> worry about there.
>

I think query plans which involve Gather Merge -> Parallel Index Scan can be impacted. For example, consider the below case:

With parallel-paths-tlist-cost-rmh-v2.patch

postgres=# set cpu_operator_cost=0;
SET
postgres=# set parallel_setup_cost=0;set parallel_tuple_cost=0;set min_parallel_table_scan_size=0;set max_parallel_workers_per_gather=4;
SET
SET
SET
SET
postgres=# explain (costs off, verbose) select simple_func(aid) from pgbench_accounts where aid > 1000 order by aid;
                                      QUERY PLAN
---
 Gather Merge
   Output: simple_func(aid), aid
   Workers Planned: 4
   ->  Parallel Index Only Scan using pgbench_accounts_pkey on public.pgbench_accounts
         Output: aid
         Index Cond: (pgbench_accounts.aid > 1000)
(6 rows)

HEAD and parallel_paths_include_tlist_cost_v7

postgres=# explain (costs off, verbose) select simple_func(aid) from pgbench_accounts where aid > 1000 order by aid;
                                      QUERY PLAN
---
 Gather Merge
   Output: (simple_func(aid)), aid
   Workers Planned: 4
   ->  Parallel Index Only Scan using pgbench_accounts_pkey on public.pgbench_accounts
         Output: simple_func(aid), aid
         Index Cond: (pgbench_accounts.aid > 1000)
(6 rows)

For the above test, I have initialized pgbench with scale factor 100. It shows that with the patch parallel-paths-tlist-cost-rmh-v2.patch, we will lose the capability to push the target list down in some cases.
One lame idea could be that at the call location, we detect if it is a Gather (Merge) and then push the target list, but doing so at all the locations doesn't sound like a good alternative compared to the approach where we push the target list in apply_projection_to_path.

> Similarly I can't see any obvious reason why this
> isn't a problem for adjust_paths_for_srfs and build_minmax_path as
> well, although I haven't tried to construct queries that hit those
> cases, either.
>

Neither have I, but I can give it a try if we expect something different from the results of the above example.

> Now, we could just give up on this approach and leave that code in
> apply_projection_to_path, but what's bothering me is that, presumably,
> any place where that code is actually getting used has the same
> problem that we're trying to fix in grouping_planner: the tlist eval
> costs are not being factored into the decision as to which path we
> should choose, which might make a good parallel path lose to an
> inferior non-parallel path.
>

Your concern is valid, but doesn't the same problem exist in the other approach as well? There, too, we can call adjust_paths_for_srfs after generating the gather path, which means that we might lose some opportunity to reduce the relative cost of parallel paths due to tlists having SRFs. Also, a similar problem can happen in create_ordered_paths for cases like the one described in the example above.

> It would be best to fix that throughout
> the code base rather than only fixing the more common paths -- if we
> can do so with a reasonable amount of work.
>

Agreed. I think one way to achieve that is, instead of discarding parallel paths based on cost, to retain them till a later phase of planning, something like what we do for ordered paths. In that case, the way the changes have been done in the patch parallel_paths_include_tlist_cost_v7 will work.
I think it will be helpful for other cases as well if we keep the parallel paths as alternative paths till a later stage of planning (we have discussed this during parallelizing subplans as well); however, that is a bigger project on its own. I don't think it will be too bad to go with what we have in parallel_paths_include_tlist_cost_v7, with the intent that in the future, when we do the other project, the other cases will be dealt with automatically.

I have updated the patch parallel_paths_include_tlist_cost_v7 by changing some of the comments based on parallel-paths-tlist-cost-rmh-v2.patch.

I have one additional comment on parallel-paths-tlist-cost-rmh-v2.patch:

@@ -374,6 +374,14 @@ cost_gather(GatherPath *path, PlannerInfo *root,
 startup_cost += parallel_setup_cost;
 run_cost += parallel_tuple_cost *
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Jan 31, 2018 at 11:48 PM, Amit Kapila wrote:
> I can see what you have in mind, but I think we still need to change
> the parallel safety flag of the path if *_target is not parallel safe
> either inside apply_projection_to_path or maybe outside where it is
> called.

Hmm. You have a point. That's not the only problem, though. With these changes, we need to check every place where apply_projection_to_path is used to see whether we're losing the ability to push down the target list below Gather (Merge) in some situations. If we are, then we need to see if we can supply the correct target list to the code that generates the partial path, before Gather (Merge) is added. There are the following call sites:

* grouping_planner.
* create_ordered_paths (x2).
* adjust_paths_for_srfs.
* build_minmax_path.
* recurse_set_operations (x2).

The cases in recurse_set_operations() don't matter, because the path whose target list we're adjusting is known not to be a Gather path. In the first call, it's definitely a Subquery Scan, and in the second case it'll be a path implementing some set operation. grouping_planner() is the core thing the patch is trying to fix, and as far as I can tell the logic changes there are adequate. Also, the second case in create_ordered_paths() is fixed: the patch changes things so that it inserts a projection path before Gather Merge if that's safe to do so.

The other cases aren't so clear. In the case of the first call within create_ordered_paths, there's no loss in the !is_sorted case because apply_projection_to_path will be getting called on a Sort path. But when is_sorted is true, the current code can push a target list into a Gather or Gather Merge that was created with some other target list, and with the patch it can't. I'm not quite sure what sort of query would trigger that problem, but it seems like there's something to worry about there.
Similarly, I can't see any obvious reason why this isn't a problem for adjust_paths_for_srfs and build_minmax_path as well, although I haven't tried to construct queries that hit those cases, either.

Now, we could just give up on this approach and leave that code in apply_projection_to_path, but what's bothering me is that, presumably, any place where that code is actually getting used has the same problem that we're trying to fix in grouping_planner: the tlist eval costs are not being factored into the decision as to which path we should choose, which might make a good parallel path lose to an inferior non-parallel path. It would be best to fix that throughout the code base rather than only fixing the more common paths -- if we can do so with a reasonable amount of work.

Here's a new version which (a) restores the code that you pointed out, (b) always passes the target to generate_gather_paths() instead of treating NULL as a special case, and (c) reworks some of the comments you added.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

parallel-paths-tlist-cost-rmh-v2.patch
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Jan 30, 2018 at 3:30 AM, Robert Haas wrote:
> On Sun, Jan 28, 2018 at 10:13 PM, Amit Kapila wrote:
>> If we want to get rid of Gather (Merge) checks in
>> apply_projection_to_path(), then we need some way to add a projection
>> path to the subpath of gather node even if that is capable of
>> projection as we do now. I think changing the order of applying
>> scan/join target won't address that unless we decide to do it for
>> every partial path. Another way could be that we handle that in
>> generate_gather_paths, but I think that won't be the ideal place to add
>> projection.
>>
>> If we want, we can compute the parallel-safety of scan/join target
>> once in grouping_planner and then pass it in apply_projection_to_path
>> to address your main concern.
>
> I spent some time today hacking on this; see attached. It needs more
> work, but you can see what I have in mind.
>

I can see what you have in mind, but I think we still need to change the parallel safety flag of the path if *_target is not parallel safe, either inside apply_projection_to_path or maybe outside where it is called. Basically, I am talking about the below code:

@@ -2473,57 +2469,6 @@ apply_projection_to_path(PlannerInfo *root,
 {
..
- else if (path->parallel_safe &&
- !is_parallel_safe(root, (Node *) target->exprs))
- {
- /*
- * We're inserting a parallel-restricted target list into a path
- * currently marked parallel-safe, so we have to mark it as no longer
- * safe.
- */
- path->parallel_safe = false;
- }
-
..
}

I can take care of dealing with this unless you think otherwise.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Sun, Jan 28, 2018 at 10:13 PM, Amit Kapila wrote:
> If we want to get rid of Gather (Merge) checks in
> apply_projection_to_path(), then we need some way to add a projection
> path to the subpath of gather node even if that is capable of
> projection as we do now. I think changing the order of applying
> scan/join target won't address that unless we decide to do it for
> every partial path. Another way could be that we handle that in
> generate_gather_paths, but I think that won't be the ideal place to add
> projection.
>
> If we want, we can compute the parallel-safety of scan/join target
> once in grouping_planner and then pass it in apply_projection_to_path
> to address your main concern.

I spent some time today hacking on this; see attached. It needs more work, but you can see what I have in mind. It's not quite the same as what I outlined before because that turned out to not quite work, but it does remove most of the logic from apply_projection_to_path().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

parallel-paths-tlist-cost-rmh.patch
Description: Binary data
Re: [HACKERS] why not parallel seq scan for slow functions
On Sat, Jan 27, 2018 at 2:50 AM, Robert Haaswrote: > On Tue, Jan 2, 2018 at 6:38 AM, Amit Kapila wrote: >> [ new patch ] > > I think that grouping_planner() could benefit from a slightly more > extensive rearrangement. With your patch applied, the order of > operations is: > > 1. compute the scan/join target > 2. apply the scan/join target to all paths in current_rel's pathlist > 3. generate gather paths, possibly adding more stuff to current_rel's pathlist > 4. rerun set_cheapest > 5. apply the scan/join target, if parallel safe, to all paths in the > current rel's partial_pathlist, for the benefit of upper planning > steps > 6. clear the partial pathlist if the target list is not parallel safe > > I at first thought this was outright broken because step #3 imposes > the scan/join target without testing it for parallel-safety, but then > I realized that generate_gather_paths will apply that target list by > using apply_projection_to_path, which makes an is_parallel_safe test > of its own. But it doesn't seem good for step 3 to test the > parallel-safety of the target list separately for each path and then > have grouping_planner do it one more time for the benefit of upper > planning steps. Instead, I suggest that we try to get rid of the > logic in apply_projection_to_path that knows about Gather and Gather > Merge specifically. I think we can do that if grouping_planner does > this: > > 1. compute the scan/join target > 2. apply the scan/join target, if parallel safe, to all paths in the > current rel's partial_pathlist > 3. generate gather paths > 4. clear the partial pathlist if the target list is not parallel safe > 5. apply the scan/join target to all paths in current_rel's pathlist > 6. rerun set_cheapest > > That seems like a considerably more logical order of operations. 
> It
> avoids not only the expense of testing the scanjoin_target for
> parallel-safety multiple times, but the ugliness of having
> apply_projection_to_path know about Gather and Gather Merge as a
> special case.
>

If we want to get rid of Gather (Merge) checks in apply_projection_to_path(), then we need some way to add a projection path to the subpath of gather node even if that is capable of projection as we do now. I think changing the order of applying scan/join target won't address that unless we decide to do it for every partial path. Another way could be that we handle that in generate_gather_paths, but I think that won't be the ideal place to add projection.

If we want, we can compute the parallel-safety of scan/join target once in grouping_planner and then pass it in apply_projection_to_path to address your main concern.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] why not parallel seq scan for slow functions
On Tue, Jan 2, 2018 at 6:38 AM, Amit Kapilawrote: > [ new patch ] I think that grouping_planner() could benefit from a slightly more extensive rearrangement. With your patch applied, the order of operations is: 1. compute the scan/join target 2. apply the scan/join target to all paths in current_rel's pathlist 3. generate gather paths, possibly adding more stuff to current_rel's pathlist 4. rerun set_cheapest 5. apply the scan/join target, if parallel safe, to all paths in the current rel's partial_pathlist, for the benefit of upper planning steps 6. clear the partial pathlist if the target list is not parallel safe I at first thought this was outright broken because step #3 imposes the scan/join target without testing it for parallel-safety, but then I realized that generate_gather_paths will apply that target list by using apply_projection_to_path, which makes an is_parallel_safe test of its own. But it doesn't seem good for step 3 to test the parallel-safety of the target list separately for each path and then have grouping_planner do it one more time for the benefit of upper planning steps. Instead, I suggest that we try to get rid of the logic in apply_projection_to_path that knows about Gather and Gather Merge specifically. I think we can do that if grouping_planner does this: 1. compute the scan/join target 2. apply the scan/join target, if parallel safe, to all paths in the current rel's partial_pathlist 3. generate gather paths 4. clear the partial pathlist if the target list is not parallel safe 5. apply the scan/join target to all paths in current_rel's pathlist 6. rerun set_cheapest That seems like a considerably more logical order of operations. It avoids not only the expense of testing the scanjoin_target for parallel-safety multiple times, but the ugliness of having apply_projection_to_path know about Gather and Gather Merge as a special case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
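Expressed as code, the proposed ordering would look roughly like the following. This is a control-flow sketch only, not compilable planner code; generate_gather_paths, is_parallel_safe, and set_cheapest are real planner functions, but the surrounding logic is paraphrased:

/* Sketch of the proposed grouping_planner() ordering (paraphrased). */
scanjoin_target = create_pathtarget(root, scanjoin_tlist);      /* step 1 */
scanjoin_target_parallel_safe =
    is_parallel_safe(root, (Node *) scanjoin_target->exprs);    /* tested once */

if (scanjoin_target_parallel_safe)
    /* step 2: impose the target on every partial path */
    foreach(lc, current_rel->partial_pathlist)
        ... apply scanjoin_target to each partial path ...;

generate_gather_paths(root, current_rel);                       /* step 3 */

if (!scanjoin_target_parallel_safe)
    current_rel->partial_pathlist = NIL;                        /* step 4 */

foreach(lc, current_rel->pathlist)                              /* step 5 */
    ... apply scanjoin_target to each complete path ...;

set_cheapest(current_rel);                                      /* step 6 */

The point of the sketch is that the parallel-safety of scanjoin_target is computed exactly once, before any Gather (Merge) path is built, so apply_projection_to_path no longer needs Gather-specific knowledge.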
Re: [HACKERS] why not parallel seq scan for slow functions
On Fri, Dec 29, 2017 at 7:56 PM, Marina Polyakova wrote:
> Hello everyone in this thread!
>
> On 29-11-2017 8:01, Michael Paquier wrote:
>>
>> Moved to next CF for extra reviews.
>
>
> Amit, I would like to ask some questions about your patch (and can you
> please rebase it on the top of the master?):
>

Thanks for looking into the patch.

> 1)
> + path->total_cost -= (target->cost.per_tuple -
> oldcost.per_tuple) * path->rows;
>
> Here we undo the changes that we make in this function earlier:
>
> path->total_cost += target->cost.startup - oldcost.startup +
> (target->cost.per_tuple - oldcost.per_tuple) * path->rows;
>
> Perhaps we should not start these "reversible" changes in this case from the
> very beginning?
>

We could do it that way as well; however, when the patch was written we didn't have GatherMerge handling in that function, so it appeared to me that changing the code the way you are suggesting would complicate it. Now it looks saner to do it that way, and I have changed the code accordingly.

> 2)
> gpath->subpath = (Path *)
> create_projection_path(root,
> gpath->subpath->parent,
> gpath->subpath,
> target);
> ...
> + if (((ProjectionPath *) gpath->subpath)->dummypp)
> + {
> ...
> + path->total_cost += (target->cost.per_tuple -
> oldcost.per_tuple) * gpath->subpath->rows;
> + }
> + else
> + {
> ...
> + path->total_cost += (cpu_tuple_cost +
> target->cost.per_tuple) * gpath->subpath->rows;
> + }
>
> As I understand it, here in the if-else block we change the run costs of
> gpath in the same way as they were changed for its subpath in the function
> create_projection_path earlier.
> But for the startup costs we always subtract
> the cost of the old target:
>
> path->startup_cost += target->cost.startup - oldcost.startup;
> path->total_cost += target->cost.startup - oldcost.startup +
> (target->cost.per_tuple - oldcost.per_tuple) * path->rows;
>
> Should we change the startup costs of gpath in this way if ((ProjectionPath
> *) gpath->subpath)->dummypp is false?
>

The startup costs will be computed in the way you mentioned for both gpath and gmpath, as that code is executed before we do any handling for the Gather or GatherMerge path.

> 3)
> simple_gather_path = (Path *)
> create_gather_path(root, rel, cheapest_partial_path,
> rel->reltarget,
> NULL, NULL);
> + /* Add projection step if needed */
> + if (target && simple_gather_path->pathtarget != target)
> + simple_gather_path = apply_projection_to_path(root, rel,
> simple_gather_path, target);
> ...
> + path = (Path *) create_gather_merge_path(root, rel, subpath,
> rel->reltarget,
> + subpath->pathkeys, NULL, NULL);
> + /* Add projection step if needed */
> + if (target && path->pathtarget != target)
> + path = apply_projection_to_path(root, rel, path,
> target);
> ...
> @@ -2422,16 +2422,27 @@ apply_projection_to_path(PlannerInfo *root,
> ...
> gpath->subpath = (Path *)
> create_projection_path(root,
> gpath->subpath->parent,
> gpath->subpath,
> target);
>
> The target is changing so we change it for the gather(merge) node and for
> its subpath. Do not we have to do this work (replace the subpath by calling
> the function create_projection_path if the target is different) in the
> functions create_gather(_merge)_path too? I suppose that the target of the
> subpath affects its costs => the costs of the gather(merge) node in the
> functions cost_gather(_merge) (=> the costs of the gather(merge) node in the
> function apply_projection_to_path).
>

The cost impact due to a different target will be taken care of when we call apply_projection_to_path.
I am not sure I understand your question completely, but do you see anything in the cost_gather(_merge) functions that looks suspicious and that the target list costing could impact? Generally, we compute the target list cost at a later stage of planning, once the path target is formed.

> 4)
> +* Consider ways to implement parallel paths. We always skip
> +* generating parallel path for top level scan/join nodes till the
> +* pathtarget is computed. This is to ensure that we can account for
> +* the fact that most of the target evaluation work will be performed
> +* in workers.
> +*/
> + generate_gather_paths(root, current_rel, scanjoin_target);
> +
> + /* Set or update cheapest_total_path and
Re: [HACKERS] why not parallel seq scan for slow functions
Hello everyone in this thread! On 29-11-2017 8:01, Michael Paquier wrote: Moved to next CF for extra reviews. Amit, I would like to ask some questions about your patch (and can you please rebase it on the top of the master?): 1) + path->total_cost -= (target->cost.per_tuple - oldcost.per_tuple) * path->rows; Here we undo the changes that we make in this function earlier: path->total_cost += target->cost.startup - oldcost.startup + (target->cost.per_tuple - oldcost.per_tuple) * path->rows; Perhaps we should not start these "reversible" changes in this case from the very beginning? 2) gpath->subpath = (Path *) create_projection_path(root, gpath->subpath->parent, gpath->subpath, target); ... + if (((ProjectionPath *) gpath->subpath)->dummypp) + { ... + path->total_cost += (target->cost.per_tuple - oldcost.per_tuple) * gpath->subpath->rows; + } + else + { ... + path->total_cost += (cpu_tuple_cost + target->cost.per_tuple) * gpath->subpath->rows; + } As I understand it, here in the if-else block we change the run costs of gpath in the same way as they were changed for its subpath in the function create_projection_path earlier. But for the startup costs we always subtract the cost of the old target: path->startup_cost += target->cost.startup - oldcost.startup; path->total_cost += target->cost.startup - oldcost.startup + (target->cost.per_tuple - oldcost.per_tuple) * path->rows; Should we change the startup costs of gpath in this way if ((ProjectionPath *) gpath->subpath)->dummypp is false? 3) simple_gather_path = (Path *) create_gather_path(root, rel, cheapest_partial_path, rel->reltarget, NULL, NULL); + /* Add projection step if needed */ + if (target && simple_gather_path->pathtarget != target) + simple_gather_path = apply_projection_to_path(root, rel, simple_gather_path, target); ... 
+ path = (Path *) create_gather_merge_path(root, rel, subpath, rel->reltarget, + subpath->pathkeys, NULL, NULL); + /* Add projection step if needed */ + if (target && path->pathtarget != target) + path = apply_projection_to_path(root, rel, path, target); ... @@ -2422,16 +2422,27 @@ apply_projection_to_path(PlannerInfo *root, ... gpath->subpath = (Path *) create_projection_path(root, gpath->subpath->parent, gpath->subpath, target); The target is changing so we change it for the gather(merge) node and for its subpath. Do not we have to do this work (replace the subpath by calling the function create_projection_path if the target is different) in the functions create_gather(_merge)_path too? I suppose that the target of the subpath affects its costs => the costs of the gather(merge) node in the functions cost_gather(_merge) (=> the costs of the gather(merge) node in the function apply_projection_to_path). 4) +* Consider ways to implement parallel paths. We always skip +* generating parallel path for top level scan/join nodes till the +* pathtarget is computed. This is to ensure that we can account for +* the fact that most of the target evaluation work will be performed +* in workers. +*/ + generate_gather_paths(root, current_rel, scanjoin_target); + + /* Set or update cheapest_total_path and related fields */ + set_cheapest(current_rel); As I understand it (if correctly, thank the comments of Robert Haas and Tom Lane :), after that we cannot use the function apply_projection_to_path for paths in the current_rel->pathlist without risking that the cheapest path will change. And we have several calls to the function adjust_paths_for_srfs (which uses apply_projection_to_path for paths in the current_rel->pathlist) in grouping_planner after generating the gather paths: /* Now fix things up if scan/join target contains SRFs */ if (parse->hasTargetSRFs) adjust_paths_for_srfs(root, current_rel, scanjoin_targets, scanjoin_targets_contain_srfs); ...