Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-28 Thread Rushabh Lathia
Thanks Tom.

I performed testing with the latest commit and test are
running fine.

On Tue, Jun 28, 2016 at 8:14 PM, Tom Lane  wrote:

> Rushabh Lathia  writes:
> > SELECT setval('s', max(100)) from tab;
> > ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> Fixed, thanks for the report!
>
> regards, tom lane
>



-- 
Rushabh Lathia


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-28 Thread Tom Lane
Rushabh Lathia  writes:
> SELECT setval('s', max(100)) from tab;
> ERROR:  ORDER/GROUP BY expression not found in targetlist

Fixed, thanks for the report!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-28 Thread Amit Langote
On Tue, Jun 28, 2016 at 2:52 PM, Rushabh Lathia
 wrote:
> Hi,
>
> Consider the below testcase:
>
> CREATE TABLE tab(
>   c1 INT NOT NULL,
>   c2 INT NOT NULL
> );
> INSERT INTO tab VALUES (1, 2);
> INSERT INTO tab VALUES (2, 1);
> INSERT INTO tab VALUES (1, 2);
>
>
> case 1:
>
> SELECT c.c1, c.c2 from tab C WHERE c.c2 = ANY (
> SELECT 1 FROM tab A WHERE a.c2 IN (
>   SELECT 1 FROM tab B WHERE a.c1 = c.c1
>   GROUP BY rollup(a.c1)
> )
> GROUP BY cube(c.c2)
>   )
>   GROUP BY grouping sets(c.c1, c.c2)
>   ORDER BY 1, 2 DESC;
> ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> case 2:
>
> create sequence s;
> SELECT setval('s', max(100)) from tab;
> ERROR:  ORDER/GROUP BY expression not found in targetlist

The following give the same error:

select max(100) from tab;
select max((select 1)) from tab;

Thanks,
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-27 Thread Rushabh Lathia
Hi,

Consider the below testcase:

CREATE TABLE tab(
  c1 INT NOT NULL,
  c2 INT NOT NULL
);
INSERT INTO tab VALUES (1, 2);
INSERT INTO tab VALUES (2, 1);
INSERT INTO tab VALUES (1, 2);


case 1:

SELECT c.c1, c.c2 from tab C WHERE c.c2 = ANY (
SELECT 1 FROM tab A WHERE a.c2 IN (
  SELECT 1 FROM tab B WHERE a.c1 = c.c1
  GROUP BY rollup(a.c1)
)
GROUP BY cube(c.c2)
  )
  GROUP BY grouping sets(c.c1, c.c2)
  ORDER BY 1, 2 DESC;
ERROR:  ORDER/GROUP BY expression not found in targetlist

case 2:

create sequence s;
SELECT setval('s', max(100)) from tab;
ERROR:  ORDER/GROUP BY expression not found in targetlist

Looking further I found that error started coming with below commit:

commit aeb9ae6457865c8949641d71a9523374d843a418
Author: Tom Lane 
Date:   Thu May 26 14:52:24 2016 -0400

Disable physical tlist if any Var would need multiple sortgroupref
labels.

If we revert the above commit, then the give test are running
as expected.


Regards,
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-26 Thread Noah Misch
On Mon, Jun 13, 2016 at 05:27:13PM -0400, Robert Haas wrote:
> One problem that I've realized that is related to this is that the way
> that the consider_parallel flag is being set for upper rels is almost
> totally bogus, which I believe accounts for your complaint at PGCon
> that force_parallel_mode was not doing as much as it ought to do.
> When I originally wrote a lot of this logic, there were no upper rels,
> which led to this code:
> 
> if (force_parallel_mode == FORCE_PARALLEL_OFF || 
> !glob->parallelModeOK)
> {
> glob->parallelModeNeeded = false;
> glob->wholePlanParallelSafe = false;/* either
> false or don't care */
> }
> else
> {
> glob->parallelModeNeeded = true;
> glob->wholePlanParallelSafe =
> !has_parallel_hazard((Node *) parse, false);
> }
> 
> The main way that wholePlanParallelSafe is supposed to be cleared is
> in create_plan:
> 
> /* Update parallel safety information if needed. */
> if (!best_path->parallel_safe)
> root->glob->wholePlanParallelSafe = false;
> 
> However, at the time that code was written, it was impossible to rely
> entirely on the latter mechanism.  Since there were no upper rels and
> no paths for upper plan nodes, you could have the case where every
> path was parallel-safe but the whole plan was node.  Therefore the
> code shown above was needed to scan the whole darned plan for
> parallel-unsafe things.  Post-pathification, this whole thing is
> pretty bogus: upper rels mostly don't get consider_parallel set at
> all, which means plans involving upper rels get marked parallel-unsafe
> even if they are not, which means the wholePlanParallelSafe flag gets
> cleared in a bunch of cases where it shouldn't.  And on the flip side
> I think that the first chunk of code quoted above would be totally
> unnecessary if we were actually setting consider_parallel correctly on
> the upper rels.

[Action required within 72 hours.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item ("set
consider_parallel correctly for upper rels").  Robert, since you committed the
patch believed to have created it, you own this open item.  If some other
commit is more relevant or if this does not belong as a 9.6 open item, please
let us know.  Otherwise, please observe the policy on open item ownership[1]
and send a status update within 72 hours of this message.  Include a date for
your subsequent status update.  Testers may discover new open items at any
time, and I want to plan to get them all fixed well in advance of shipping
9.6rc1.  Consequently, I will appreciate your efforts toward speedy
resolution.  Thanks.

[1] 
http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-26 Thread Noah Misch
On Mon, Jun 13, 2016 at 04:46:19PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > In practice, we don't yet have the ability for
> > parallel-safe paths from subqueries to affect planning at higher query
> > levels, but that's because the pathification stuff landed too late in
> > the cycle for me to fully react to it.
> 
> I wonder if that's not just from confusion between subplans and
> subqueries.  I don't believe any of the claims made in the comment
> adjusted in the patch below (other than "Subplans currently aren't passed
> to workers", which is true but irrelevant).  Nested Gather nodes is
> something that you would need, and already have, a defense for anyway.

[Action required within 72 hours.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item ("fix
possible confusion between subqueries and subplans").  Robert, since you
committed the patch believed to have created it, you own this open item.  If
some other commit is more relevant or if this does not belong as a 9.6 open
item, please let us know.  Otherwise, please observe the policy on open item
ownership[1] and send a status update within 72 hours of this message.
Include a date for your subsequent status update.  Testers may discover new
open items at any time, and I want to plan to get them all fixed well in
advance of shipping 9.6rc1.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-26 Thread Noah Misch
On Mon, Jun 13, 2016 at 03:42:49PM -0400, Tom Lane wrote:
> I wrote:
> > ... there was also an unexplainable plan change:
> 
> > *** /home/postgres/pgsql/src/test/regress/expected/aggregates.out   Thu Apr 
> >  7 21:13:14 2016
> > --- /home/postgres/pgsql/src/test/regress/results/aggregates.outMon Jun 
> > 13 11:54:01 2016
> > ***
> > *** 577,590 
>   
> >   explain (costs off)
> > select max(unique1) from tenk1 where unique1 > 42000;
> > ! QUERY PLAN
> >  
> > ! 
> > ---
> > !  Result
> > !InitPlan 1 (returns $0)
> > !  ->  Limit
> > !->  Index Only Scan Backward using tenk1_unique1 on tenk1
> > !  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
> > ! (5 rows)
>   
> >   select max(unique1) from tenk1 where unique1 > 42000;
> >max 
> > --- 577,588 
>   
> >   explain (costs off)
> > select max(unique1) from tenk1 where unique1 > 42000;
> > !  QUERY PLAN 
> > ! 
> > !  Aggregate
> > !->  Index Only Scan using tenk1_unique1 on tenk1
> > !  Index Cond: (unique1 > 42000)
> > ! (3 rows)
>   
> >   select max(unique1) from tenk1 where unique1 > 42000;
> >max 
> 
> > I would not be surprised at a change to a parallel-query plan, but there's
> > no parallelism here, so what happened?  This looks like a bug to me.
> > (Also, doing this query without COSTS OFF shows that the newly selected
> > plan actually has a greater estimated cost than the expected plan, which
> > makes it definitely a bug.)
> 
> I looked into this and found that the costs are considered fuzzily the
> same, and then add_path prefers the slightly-worse path on the grounds
> that it is marked parallel_safe while the MinMaxAgg path is not.  It seems
> to me that there is some fuzzy thinking going on there.  On exactly what
> grounds is a path to be preferred merely because it is parallel safe, and
> not actually parallelized?  Or perhaps the question to ask is whether a
> MinMaxAgg path can be marked parallel-safe.

[Action required within 72 hours.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item ("consider
whether MinMaxAggPath might fail to be parallel-safe").  Robert, since you
committed the patch believed to have created it, you own this open item.  If
some other commit is more relevant or if this does not belong as a 9.6 open
item, please let us know.  Otherwise, please observe the policy on open item
ownership[1] and send a status update within 72 hours of this message.
Include a date for your subsequent status update.  Testers may discover new
open items at any time, and I want to plan to get them all fixed well in
advance of shipping 9.6rc1.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-20 Thread Tatsuro Yamada

Hi,

I ran tpc-h's queries on this version and it's successful.
The error is fixed.

commit 705ad7f3b523acae0ddfdebd270b7048b2bb8029
Author: Tom Lane 
Date:   Sun Jun 19 13:11:40 2016 -0400

Regards,
Tatsuro Yamada
NTT OSS Center


On 2016/06/18 1:26, Robert Haas wrote:

On Fri, Jun 17, 2016 at 9:29 AM, Robert Haas  wrote:

On Fri, Jun 17, 2016 at 9:04 AM, Amit Kapila  wrote:

Attached, please find updated patch based on above lines.  Due to addition
of projection path on top of partial paths, some small additional cost is
added for parallel paths. In one of the tests in select_parallel.sql the
plan was getting converted to serial plan from parallel plan due to this
cost, so I have changed it to make it more restrictive.  Before changing, I
have debugged the test to confirm that it was due to this new projection
path cost.  I have added two new tests as well to cover the new code added
by patch.


I'm going to go work on this patch now.


Here is a revised version, which I plan to commit in a few hours
before the workday expires.  I kept it basically as Amit had it, but I
whacked the comments around some and, more substantively, avoided
inserting and/or charging for a Result node when that's not necessary.









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-17 Thread Robert Haas
On Fri, Jun 17, 2016 at 9:29 AM, Robert Haas  wrote:
> On Fri, Jun 17, 2016 at 9:04 AM, Amit Kapila  wrote:
>> Attached, please find updated patch based on above lines.  Due to addition
>> of projection path on top of partial paths, some small additional cost is
>> added for parallel paths. In one of the tests in select_parallel.sql the
>> plan was getting converted to serial plan from parallel plan due to this
>> cost, so I have changed it to make it more restrictive.  Before changing, I
>> have debugged the test to confirm that it was due to this new projection
>> path cost.  I have added two new tests as well to cover the new code added
>> by patch.
>
> I'm going to go work on this patch now.

Here is a revised version, which I plan to commit in a few hours
before the workday expires.  I kept it basically as Amit had it, but I
whacked the comments around some and, more substantively, avoided
inserting and/or charging for a Result node when that's not necessary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index cefec7b..0434a5a 100644
--- a/src/backend/optimizer/plan/planagg.c
+++ b/src/backend/optimizer/plan/planagg.c
@@ -465,7 +465,8 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
 	 * cheapest path.)
 	 */
 	sorted_path = apply_projection_to_path(subroot, final_rel, sorted_path,
-		   create_pathtarget(subroot, tlist));
+		   create_pathtarget(subroot, tlist),
+		   false);
 
 	/*
 	 * Determine cost to get just the first row of the presorted path.
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 07b925e..0af8151 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1500,6 +1500,7 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		PathTarget *grouping_target;
 		PathTarget *scanjoin_target;
 		bool		have_grouping;
+		bool		scanjoin_target_parallel_safe = false;
 		WindowFuncLists *wflists = NULL;
 		List	   *activeWindows = NIL;
 		List	   *rollup_lists = NIL;
@@ -1730,7 +1731,16 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 			scanjoin_target = grouping_target;
 
 		/*
-		 * Forcibly apply that target to all the Paths for the scan/join rel.
+		 * Check whether scan/join target is parallel safe ... unless there
+		 * are no partial paths, in which case we don't care.
+		 */
+		if (current_rel->partial_pathlist &&
+			!has_parallel_hazard((Node *) scanjoin_target->exprs, false))
+			scanjoin_target_parallel_safe = true;
+
+		/*
+		 * Forcibly apply scan/join target to all the Paths for the scan/join
+		 * rel.
 		 *
 		 * In principle we should re-run set_cheapest() here to identify the
 		 * cheapest path, but it seems unlikely that adding the same tlist
@@ -1746,7 +1756,8 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 
 			Assert(subpath->param_info == NULL);
 			path = apply_projection_to_path(root, current_rel,
-			subpath, scanjoin_target);
+			subpath, scanjoin_target,
+			scanjoin_target_parallel_safe);
 			/* If we had to add a Result, path is different from subpath */
 			if (path != subpath)
 			{
@@ -1759,6 +1770,70 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 		}
 
 		/*
+		 * Upper planning steps which make use of the top scan/join rel's
+		 * partial pathlist will expect partial paths for that rel to produce
+		 * the same output as complete paths ... and we just changed the
+		 * output for the complete paths, so we'll need to do the same thing
+		 * for partial paths.
+		 */
+		if (scanjoin_target_parallel_safe)
+		{
+			/*
+			 * Apply the scan/join target to each partial path.  Otherwise,
+			 * anything that attempts to use the partial paths for further
+			 * upper planning may go wrong.
+			 */
+			foreach(lc, current_rel->partial_pathlist)
+			{
+Path	   *subpath = (Path *) lfirst(lc);
+Path	   *newpath;
+
+/*
+ * We can't use apply_projection_to_path() here, because there
+ * could already be pointers to these paths, and therefore
+ * we cannot modify them in place.  Instead, we must use
+ * create_projection_path().  The good news is this won't
+ * actually insert a Result node into the final plan unless
+ * it's needed, but the bad news is that it will charge for
+ * the node whether it's needed or not.  Therefore, if the
+ * target list is already what we need it to be, just leave
+ * this partial path alone.
+ */
+if (equal(scanjoin_target->exprs, subpath->pathtarget->exprs))
+	continue;
+
+Assert(subpath->param_info == NULL);
+newpath = (Path *) create_projection_path(root,
+			 current_rel,
+			 subpath,
+			 scanjoin_target);
+if 

Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-17 Thread Robert Haas
On Fri, Jun 17, 2016 at 9:04 AM, Amit Kapila  wrote:
> Attached, please find updated patch based on above lines.  Due to addition
> of projection path on top of partial paths, some small additional cost is
> added for parallel paths. In one of the tests in select_parallel.sql the
> plan was getting converted to serial plan from parallel plan due to this
> cost, so I have changed it to make it more restrictive.  Before changing, I
> have debugged the test to confirm that it was due to this new projection
> path cost.  I have added two new tests as well to cover the new code added
> by patch.

I'm going to go work on this patch now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-17 Thread Amit Kapila
On Fri, Jun 17, 2016 at 8:18 AM, Amit Kapila 
wrote:
>
> On Fri, Jun 17, 2016 at 3:20 AM, Robert Haas 
wrote:
> >
>
> > Something like what you have there might work if you use
> > create_projection_path instead of apply_projection_to_path.
> >
>
> Okay, I think you mean to say use create_projection_path() while applying
scanjoin_target to partial path lists in below code:
> foreach(lc, current_rel->partial_pathlist)
> {
> Path   *subpath = (Path *) lfirst(lc);
> Assert(subpath->param_info == NULL);
> lfirst(lc) = apply_projection_to_path(root, current_rel,
> subpath, scanjoin_target,
> scanjoin_target_parallel_safe);
> }
>

Attached, please find updated patch based on above lines.  Due to addition
of projection path on top of partial paths, some small additional cost is
added for parallel paths. In one of the tests in select_parallel.sql the
plan was getting converted to serial plan from parallel plan due to this
cost, so I have changed it to make it more restrictive.  Before changing, I
have debugged the test to confirm that it was due to this new projection
path cost.  I have added two new tests as well to cover the new code added
by patch.

Note - Apart from the tests related to new code,  Dilip Kumar has helped to
verify that TPC-H queries also worked fine (he tested using Explain). He
doesn't encounter problem reported above [1] whereas without patch TPCH
queries were failing.


[1] - https://www.postgresql.org/message-id/575E6BE6.8040006%40lab.ntt.co.jp

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_scanjoin_pathtarget_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-17 Thread Andreas Joseph Krogh
På fredag 17. juni 2016 kl. 08:14:39, skrev Amit Kapila >:
On Fri, Jun 17, 2016 at 11:39 AM, Andreas Joseph Krogh > wrote: På torsdag 16. juni 2016 kl. 20:19:44, 
skrev Tom Lane >:
Amit Kapila 
> writes:
 > On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane > wrote:
 >> min_parallel_relation_size, or min_parallelizable_relation_size, or
 >> something like that?

 > You are right that such a variable will make it simpler to write tests for
 > parallel query.  I have implemented such a guc and choose to keep the name
 > as min_parallel_relation_size.

 Pushed with minor adjustments.  My first experiments with this say that
 we should have done this long ago:
https://www.postgresql.org/message-id/22782.1466100...@sss.pgh.pa.us 


 > One thing to note is that in function
 > create_plain_partial_paths(), curently it is using PG_INT32_MAX/3 for
 > parallel_threshold to check for overflow, I have changed it to INT_MAX/3 so
 > as to be consistent with guc.c.  I am not sure if it is advisable to use
 > PG_INT32_MAX in guc.c as other similar parameters use INT_MAX.

 I agree that using INT_MAX is more consistent with the code elsewhere in
 guc.c, and more correct given that we declare the variable in question
 as int not int32.  But you need to include  to use INT_MAX ...

 regards, tom lane
 


As of 4c56f3269a84a81461cc53941e0eee02fc920ab6 I'm still getting it in one of 
my queries:
ORDER/GROUP BY expression not found in targetlist
 
I am working on preparing a patch to fix this issue.
 
 
Am I missing something?
 

 No, the fix is still not committed.



 
Ah, I thought Tom pushed a fix, but it apparently was another fix.
 
Thanks for fixing.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 


 


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-17 Thread Amit Kapila
On Fri, Jun 17, 2016 at 11:39 AM, Andreas Joseph Krogh 
wrote:

> På torsdag 16. juni 2016 kl. 20:19:44, skrev Tom Lane :
>
> Amit Kapila  writes:
> > On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane  wrote:
> >> min_parallel_relation_size, or min_parallelizable_relation_size, or
> >> something like that?
>
> > You are right that such a variable will make it simpler to write tests
> for
> > parallel query.  I have implemented such a guc and choose to keep the
> name
> > as min_parallel_relation_size.
>
> Pushed with minor adjustments.  My first experiments with this say that
> we should have done this long ago:
> https://www.postgresql.org/message-id/22782.1466100...@sss.pgh.pa.us
>
> > One thing to note is that in function
> > create_plain_partial_paths(), curently it is using PG_INT32_MAX/3 for
> > parallel_threshold to check for overflow, I have changed it to INT_MAX/3
> so
> > as to be consistent with guc.c.  I am not sure if it is advisable to use
> > PG_INT32_MAX in guc.c as other similar parameters use INT_MAX.
>
> I agree that using INT_MAX is more consistent with the code elsewhere in
> guc.c, and more correct given that we declare the variable in question
> as int not int32.  But you need to include  to use INT_MAX ...
>
> regards, tom lane
>
>
> As of 4c56f3269a84a81461cc53941e0eee02fc920ab6 I'm still getting it in one
> of my queries:
> ORDER/GROUP BY expression not found in targetlist
>

I am working on preparing a patch to fix this issue.


> Am I missing something?
>

No, the fix is still not committed.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-17 Thread Andreas Joseph Krogh
På torsdag 16. juni 2016 kl. 20:19:44, skrev Tom Lane >:
Amit Kapila  writes:
 > On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane  wrote:
 >> min_parallel_relation_size, or min_parallelizable_relation_size, or
 >> something like that?

 > You are right that such a variable will make it simpler to write tests for
 > parallel query.  I have implemented such a guc and choose to keep the name
 > as min_parallel_relation_size.

 Pushed with minor adjustments.  My first experiments with this say that
 we should have done this long ago:
 https://www.postgresql.org/message-id/22782.1466100...@sss.pgh.pa.us

 > One thing to note is that in function
 > create_plain_partial_paths(), curently it is using PG_INT32_MAX/3 for
 > parallel_threshold to check for overflow, I have changed it to INT_MAX/3 so
 > as to be consistent with guc.c.  I am not sure if it is advisable to use
 > PG_INT32_MAX in guc.c as other similar parameters use INT_MAX.

 I agree that using INT_MAX is more consistent with the code elsewhere in
 guc.c, and more correct given that we declare the variable in question
 as int not int32.  But you need to include  to use INT_MAX ...

 regards, tom lane
 
As of 4c56f3269a84a81461cc53941e0eee02fc920ab6 I'm still getting it in one of 
my queries:
ORDER/GROUP BY expression not found in targetlist
  
Am I missing something?
 
I could dig into this further to reproduce if necessary.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 


 


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Amit Kapila
On Fri, Jun 17, 2016 at 3:20 AM, Robert Haas  wrote:
>
> On Thu, Jun 16, 2016 at 12:16 PM, Robert Haas 
wrote:
> > On Thu, Jun 16, 2016 at 8:00 AM, Amit Kapila 
wrote:
> >>> 1. The case originally reported by Thomas Munro still fails.  To fix
> >>> that, we probably need to apply scanjoin_target to each partial path.
> >>> But we can only do that if it's parallel-safe.  It seems like what we
> >>> want is something like this: (1) During scan/join planning, somehow
> >>> skip calling generate_gather_paths for the topmost scan/join rel as we
> >>> do to all the others.  (2) If scanjoin_target is not parallel-safe,
> >>> build a path for the scan/join phase that applies a Gather node to the
> >>> cheapest path and does projection at the Gather node.  Then forget all
> >>> the partial paths so we can't do any bogus upper planning.  (3) If
> >>> scanjoin_target is parallel-safe, replace the list of partial paths
> >>> for the topmost scan/join rel with a new list where scanjoin_target
> >>> has been applied to each one.  I haven't tested this so I might be
> >>> totally off-base about what's actually required here...
> >>
> >> I think we can achieve it just by doing something like what you have
> >> mentioned in (2) and (3).  I am not sure if there is a need to skip
> >> generation of gather paths for top scan/join node.  Please see the
patch
> >> attached.  I have just done some minimal testing to ensure that problem
> >> reported by Thomas Munro in this thread is fixed and verified that the
fix
> >> is sane for problems [1][2] reported by sqlsmith. If you think this is
on
> >> right lines, I can try to do more verification and try to add tests.
> >
> > You can't do it this way because of the issue Tom discussed here:
> >
> > https://www.postgresql.org/message-id/16421.1465828...@sss.pgh.pa.us
>

I think although we are always adding a projection path in
apply_projection_to_path() to subpath of Gather path if target is parallel
safe, still they can be used directly if create_projection_plan() doesn't
add a Result node.  So changing them directly after being pushed below
Gather is not a safe bet.

> Something like what you have there might work if you use
> create_projection_path instead of apply_projection_to_path.
>

Okay, I think you mean to say use create_projection_path() while applying
scanjoin_target to partial path lists in below code:
foreach(lc, current_rel->partial_pathlist)
{
Path   *subpath = (Path *) lfirst(lc);
Assert(subpath->param_info == NULL);
lfirst(lc) = apply_projection_to_path(root, current_rel,
subpath, scanjoin_target,
scanjoin_target_parallel_safe);
}

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Tom Lane
Amit Kapila  writes:
> On Thu, Jun 16, 2016 at 11:49 PM, Tom Lane  wrote:
>> But you need to include  to use INT_MAX ...

> I wonder why it has not given me any compilation error/warning.  I have
> tried it on both Windows and CentOS, before sending the patch.

Some platforms seem to make it available from other headers too.
But AFAIK, POSIX only requires  to provide it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Amit Kapila
On Thu, Jun 16, 2016 at 11:49 PM, Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane  wrote:
> >> min_parallel_relation_size, or min_parallelizable_relation_size, or
> >> something like that?
>
> > You are right that such a variable will make it simpler to write tests
for
> > parallel query.  I have implemented such a guc and choose to keep the
name
> > as min_parallel_relation_size.
>
> Pushed with minor adjustments.


Thanks.

> > One thing to note is that in function
> > create_plain_partial_paths(), curently it is using PG_INT32_MAX/3 for
> > parallel_threshold to check for overflow, I have changed it to
INT_MAX/3 so
> > as to be consistent with guc.c.  I am not sure if it is advisable to use
> > PG_INT32_MAX in guc.c as other similar parameters use INT_MAX.
>
> I agree that using INT_MAX is more consistent with the code elsewhere in
> guc.c, and more correct given that we declare the variable in question
> as int not int32.  But you need to include  to use INT_MAX ...
>

I wonder why it has not given me any compilation error/warning.  I have
tried it on both Windows and CentOS, before sending the patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Robert Haas
On Thu, Jun 16, 2016 at 12:16 PM, Robert Haas  wrote:
> On Thu, Jun 16, 2016 at 8:00 AM, Amit Kapila  wrote:
>>> 1. The case originally reported by Thomas Munro still fails.  To fix
>>> that, we probably need to apply scanjoin_target to each partial path.
>>> But we can only do that if it's parallel-safe.  It seems like what we
>>> want is something like this: (1) During scan/join planning, somehow
>>> skip calling generate_gather_paths for the topmost scan/join rel as we
>>> do to all the others.  (2) If scanjoin_target is not parallel-safe,
>>> build a path for the scan/join phase that applies a Gather node to the
>>> cheapest path and does projection at the Gather node.  Then forget all
>>> the partial paths so we can't do any bogus upper planning.  (3) If
>>> scanjoin_target is parallel-safe, replace the list of partial paths
>>> for the topmost scan/join rel with a new list where scanjoin_target
>>> has been applied to each one.  I haven't tested this so I might be
>>> totally off-base about what's actually required here...
>>
>> I think we can achieve it just by doing something like what you have
>> mentioned in (2) and (3).  I am not sure if there is a need to skip
>> generation of gather paths for top scan/join node.  Please see the patch
>> attached.  I have just done some minimal testing to ensure that problem
>> reported by Thomas Munro in this thread is fixed and verified that the fix
>> is sane for problems [1][2] reported by sqlsmith. If you think this is on
>> right lines, I can try to do more verification and try to add tests.
>
> You can't do it this way because of the issue Tom discussed here:
>
> https://www.postgresql.org/message-id/16421.1465828...@sss.pgh.pa.us

Something like what you have there might work if you use
create_projection_path instead of apply_projection_to_path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane  wrote:
>> min_parallel_relation_size, or min_parallelizable_relation_size, or
>> something like that?

> You are right that such a variable will make it simpler to write tests for
> parallel query.  I have implemented such a guc and choose to keep the name
> as min_parallel_relation_size.

Pushed with minor adjustments.  My first experiments with this say that
we should have done this long ago:
https://www.postgresql.org/message-id/22782.1466100...@sss.pgh.pa.us

> One thing to note is that in function
> create_plain_partial_paths(), curently it is using PG_INT32_MAX/3 for
> parallel_threshold to check for overflow, I have changed it to INT_MAX/3 so
> as to be consistent with guc.c.  I am not sure if it is advisable to use
> PG_INT32_MAX in guc.c as other similar parameters use INT_MAX.

I agree that using INT_MAX is more consistent with the code elsewhere in
guc.c, and more correct given that we declare the variable in question
as int not int32.  But you need to include  to use INT_MAX ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Robert Haas
On Thu, Jun 16, 2016 at 12:50 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Thu, Jun 16, 2016 at 9:26 AM, Robert Haas  wrote:
>>> 3. In https://www.postgresql.org/message-id/25521.1465837...@sss.pgh.pa.us
>>> Tom proposed adding a GUC to set the minimum relation size required
>>> for consideration of parallelism.
>
>> I have posted a patch for this upthread [3].  The main thing to verify is
>> the default value of guc.
>
> I'll review and push this patch, unless Robert or somebody else has
> already started on it.

Great, it's all yours.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Tom Lane
Amit Kapila  writes:
> On Thu, Jun 16, 2016 at 9:26 AM, Robert Haas  wrote:
>> 3. In https://www.postgresql.org/message-id/25521.1465837...@sss.pgh.pa.us
>> Tom proposed adding a GUC to set the minimum relation size required
>> for consideration of parallelism.

> I have posted a patch for this upthread [3].  The main thing to verify is
> the default value of guc.

I'll review and push this patch, unless Robert or somebody else has
already started on it.  I concur with your choice of setting the default
value to 1024 blocks; that's close to the existing behavior but as a power
of 2 will look cleaner in GUC size units.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Robert Haas
On Thu, Jun 16, 2016 at 8:00 AM, Amit Kapila  wrote:
>> 1. The case originally reported by Thomas Munro still fails.  To fix
>> that, we probably need to apply scanjoin_target to each partial path.
>> But we can only do that if it's parallel-safe.  It seems like what we
>> want is something like this: (1) During scan/join planning, somehow
>> skip calling generate_gather_paths for the topmost scan/join rel as we
>> do to all the others.  (2) If scanjoin_target is not parallel-safe,
>> build a path for the scan/join phase that applies a Gather node to the
>> cheapest path and does projection at the Gather node.  Then forget all
>> the partial paths so we can't do any bogus upper planning.  (3) If
>> scanjoin_target is parallel-safe, replace the list of partial paths
>> for the topmost scan/join rel with a new list where scanjoin_target
>> has been applied to each one.  I haven't tested this so I might be
>> totally off-base about what's actually required here...
>
> I think we can achieve it just by doing something like what you have
> mentioned in (2) and (3).  I am not sure if there is a need to skip
> generation of gather paths for top scan/join node.  Please see the patch
> attached.  I have just done some minimal testing to ensure that problem
> reported by Thomas Munro in this thread is fixed and verified that the fix
> is sane for problems [1][2] reported by sqlsmith. If you think this is on
> right lines, I can try to do more verification and try to add tests.

You can't do it this way because of the issue Tom discussed here:

https://www.postgresql.org/message-id/16421.1465828...@sss.pgh.pa.us

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-16 Thread Amit Kapila
On Thu, Jun 16, 2016 at 9:26 AM, Robert Haas  wrote:
>
> On Tue, Jun 14, 2016 at 12:18 PM, Tom Lane  wrote:
> > Amit Kapila  writes:
> >> On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane  wrote:
> >>> ...  I got a core dump in the window.sql test:
> >>> which I think may be another manifestation of the
failure-to-apply-proper-
> >>> pathtarget issue we're looking at in this thread.  Or maybe it's just
> >>> an unjustified assumption in make_partialgroup_input_target that the
> >>> input path must always have some sortgrouprefs assigned.
> >
> >> I don't see this problem after your recent commit
> >> - 89d53515e53ea080029894939118365b647489b3.  Are you still getting it?
> >
> > No.  I am suspicious that there's still a bug there, ie we're looking at
> > the wrong pathtarget; but the regression test doesn't actually crash.
> > That might only be because we don't choose the parallelized path.
>
> OK, so there are quite a number of separate things here:
>
> 1. The case originally reported by Thomas Munro still fails.  To fix
> that, we probably need to apply scanjoin_target to each partial path.
> But we can only do that if it's parallel-safe.  It seems like what we
> want is something like this: (1) During scan/join planning, somehow
> skip calling generate_gather_paths for the topmost scan/join rel as we
> do to all the others.  (2) If scanjoin_target is not parallel-safe,
> build a path for the scan/join phase that applies a Gather node to the
> cheapest path and does projection at the Gather node.  Then forget all
> the partial paths so we can't do any bogus upper planning.  (3) If
> scanjoin_target is parallel-safe, replace the list of partial paths
> for the topmost scan/join rel with a new list where scanjoin_target
> has been applied to each one.  I haven't tested this so I might be
> totally off-base about what's actually required here...
>

I think we can achieve it just by doing something like what you have
mentioned in (2) and (3).  I am not sure if there is a need to skip
generation of gather paths for top scan/join node.  Please see the patch
attached.  I have just done some minimal testing to ensure that problem
reported by Thomas Munro in this thread is fixed and verified that the fix
is sane for problems [1][2] reported by sqlsmith. If you think this is on
right lines, I can try to do more verification and try to add tests.

> 2. In https://www.postgresql.org/message-id/15695.1465827...@sss.pgh.pa.us
> Tom raised the issue that we need some test cases covering this area.
>
> 3. In https://www.postgresql.org/message-id/25521.1465837...@sss.pgh.pa.us
> Tom proposed adding a GUC to set the minimum relation size required
> for consideration of parallelism.
>

I have posted a patch for this upthread [3].  The main thing to verify is
the default value of guc.


[1] -
https://www.postgresql.org/message-id/CAA4eK1Ky2=HsTsT4hmfL=eal5rv0_t59tvwzvk9hqkvn6do...@mail.gmail.com
[2] -
https://www.postgresql.org/message-id/CAFiTN-vzg5BkK6kAh3OMhvgRu-uJvkjz47ybtopMAfGJp=z...@mail.gmail.com
[3] -
https://www.postgresql.org/message-id/CA%2BkptmAU4xkzBpd8tie3X6o9_tE2oKm-0kDn8%2BZOF%3D2_qOMZNA%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_scanjoin_pathtarget_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-15 Thread Robert Haas
On Tue, Jun 14, 2016 at 12:18 PM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane  wrote:
>>> ...  I got a core dump in the window.sql test:
>>> which I think may be another manifestation of the failure-to-apply-proper-
>>> pathtarget issue we're looking at in this thread.  Or maybe it's just
>>> an unjustified assumption in make_partialgroup_input_target that the
>>> input path must always have some sortgrouprefs assigned.
>
>> I don't see this problem after your recent commit
>> - 89d53515e53ea080029894939118365b647489b3.  Are you still getting it?
>
> No.  I am suspicious that there's still a bug there, ie we're looking at
> the wrong pathtarget; but the regression test doesn't actually crash.
> That might only be because we don't choose the parallelized path.

OK, so there are quite a number of separate things here:

1. The case originally reported by Thomas Munro still fails.  To fix
that, we probably need to apply scanjoin_target to each partial path.
But we can only do that if it's parallel-safe.  It seems like what we
want is something like this: (1) During scan/join planning, somehow
skip calling generate_gather_paths for the topmost scan/join rel as we
do to all the others.  (2) If scanjoin_target is not parallel-safe,
build a path for the scan/join phase that applies a Gather node to the
cheapest path and does projection at the Gather node.  Then forget all
the partial paths so we can't do any bogus upper planning.  (3) If
scanjoin_target is parallel-safe, replace the list of partial paths
for the topmost scan/join rel with a new list where scanjoin_target
has been applied to each one.  I haven't tested this so I might be
totally off-base about what's actually required here...

2. In https://www.postgresql.org/message-id/15695.1465827...@sss.pgh.pa.us
Tom raised the issue that we need some test cases covering this area.

3. In https://www.postgresql.org/message-id/25521.1465837...@sss.pgh.pa.us
Tom proposed adding a GUC to set the minimum relation size required
for consideration of parallelism.

4. In https://www.postgresql.org/message-id/1597.1465846...@sss.pgh.pa.us
Tom raised the question of whether there is a danger that
MinMaxAggPath might not be parallel-safe.

5. In https://www.postgresql.org/message-id/20391.1465850...@sss.pgh.pa.us
Tom proposed a patch to fix an apparent confusion on my part between
subplans and subqueries.

6. In 
https://www.postgresql.org/message-id/ca+tgmozwjb9eaibj-meeaq913wkkoz4aq7vqncfrdls9wyh...@mail.gmail.com
I discussed the need to fix the way consider_parallel is set for upper
rels, and in 
https://www.postgresql.org/message-id/22832.1465854...@sss.pgh.pa.us
Tom observed that, once that work is done, we can get rid of the
wholePlanParallelSafe flag.

I don't think it's remotely realistic to get all of this fixed in time
for beta2; I think we'll be lucky if we can get #1 fixed.  But we'd
better track all of these issues so that we can crank through them
later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-15 Thread Amit Kapila
On Tue, Jun 14, 2016 at 4:48 PM, Amit Kapila 
wrote:

> On Mon, Jun 13, 2016 at 8:11 PM, Tom Lane  wrote:
>
> >  I do
> > not share your confidence that using apply_projection_to_path within
> > create_grouping_paths is free of such a hazard.
> >
>
> Fair enough, let me try to explain the problem and someways to solve it in
> some more detail.  The main thing that got missed by me in the patch
> related to commit-04ae11f62 is that the partial path list of rel also needs
> to have a scanjoin_target. I was under assumption that
> create_grouping_paths will take care of assigning appropriate Path targets
> for the parallel paths generated by it.  If we see, create_grouping_paths()
> do take care of adding the appropriate path targets for the paths generated
> by that function.  However, it doesn't do anything for partial paths.   The
> patch sent by me yesterday [1] which was trying to assign
> partial_grouping_target to partial paths won't be the right fix, because
> (a) partial_grouping_target includes Aggregates (refer
> make_partialgroup_input_target) which we don't want for partial paths; (b)
> it is formed using grouping_target passed in function
> create_grouping_paths() whereas we need the path target formed from
> final_target for partial paths (as partial paths are scanjoin relations)
>  as is done for main path list (in grouping_planner(),  /* Forcibly apply
> that target to all the Paths for the scan/join rel.*/).   Now, I think we
> have following ways to solve it:
>
> (a) check whether the scanjoin_target contains any parallel-restricted
> clause before applying the same to partial path list in grouping_planner.
> However it could lead to duplicate checks in some cases for
> parallel-restricted clause, once in apply_projection_to_path() for main
> pathlist and then again before applying to partial paths.  I think we can
> avoid that by having an additional parameter in apply_projection_to_path()
> which can indicate if the check for parallel-safety is done inside that
> function and what is the result of same.
>
> (b) generate the appropriate scanjoin_target for partial path list in
> create_grouping_paths using final_target. However I think this might lead
> to some duplicate code in create_grouping_paths() as we might have to some
> thing similar to what we have done in grouping_planner for generating such
> a target.  I think if we want we can pass scanjoin_target to
> create_grouping_paths() as well.   Again, we have to check for
> parallel-safety for scanjoin_target before applying it to partial paths,
> but we need to do that only when grouped_rel is considered parallel-safe.
>
>
One more idea could be to have a flag (say parallel_safe) in PathTarget and
set it once we have ensured that the expressions in target doesn't contain
any parallel-restricted entry.  In
create_pathtarget()/set_pathtarget_cost_width(),  if the parallelmodeOK
flag is set, then we can evaluate target expressions for
parallel-restricted expressions and set the flag accordingly.  Now, this
has the benefit that we don't need to evaluate the expressions of targets
for parallel-restricted clauses again and again.  I think this way if the
flag is set once for final_target in grouping_planner, we don't need to
evaluate it again for other targets (grouping_target, scanjoin_target,
etc.) as those are derived from final_target.  Similarly, I think we need
to set ReplOptInfo->reltarget and others as required.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane  wrote:
>> ...  I got a core dump in the window.sql test:
>> which I think may be another manifestation of the failure-to-apply-proper-
>> pathtarget issue we're looking at in this thread.  Or maybe it's just
>> an unjustified assumption in make_partialgroup_input_target that the
>> input path must always have some sortgrouprefs assigned.

> I don't see this problem after your recent commit
> - 89d53515e53ea080029894939118365b647489b3.  Are you still getting it?

No.  I am suspicious that there's still a bug there, ie we're looking at
the wrong pathtarget; but the regression test doesn't actually crash.
That might only be because we don't choose the parallelized path.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 6:21 PM, Tom Lane  wrote:
>> I think this is bad because it forces any external creators of
>> UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody
>> if anyone is out of sync on whether to set the flag.  So I'd rather keep
>> set_grouped_rel_consider_parallel(), even if all it does is the above.
>> And make it global not static.  Ditto for the other upperrels.

> I'm slightly mystified by this because we really shouldn't be setting
> that flag more than once.  We don't want to do that work repeatedly,
> just once, and prior to adding any paths to the rel.  Are you
> imagining that somebody would try to created grouped paths before we
> finish scan/join planning?

Exactly.  The original pathification design contemplated that FDWs and
other extensions could inject Paths for the upperrels whenever they felt
like it, specifically during relation path building.  Even if you think
that's silly, it *must* be possible to do it during GetForeignUpperPaths,
else that hook is completely useless.  Therefore, adding any data other
than new Paths to the upperrel during create_grouping_paths is a broken
design unless there is some easy, reliable, and not too expensive way for
an FDW to make the same thing happen a bit earlier.  See the commentary in
optimizer/README starting at line 922.

I generally don't like rel->consider_parallel at all for basically this
reason, but it may be too late to undo that aspect of the parallel join
planning design.  I will settle for having an API call that allows FDWs
to get the flag set correctly.  Another possibility is to set the flag
before calling GetForeignUpperPaths, but that might be messy too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread Amit Kapila
On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
> >> BTW, decent regression tests could be written without the need to
create
> >> enormous tables if the minimum rel size in create_plain_partial_paths()
> >> could be configured to something less than 1000 blocks.  I think it's
> >> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> >> we make it a GUC?
>
> > That was proposed before, and I didn't do it mostly because I couldn't
> > think of a name for it that didn't sound unbelievably corny.
>
> min_parallel_relation_size, or min_parallelizable_relation_size, or
> something like that?
>

You are right that such a variable will make it simpler to write tests for
parallel query.  I have implemented such a guc and choose to keep the name
as min_parallel_relation_size.  One thing to note is that in function
create_plain_partial_paths(), curently it is using PG_INT32_MAX/3 for
parallel_threshold to check for overflow, I have changed it to INT_MAX/3 so
as to be consistent with guc.c.  I am not sure if it is advisable to use
PG_INT32_MAX in guc.c as other similar parameters use INT_MAX.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


min_parallel_relation_size_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread Tom Lane
David Rowley  writes:
> Do you think it's worth also applying the attached so as to have
> ressortgroupref set to NULL more consistently, instead of sometimes
> NULL and other times allocated to memory wastefully filled with zeros?

Meh --- that seems to add complication without buying a whole lot.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread Amit Kapila
On Tue, Jun 14, 2016 at 2:16 AM, Tom Lane  wrote:
>
> Robert Haas  writes:
> > In practice, we don't yet have the ability for
> > parallel-safe paths from subqueries to affect planning at higher query
> > levels, but that's because the pathification stuff landed too late in
> > the cycle for me to fully react to it.
>
> I wonder if that's not just from confusion between subplans and
> subqueries.
>

Won't the patch as written will allow parallel-restricted things to be
pushed to workers for UNION ALL kind of queries?

Explain verbose Select * from (SELECT c1+1 FROM t1 UNION ALL SELECT c1+1
FROM t2 where c1 < 10) ss(a);

In above kind of queries, set_rel_consider_parallel() might set
consider_parallel as true for rel, but later in set_append_rel_size(), it
might pull some unsafe clauses in target of childrel.   Basically, I am
wondering about the same problem as we discussed here [1].


[1] -
https://www.postgresql.org/message-id/CAA4eK1Jz5tG2D2PCNYqYob2cb4dKowKYwVJ7OmP5vwyRyCMx4g%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread Amit Kapila
On Mon, Jun 13, 2016 at 8:11 PM, Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane  wrote:
> >> I think the real question here is why the code removed by 04ae11f62
> >> was wrong.  It was unsafe to use apply_projection_to_path, certainly,
> >> but using create_projection_path directly would have avoided the
> >> stated problem.  And it's very unclear that this new patch doesn't
> >> bring back that bug in a different place.
>
> > This new patch still doesn't seem to be right, but it won't bring back
the
> > original problem because apply_projection_to_path will be only done if
> > grouped_rel is parallel_safe which means it doesn't have any
> > parallel-unsafe or parallel-restricted clause in quals or target list.
>
> The problem cited in 04ae11f62's commit message is that
> apply_projection_to_path would overwrite the paths' pathtargets in-place,
> causing problems if they'd been used for other purposes elsewhere.
>

The main reason for removal of that code is that because there was no check
there to prevent assigning of parallel-restricted clauses to pathtarget of
partial paths.  I think the same is indicated in commit message as well, if
we focus on below part of commit message:
 "especially because this code has no check that the PathTarget is in fact
parallel-safe."

Due to above reason, I don't see how the suggestion given by you to avoid
the problem by using create_projection_path can work.

>  I do
> not share your confidence that using apply_projection_to_path within
> create_grouping_paths is free of such a hazard.
>

Fair enough, let me try to explain the problem and someways to solve it in
some more detail.  The main thing that got missed by me in the patch
related to commit-04ae11f62 is that the partial path list of rel also needs
to have a scanjoin_target. I was under assumption that
create_grouping_paths will take care of assigning appropriate Path targets
for the parallel paths generated by it.  If we see, create_grouping_paths()
do take care of adding the appropriate path targets for the paths generated
by that function.  However, it doesn't do anything for partial paths.   The
patch sent by me yesterday [1] which was trying to assign
partial_grouping_target to partial paths won't be the right fix, because
(a) partial_grouping_target includes Aggregates (refer
make_partialgroup_input_target) which we don't want for partial paths; (b)
it is formed using grouping_target passed in function
create_grouping_paths() whereas we need the path target formed from
final_target for partial paths (as partial paths are scanjoin relations)
 as is done for main path list (in grouping_planner(),  /* Forcibly apply
that target to all the Paths for the scan/join rel.*/).   Now, I think we
have following ways to solve it:

(a) check whether the scanjoin_target contains any parallel-restricted
clause before applying the same to partial path list in grouping_planner.
However it could lead to duplicate checks in some cases for
parallel-restricted clause, once in apply_projection_to_path() for main
pathlist and then again before applying to partial paths.  I think we can
avoid that by having an additional parameter in apply_projection_to_path()
which can indicate if the check for parallel-safety is done inside that
function and what is the result of same.

(b) generate the appropriate scanjoin_target for partial path list in
create_grouping_paths using final_target. However I think this might lead
to some duplicate code in create_grouping_paths() as we might have to some
thing similar to what we have done in grouping_planner for generating such
a target.  I think if we want we can pass scanjoin_target to
create_grouping_paths() as well.   Again, we have to check for
parallel-safety for scanjoin_target before applying it to partial paths,
but we need to do that only when grouped_rel is considered parallel-safe.

Thoughts?


[1] -
https://www.postgresql.org/message-id/CAA4eK1Jm6HrJAPX26xyLGWes%2B%2Br5%3DdOyOGRWeTa4q8NKd-UhVQ%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-14 Thread David Rowley
On 14 June 2016 at 04:07, Tom Lane  wrote:
> Just as an experiment to see what would happen, I did
>
> -   int parallel_threshold = 1000;
> +   int parallel_threshold = 1;
>
> and ran the regression tests.  I got a core dump in the window.sql test:
>
> Program terminated with signal 11, Segmentation fault.
> #0  0x00664dbc in make_partialgroup_input_target (root=0x1795018,
> input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
> rollup_groupclauses=0x0) at planner.c:4307
> 4307Index   sgref = 
> final_target->sortgrouprefs[i];
> (gdb) bt
> #0  0x00664dbc in make_partialgroup_input_target (root=0x1795018,
> input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
> rollup_groupclauses=0x0) at planner.c:4307
> #1  create_grouping_paths (root=0x1795018, input_rel=0x17957a8,
> target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0)
> at planner.c:3420
> #2  0x00667405 in grouping_planner (root=0x1795018,
> inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
> #3  0x00668c80 in subquery_planner (glob=,
> parse=0x1703580, parent_root=,
> hasRecursion=, tuple_fraction=0) at planner.c:769
> #4  0x00668ea5 in standard_planner (parse=0x1703580,
> cursorOptions=256, boundParams=0x0) at planner.c:308
> #5  0x006691b6 in planner (parse=,
> cursorOptions=, boundParams=)
> at planner.c:178
> #6  0x006fb069 in pg_plan_query (querytree=0x1703580,
> cursorOptions=256, boundParams=0x0) at postgres.c:798
> (gdb) p debug_query_string
> $1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"

I see you've committed a fix for this. Thank you.

I thought it would be good to be consistent with the ressortgroupref
handling, and I quite like your fix in that regard.

Do you think it's worth also applying the attached so as to have
ressortgroupref set to NULL more consistently, instead of sometimes
NULL and other times allocated to memory wastefully filled with zeros?

The patch also saved on allocating the array's memory when it's not needed.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


make_pathtarget_from_tlist.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane  wrote:
>
> I wrote:
> > Amit Kapila  writes:
> >> It is slightly tricky to write a reproducible parallel-query test, but
> >> point taken and I think we should try to have a test unless such a
test is
> >> really time consuming.
>
> > BTW, decent regression tests could be written without the need to create
> > enormous tables if the minimum rel size in create_plain_partial_paths()
> > could be configured to something less than 1000 blocks.  I think it's
> > fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> > we make it a GUC?
>
> Just as an experiment to see what would happen, I did
>
> -   int parallel_threshold = 1000;
> +   int parallel_threshold = 1;
>
> and ran the regression tests.  I got a core dump in the window.sql test:
>
> Program terminated with signal 11, Segmentation fault.
> #0  0x00664dbc in make_partialgroup_input_target (root=0x1795018,
> input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
> rollup_groupclauses=0x0) at planner.c:4307
> 4307Index   sgref =
final_target->sortgrouprefs[i];
> (gdb) bt
> #0  0x00664dbc in make_partialgroup_input_target (root=0x1795018,
> input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0,
> rollup_groupclauses=0x0) at planner.c:4307
> #1  create_grouping_paths (root=0x1795018, input_rel=0x17957a8,
> target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0)
> at planner.c:3420
> #2  0x00667405 in grouping_planner (root=0x1795018,
> inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
> #3  0x00668c80 in subquery_planner (glob=,
> parse=0x1703580, parent_root=,
> hasRecursion=, tuple_fraction=0) at planner.c:769
> #4  0x00668ea5 in standard_planner (parse=0x1703580,
> cursorOptions=256, boundParams=0x0) at planner.c:308
> #5  0x006691b6 in planner (parse=,
> cursorOptions=, boundParams=)
> at planner.c:178
> #6  0x006fb069 in pg_plan_query (querytree=0x1703580,
> cursorOptions=256, boundParams=0x0) at postgres.c:798
> (gdb) p debug_query_string
> $1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"
>
> which I think may be another manifestation of the failure-to-apply-proper-
> pathtarget issue we're looking at in this thread.  Or maybe it's just
> an unjustified assumption in make_partialgroup_input_target that the
> input path must always have some sortgrouprefs assigned.
>

I don't see this problem after your recent commit
- 89d53515e53ea080029894939118365b647489b3.  Are you still getting it?

>
> Before getting to that point, there was also an unexplainable plan change:
>

Yeah, I am also seeing such a plan change, but I this is a separate thing
to investigate.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 6:21 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Again, please have a look at the patch and see if it seems unsuitable
>> to you for some reason.  I'm not confident of my ability to get all of
>> this path stuff right without a bit of help from the guy who designed
>> it.
>
> I'm kind of in a bind right now because Tomas has produced an
> FK-selectivity patch rewrite on schedule, and I already committed to
> review that before beta2, so I better go do that first.  If you can wait
> awhile I will try to help out more with parallel query.

I'm happy to have you look at his patch first.

> Having said that, the main thing I noticed in your draft patch is that
> you'd removed set_grouped_rel_consider_parallel() in favor of hard-wiring
> this in create_grouping_paths():
>
> +   if (input_rel->consider_parallel &&
> +   !has_parallel_hazard((Node *) target->exprs, false) &&
> +   !has_parallel_hazard((Node *) parse->havingQual, false))
> +   grouped_rel->consider_parallel = true;
>
> I think this is bad because it forces any external creators of
> UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody
> if anyone is out of sync on whether to set the flag.  So I'd rather keep
> set_grouped_rel_consider_parallel(), even if all it does is the above.
> And make it global not static.  Ditto for the other upperrels.

I'm slightly mystified by this because we really shouldn't be setting
that flag more than once.  We don't want to do that work repeatedly,
just once, and prior to adding any paths to the rel.  Are you
imagining that somebody would try to created grouped paths before we
finish scan/join planning?

> Also, I wonder whether we could reduce that test to just the
> has_parallel_hazard tests, so as to avoid the ordering dependency of
> needing to create the top scan/join rel (and set its consider_parallel
> flag) before we can create the UPPERREL_GROUP_AGG rel.  This would mean
> putting more dependency on per-path parallel_safe flags to detect whether
> you can't parallelize a given step for lack of parallel-safe input, but
> I'm not sure that that's a bad thing.

It doesn't sound like an especially good thing to me.  Skipping all
parallel path generation is quite a bit less work than trying each
path in turn and realizing that none of them will work, and there are
various places where we optimize on that basis.  I don't understand,
in any event, why it makes any sense to create the UPPERREL_GROUP_AGG
rel before we finish scan/join planning.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> Again, please have a look at the patch and see if it seems unsuitable
> to you for some reason.  I'm not confident of my ability to get all of
> this path stuff right without a bit of help from the guy who designed
> it.

I'm kind of in a bind right now because Tomas has produced an
FK-selectivity patch rewrite on schedule, and I already committed to
review that before beta2, so I better go do that first.  If you can wait
awhile I will try to help out more with parallel query.

Having said that, the main thing I noticed in your draft patch is that
you'd removed set_grouped_rel_consider_parallel() in favor of hard-wiring
this in create_grouping_paths():

+   if (input_rel->consider_parallel &&
+   !has_parallel_hazard((Node *) target->exprs, false) &&
+   !has_parallel_hazard((Node *) parse->havingQual, false))
+   grouped_rel->consider_parallel = true;

I think this is bad because it forces any external creators of
UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody
if anyone is out of sync on whether to set the flag.  So I'd rather keep
set_grouped_rel_consider_parallel(), even if all it does is the above.
And make it global not static.  Ditto for the other upperrels.

Also, I wonder whether we could reduce that test to just the
has_parallel_hazard tests, so as to avoid the ordering dependency of
needing to create the top scan/join rel (and set its consider_parallel
flag) before we can create the UPPERREL_GROUP_AGG rel.  This would mean
putting more dependency on per-path parallel_safe flags to detect whether
you can't parallelize a given step for lack of parallel-safe input, but
I'm not sure that that's a bad thing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:46 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> One problem that I've realized that is related to this is that the way
>> that the consider_parallel flag is being set for upper rels is almost
>> totally bogus, which I believe accounts for your complaint at PGCon
>> that force_parallel_mode was not doing as much as it ought to do.
>
> Yeah, I just ran into a different reason to think that.  I tried marking
> MinMaxAggPaths in the same way as other relation-scan paths, ie
>
> pathnode->path.parallel_safe = rel->consider_parallel;
>
> but that did nothing because the just-created UPPERREL_GROUP_AGG rel
> didn't have its consider_parallel flag set yet.  Moreover, if I'd tried to
> fix that by invoking set_grouped_rel_consider_parallel() from planagg.c,
> it would still not work right because set_grouped_rel_consider_parallel()
> is hard-wired to consider only partial aggregation, which is not what's
> going on in a MinMaxAggPath.  I'm not sure about the best rewrite here,
> but I would urge you to make sure that consider_parallel on an upper rel
> reflects only properties inherent to the rel (eg, safeness of quals that
> must be applied there) and not properties of specific implementation
> methods.

Yeah, I think that David and I goofed this up when adding parallel
aggregation.  I just wasn't thinking clearly about the way upper rels
needed to work with consider_parallel.  If you would be willing to do
any sort of review of the WIP patch I posted just upthread, that would
help.

> Also, please make sure that whatever logic is involved remains
> factored out where it can be called by an extension that might want to
> create an upperrel sooner than the core code would do.

Again, please have a look at the patch and see if it seems unsuitable
to you for some reason.  I'm not confident of my ability to get all of
this path stuff right without a bit of help from the guy who designed
it.

> BTW, do we need wholePlanParallelSafe at all, and if so why?  Can't
> it be replaced by examining the parallel_safe flag on the selected
> topmost Path?

Oh, man.  I think you're right.  This is yet another piece of fallout
from upper planner pathification.  That was absolutely necessary
before, but now if we do the other bits right we don't need it any
more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> Oh, one other thing about this: I'm not going to try to defend
> whatever confusion between subplans and subqueries appears in that
> comment, but prior to upper planner pathification I think we were dead
> in the water here anyway, because the subquery was going to output a
> finished plan, not a list of paths.  Since finished plans aren't
> labeled as to parallel-safety, we'd have to conservatively assume that
> the finished plan we got back might not be parallel-safe.  Now that
> we're using the path representation throughout, we can do better.

Well, if that were still the state of affairs, we could certainly consider
adding a parallel_safe flag to Plans as well as Paths.  But as you say,
it's moot now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> One problem that I've realized that is related to this is that the way
> that the consider_parallel flag is being set for upper rels is almost
> totally bogus, which I believe accounts for your complaint at PGCon
> that force_parallel_mode was not doing as much as it ought to do.

Yeah, I just ran into a different reason to think that.  I tried marking
MinMaxAggPaths in the same way as other relation-scan paths, ie

pathnode->path.parallel_safe = rel->consider_parallel;

but that did nothing because the just-created UPPERREL_GROUP_AGG rel
didn't have its consider_parallel flag set yet.  Moreover, if I'd tried to
fix that by invoking set_grouped_rel_consider_parallel() from planagg.c,
it would still not work right because set_grouped_rel_consider_parallel()
is hard-wired to consider only partial aggregation, which is not what's
going on in a MinMaxAggPath.  I'm not sure about the best rewrite here,
but I would urge you to make sure that consider_parallel on an upper rel
reflects only properties inherent to the rel (eg, safeness of quals that
must be applied there) and not properties of specific implementation
methods.  Also, please make sure that whatever logic is involved remains
factored out where it can be called by an extension that might want to
create an upperrel sooner than the core code would do.

BTW, do we need wholePlanParallelSafe at all, and if so why?  Can't
it be replaced by examining the parallel_safe flag on the selected
topmost Path?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:27 PM, Robert Haas  wrote:
> On Mon, Jun 13, 2016 at 4:46 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> In practice, we don't yet have the ability for
>>> parallel-safe paths from subqueries to affect planning at higher query
>>> levels, but that's because the pathification stuff landed too late in
>>> the cycle for me to fully react to it.
>>
>> I wonder if that's not just from confusion between subplans and
>> subqueries.  I don't believe any of the claims made in the comment
>> adjusted in the patch below (other than "Subplans currently aren't passed
>> to workers", which is true but irrelevant).  Nested Gather nodes is
>> something that you would need, and already have, a defense for anyway.
>
> I think you may be correct.

Oh, one other thing about this: I'm not going to try to defend
whatever confusion between subplans and subqueries appears in that
comment, but prior to upper planner pathification I think we were dead
in the water here anyway, because the subquery was going to output a
finished plan, not a list of paths.  Since finished plans aren't
labeled as to parallel-safety, we'd have to conservatively assume that
the finished plan we got back might not be parallel-safe.  Now that
we're using the path representation throughout, we can do better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 4:46 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> In practice, we don't yet have the ability for
>> parallel-safe paths from subqueries to affect planning at higher query
>> levels, but that's because the pathification stuff landed too late in
>> the cycle for me to fully react to it.
>
> I wonder if that's not just from confusion between subplans and
> subqueries.  I don't believe any of the claims made in the comment
> adjusted in the patch below (other than "Subplans currently aren't passed
> to workers", which is true but irrelevant).  Nested Gather nodes is
> something that you would need, and already have, a defense for anyway.

I think you may be correct.

One problem that I've realized that is related to this is that the way
that the consider_parallel flag is being set for upper rels is almost
totally bogus, which I believe accounts for your complaint at PGCon
that force_parallel_mode was not doing as much as it ought to do.
When I originally wrote a lot of this logic, there were no upper rels,
which led to this code:

if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)
{
glob->parallelModeNeeded = false;
glob->wholePlanParallelSafe = false;/* either
false or don't care */
}
else
{
glob->parallelModeNeeded = true;
glob->wholePlanParallelSafe =
!has_parallel_hazard((Node *) parse, false);
}

The main way that wholePlanParallelSafe is supposed to be cleared is
in create_plan:

/* Update parallel safety information if needed. */
if (!best_path->parallel_safe)
root->glob->wholePlanParallelSafe = false;

However, at the time that code was written, it was impossible to rely
entirely on the latter mechanism.  Since there were no upper rels and
no paths for upper plan nodes, you could have the case where every
path was parallel-safe but the whole plan was node.  Therefore the
code shown above was needed to scan the whole darned plan for
parallel-unsafe things.  Post-pathification, this whole thing is
pretty bogus: upper rels mostly don't get consider_parallel set at
all, which means plans involving upper rels get marked parallel-unsafe
even if they are not, which means the wholePlanParallelSafe flag gets
cleared in a bunch of cases where it shouldn't.  And on the flip side
I think that the first chunk of code quoted above would be totally
unnecessary if we were actually setting consider_parallel correctly on
the upper rels.

I started working on a patch to fix all this, which I'm attaching here
so you can see what I'm thinking.  I am not sure it's correct, but it
does cause force_parallel_mode to do something interesting in many
more cases.

Anyway, the reason this is related to the issue at hand is that you
might have something like A LEFT JOIN (SELECT x, sum(q) FROM B GROUP
BY 1) ON A.x = B.x.   Right now, I think that the paths for the
aggregated subquery will always be marked as not parallel-safe, but
that's only because the consider_parallel flag on the grouped rel
isn't being set properly.  If it were, you could theoretically do a
parallel seq scan on A and then have each worker evaluate the join for
the rows that pop out of the subquery.  Maybe that's a stupid example,
but the point is that I bet there are cases where failing to mark the
upper rels with consider_parallel where appropriate causes good
parallel plans not to get generated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b1554cb..8232a64 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -107,9 +107,6 @@ static double get_number_of_groups(PlannerInfo *root,
 	 double path_rows,
 	 List *rollup_lists,
 	 List *rollup_groupclauses);
-static void set_grouped_rel_consider_parallel(PlannerInfo *root,
-	 RelOptInfo *grouped_rel,
-	 PathTarget *target);
 static Size estimate_hashagg_tablesize(Path *path, AggClauseCosts *agg_costs,
 	 double dNumGroups);
 static RelOptInfo *create_grouping_paths(PlannerInfo *root,
@@ -272,8 +269,7 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
 	else
 	{
 		glob->parallelModeNeeded = true;
-		glob->wholePlanParallelSafe =
-			!has_parallel_hazard((Node *) parse, false);
+		glob->wholePlanParallelSafe = true;
 	}
 
 	/* Determine what fraction of the plan is likely to be scanned */
@@ -1878,6 +1874,17 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
 	 */
 	final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
 
+	/*
+	 * We can just inherit the current_rel's consider_parallel flag; nothing
+	 * that happens here can switch us from being parallel-safe path to
+	 * being parallel-restricted.  (If 

Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> In practice, we don't yet have the ability for
> parallel-safe paths from subqueries to affect planning at higher query
> levels, but that's because the pathification stuff landed too late in
> the cycle for me to fully react to it.

I wonder if that's not just from confusion between subplans and
subqueries.  I don't believe any of the claims made in the comment
adjusted in the patch below (other than "Subplans currently aren't passed
to workers", which is true but irrelevant).  Nested Gather nodes is
something that you would need, and already have, a defense for anyway.

regards, tom lane


diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index cc8ba61..8ab049c 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** set_rel_consider_parallel(PlannerInfo *r
*** 560,574 
  		case RTE_SUBQUERY:
  
  			/*
! 			 * Subplans currently aren't passed to workers.  Even if they
! 			 * were, the subplan might be using parallelism internally, and we
! 			 * can't support nested Gather nodes at present.  Finally, we
! 			 * don't have a good way of knowing whether the subplan involves
! 			 * any parallel-restricted operations.  It would be nice to relax
! 			 * this restriction some day, but it's going to take a fair amount
! 			 * of work.
  			 */
! 			return;
  
  		case RTE_JOIN:
  			/* Shouldn't happen; we're only considering baserels here. */
--- 560,574 
  		case RTE_SUBQUERY:
  
  			/*
! 			 * If the subquery doesn't have anything parallel-restricted, we
! 			 * can consider parallel scans.  Note that this does not mean that
! 			 * all (or even any) of the paths produced for the subquery will
! 			 * actually be parallel-safe; but that's true for paths produced
! 			 * for regular tables, too.
  			 */
! 			if (has_parallel_hazard((Node *) rte->subquery, false))
! return;
! 			break;
  
  		case RTE_JOIN:
  			/* Shouldn't happen; we're only considering baserels here. */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 3:42 PM, Tom Lane  wrote:
>> I would not be surprised at a change to a parallel-query plan, but there's
>> no parallelism here, so what happened?  This looks like a bug to me.
>> (Also, doing this query without COSTS OFF shows that the newly selected
>> plan actually has a greater estimated cost than the expected plan, which
>> makes it definitely a bug.)
>
> I looked into this and found that the costs are considered fuzzily the
> same, and then add_path prefers the slightly-worse path on the grounds
> that it is marked parallel_safe while the MinMaxAgg path is not.  It seems
> to me that there is some fuzzy thinking going on there.  On exactly what
> grounds is a path to be preferred merely because it is parallel safe, and
> not actually parallelized?

A parallel-safe plan can be joined to a partial path at a higher level
of the plan tree to form a new partial path.  A non-parallel-safe path
cannot.  Therefore, if two paths are equally good, the one which is
parallel-safe is more useful (just as a sorted path which is slightly
more expensive than an unsorted path is worth keeping around because
it is ordered).  In practice, we don't yet have the ability for
parallel-safe paths from subqueries to affect planning at higher query
levels, but that's because the pathification stuff landed too late in
the cycle for me to fully react to it.

> Or perhaps the question to ask is whether a
> MinMaxAgg path can be marked parallel-safe.

That is a good question.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
I wrote:
> ... there was also an unexplainable plan change:

> *** /home/postgres/pgsql/src/test/regress/expected/aggregates.out Thu Apr 
>  7 21:13:14 2016
> --- /home/postgres/pgsql/src/test/regress/results/aggregates.out  Mon Jun 
> 13 11:54:01 2016
> ***
> *** 577,590 
  
>   explain (costs off)
> select max(unique1) from tenk1 where unique1 > 42000;
> ! QUERY PLAN 
> ! ---
> !  Result
> !InitPlan 1 (returns $0)
> !  ->  Limit
> !->  Index Only Scan Backward using tenk1_unique1 on tenk1
> !  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
> ! (5 rows)
  
>   select max(unique1) from tenk1 where unique1 > 42000;
>max 
> --- 577,588 
  
>   explain (costs off)
> select max(unique1) from tenk1 where unique1 > 42000;
> !  QUERY PLAN 
> ! 
> !  Aggregate
> !->  Index Only Scan using tenk1_unique1 on tenk1
> !  Index Cond: (unique1 > 42000)
> ! (3 rows)
  
>   select max(unique1) from tenk1 where unique1 > 42000;
>max 

> I would not be surprised at a change to a parallel-query plan, but there's
> no parallelism here, so what happened?  This looks like a bug to me.
> (Also, doing this query without COSTS OFF shows that the newly selected
> plan actually has a greater estimated cost than the expected plan, which
> makes it definitely a bug.)

I looked into this and found that the costs are considered fuzzily the
same, and then add_path prefers the slightly-worse path on the grounds
that it is marked parallel_safe while the MinMaxAgg path is not.  It seems
to me that there is some fuzzy thinking going on there.  On exactly what
grounds is a path to be preferred merely because it is parallel safe, and
not actually parallelized?  Or perhaps the question to ask is whether a
MinMaxAgg path can be marked parallel-safe.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 1:06 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
>>> BTW, decent regression tests could be written without the need to create
>>> enormous tables if the minimum rel size in create_plain_partial_paths()
>>> could be configured to something less than 1000 blocks.  I think it's
>>> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
>>> we make it a GUC?
>
>> That was proposed before, and I didn't do it mostly because I couldn't
>> think of a name for it that didn't sound unbelievably corny.
>
> min_parallel_relation_size, or min_parallelizable_relation_size, or
> something like that?

Sure.

>> Also,
>> the whole way that algorithm works is kind of a hack and probably
>> needs to be overhauled entirely in some future release.  I'm worried
>> about having the words "backward compatibility" thrown in my face when
>> it's time to improve this logic.  But aside from those two issues I'm
>> OK with exposing a knob.
>
> I agree it's a hack, and I don't want to expose anything about the
> number-of-workers scaling behavior, for precisely that reason.  But a
> threshold on the size of a table to consider parallel scans for at all
> doesn't seem unreasonable.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila 
wrote:
> >> In create_grouping_paths(), we are building partial_grouping_path and
same
> >> is used for gather path and other grouping paths (for partial paths).
> >> However, we don't use it for partial path list and sort path due to
which
> >> path target for Sort path is different from what we have expected.  Is
there
> >> a problem in applying  partial_grouping_path for partial pathlist?
> >> Attached patch just does that and I don't see error with patch.
>
> > It doesn't seem like a good idea to destructive modify
> > input_rel->partial_pathlist.  Applying the projection to each path
> > before using it would probably be better.
>
> I think the real question here is why the code removed by 04ae11f62
> was wrong.  It was unsafe to use apply_projection_to_path, certainly,
> but using create_projection_path directly would have avoided the
> stated problem.  And it's very unclear that this new patch doesn't
> bring back that bug in a different place.
>

This new patch still doesn't seem to be right, but it won't bring back the
original problem because apply_projection_to_path will be only done if
grouped_rel is parallel_safe which means it doesn't have any
parallel-unsafe or parallel-restricted clause in quals or target list.

> I am not very happy that neither 04ae11f62 nor this patch include
> any regression test case proving that a problem existed and has
> been fixed.
>

It is slightly tricky to write a reproducible parallel-query test, but
point taken and I think we should try to have a test unless such a test is
really time consuming.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 7:17 PM, Robert Haas  wrote:
>
> On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila 
wrote:
> > In create_grouping_paths(), we are building partial_grouping_path and
same
> > is used for gather path and other grouping paths (for partial paths).
> > However, we don't use it for partial path list and sort path due to
which
> > path target for Sort path is different from what we have expected.  Is
there
> > a problem in applying  partial_grouping_path for partial pathlist?
> > Attached patch just does that and I don't see error with patch.
>
> It doesn't seem like a good idea to destructive modify
> input_rel->partial_pathlist.  Applying the projection to each path
> before using it would probably be better.
>

Do you mean to have it when we generate a complete GroupAgg Path atop of
the cheapest partial path?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
>> BTW, decent regression tests could be written without the need to create
>> enormous tables if the minimum rel size in create_plain_partial_paths()
>> could be configured to something less than 1000 blocks.  I think it's
>> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
>> we make it a GUC?

> That was proposed before, and I didn't do it mostly because I couldn't
> think of a name for it that didn't sound unbelievably corny.

min_parallel_relation_size, or min_parallelizable_relation_size, or
something like that?

> Also,
> the whole way that algorithm works is kind of a hack and probably
> needs to be overhauled entirely in some future release.  I'm worried
> about having the words "backward compatibility" thrown in my face when
> it's time to improve this logic.  But aside from those two issues I'm
> OK with exposing a knob.

I agree it's a hack, and I don't want to expose anything about the
number-of-workers scaling behavior, for precisely that reason.  But a
threshold on the size of a table to consider parallel scans for at all
doesn't seem unreasonable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> It is slightly tricky to write a reproducible parallel-query test, but
>> point taken and I think we should try to have a test unless such a test is
>> really time consuming.
>
> BTW, decent regression tests could be written without the need to create
> enormous tables if the minimum rel size in create_plain_partial_paths()
> could be configured to something less than 1000 blocks.  I think it's
> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> we make it a GUC?

That was proposed before, and I didn't do it mostly because I couldn't
think of a name for it that didn't sound unbelievably corny.  Also,
the whole way that algorithm works is kind of a hack and probably
needs to be overhauled entirely in some future release.  I'm worried
about having the words "backward compatibility" thrown in my face when
it's time to improve this logic.  But aside from those two issues I'm
OK with exposing a knob.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
I wrote:
> Amit Kapila  writes:
>> It is slightly tricky to write a reproducible parallel-query test, but
>> point taken and I think we should try to have a test unless such a test is
>> really time consuming.

> BTW, decent regression tests could be written without the need to create
> enormous tables if the minimum rel size in create_plain_partial_paths()
> could be configured to something less than 1000 blocks.  I think it's
> fairly crazy that that arbitrary constant is hard-wired anyway.  Should
> we make it a GUC?

Just as an experiment to see what would happen, I did

-   int parallel_threshold = 1000;
+   int parallel_threshold = 1;

and ran the regression tests.  I got a core dump in the window.sql test:

Program terminated with signal 11, Segmentation fault.
#0  0x00664dbc in make_partialgroup_input_target (root=0x1795018, 
input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, 
rollup_groupclauses=0x0) at planner.c:4307
4307Index   sgref = final_target->sortgrouprefs[i];
(gdb) bt
#0  0x00664dbc in make_partialgroup_input_target (root=0x1795018, 
input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, 
rollup_groupclauses=0x0) at planner.c:4307
#1  create_grouping_paths (root=0x1795018, input_rel=0x17957a8, 
target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0)
at planner.c:3420
#2  0x00667405 in grouping_planner (root=0x1795018, 
inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
#3  0x00668c80 in subquery_planner (glob=, 
parse=0x1703580, parent_root=, 
hasRecursion=, tuple_fraction=0) at planner.c:769
#4  0x00668ea5 in standard_planner (parse=0x1703580, 
cursorOptions=256, boundParams=0x0) at planner.c:308
#5  0x006691b6 in planner (parse=, 
cursorOptions=, boundParams=)
at planner.c:178
#6  0x006fb069 in pg_plan_query (querytree=0x1703580, 
cursorOptions=256, boundParams=0x0) at postgres.c:798
(gdb) p debug_query_string
$1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"

which I think may be another manifestation of the failure-to-apply-proper-
pathtarget issue we're looking at in this thread.  Or maybe it's just
an unjustified assumption in make_partialgroup_input_target that the
input path must always have some sortgrouprefs assigned.

Before getting to that point, there was also an unexplainable plan change:

*** /home/postgres/pgsql/src/test/regress/expected/aggregates.out   Thu Apr 
 7 21:13:14 2016
--- /home/postgres/pgsql/src/test/regress/results/aggregates.outMon Jun 
13 11:54:01 2016
***
*** 577,590 
  
  explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN 
! ---
!  Result
!InitPlan 1 (returns $0)
!  ->  Limit
!->  Index Only Scan Backward using tenk1_unique1 on tenk1
!  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)
  
  select max(unique1) from tenk1 where unique1 > 42000;
   max 
--- 577,588 
  
  explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
!  QUERY PLAN 
! 
!  Aggregate
!->  Index Only Scan using tenk1_unique1 on tenk1
!  Index Cond: (unique1 > 42000)
! (3 rows)
  
  select max(unique1) from tenk1 where unique1 > 42000;
   max 

I would not be surprised at a change to a parallel-query plan, but there's
no parallelism here, so what happened?  This looks like a bug to me.
(Also, doing this query without COSTS OFF shows that the newly selected
plan actually has a greater estimated cost than the expected plan, which
makes it definitely a bug.)

At this point I'm pretty firmly convinced that we should have a way to
run the regression tests with parallel scans considered for even very
small tables.  If someone doesn't want that way to be a GUC, you'd better
propose another solution.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Amit Kapila  writes:
> It is slightly tricky to write a reproducible parallel-query test, but
> point taken and I think we should try to have a test unless such a test is
> really time consuming.

BTW, decent regression tests could be written without the need to create
enormous tables if the minimum rel size in create_plain_partial_paths()
could be configured to something less than 1000 blocks.  I think it's
fairly crazy that that arbitrary constant is hard-wired anyway.  Should
we make it a GUC?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane  wrote:
>> I think the real question here is why the code removed by 04ae11f62
>> was wrong.  It was unsafe to use apply_projection_to_path, certainly,
>> but using create_projection_path directly would have avoided the
>> stated problem.  And it's very unclear that this new patch doesn't
>> bring back that bug in a different place.

> This new patch still doesn't seem to be right, but it won't bring back the
> original problem because apply_projection_to_path will be only done if
> grouped_rel is parallel_safe which means it doesn't have any
> parallel-unsafe or parallel-restricted clause in quals or target list.

The problem cited in 04ae11f62's commit message is that
apply_projection_to_path would overwrite the paths' pathtargets in-place,
causing problems if they'd been used for other purposes elsewhere.  I do
not share your confidence that using apply_projection_to_path within
create_grouping_paths is free of such a hazard.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila  wrote:
>> In create_grouping_paths(), we are building partial_grouping_path and same
>> is used for gather path and other grouping paths (for partial paths).
>> However, we don't use it for partial path list and sort path due to which
>> path target for Sort path is different from what we have expected.  Is there
>> a problem in applying  partial_grouping_path for partial pathlist?
>> Attached patch just does that and I don't see error with patch.

> It doesn't seem like a good idea to destructive modify
> input_rel->partial_pathlist.  Applying the projection to each path
> before using it would probably be better.

I think the real question here is why the code removed by 04ae11f62
was wrong.  It was unsafe to use apply_projection_to_path, certainly,
but using create_projection_path directly would have avoided the
stated problem.  And it's very unclear that this new patch doesn't
bring back that bug in a different place.

I am not very happy that neither 04ae11f62 nor this patch include
any regression test case proving that a problem existed and has
been fixed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 3:18 AM, Amit Kapila  wrote:
> In create_grouping_paths(), we are building partial_grouping_path and same
> is used for gather path and other grouping paths (for partial paths).
> However, we don't use it for partial path list and sort path due to which
> path target for Sort path is different from what we have expected.  Is there
> a problem in applying  partial_grouping_path for partial pathlist?
> Attached patch just does that and I don't see error with patch.

It doesn't seem like a good idea to destructive modify
input_rel->partial_pathlist.  Applying the projection to each path
before using it would probably be better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tatsuro Yamada

Hi,

I applied your patch and run tpc-h.
Then I got new errors on Q4,12,17 and 19.

ERROR:  Aggref found in non-Agg plan node.
See bellow,

--
postgres=# \i queries/4.explain.sql
ERROR:  Aggref found in non-Agg plan node
STATEMENT:  explain analyze select
o_orderpriority,
count(*) as order_count
from
orders
where
o_orderdate >= date '1993-10-01'
and o_orderdate < date '1993-10-01' + interval '3' month
and exists (
select
*
from
lineitem
where
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
group by
o_orderpriority
order by
o_orderpriority
LIMIT 1;
--

Regards,
Tatsuro Yamada
NTT OSS Center


On 2016/06/13 16:18, Amit Kapila wrote:

On Mon, Jun 13, 2016 at 11:05 AM, David Rowley > wrote:
 >
 > On 13 June 2016 at 15:39, Thomas Munro > wrote:
 > > What is going on here?
 >
 > ...
 >
 > >
 > > postgres=# set max_parallel_workers_per_gather = 2;
 > > SET
 > > postgres=# explain select length(data) from logs group by length(data);
 > > ERROR:  ORDER/GROUP BY expression not found in targetlist
 >
 > Seems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
 >

In create_grouping_paths(), we are building partial_grouping_path and same is 
used for gather path and other grouping paths (for partial paths). However, we 
don't use it for partial path list and sort path due to which path target for 
Sort path is different from what we have expected.  Is there a problem in 
applying  partial_grouping_path for partial pathlist?   Attached patch just 
does that and I don't see error with patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com 








--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Tatsuro Yamada

On 2016/06/13 15:52, Michael Paquier wrote:

On Mon, Jun 13, 2016 at 2:42 PM, Tatsuro Yamada
 wrote:

I got mistake to write an e-mail to -hackers on last week. :-<
I should have written this.

   The bug has not fixed by Tom Lane's patch: commit aeb9ae6.
   Because I got the same error using tpc-h.


This looks like a different regression..


I understand it now, thanks. :-)



I checked the list, but the bug is not listed.
   https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items


And the winner is:

04ae11f62e643e07c411c4935ea6af46cb112aa9 is the first bad commit
commit 04ae11f62e643e07c411c4935ea6af46cb112aa9
Author: Robert Haas 
Date:   Fri Jun 3 14:27:33 2016 -0400

I am adding that to the list of open items.


Oh...
I'll try to run tpc-h if I got a new patch which fixes the bug. :)


Thanks,
Tatsuro Yamada
NTT OSS Center





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Amit Kapila
On Mon, Jun 13, 2016 at 11:05 AM, David Rowley 
wrote:
>
> On 13 June 2016 at 15:39, Thomas Munro 
wrote:
> > What is going on here?
>
> ...
>
> >
> > postgres=# set max_parallel_workers_per_gather = 2;
> > SET
> > postgres=# explain select length(data) from logs group by length(data);
> > ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> Seems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9
>

In create_grouping_paths(), we are building partial_grouping_path and same
is used for gather path and other grouping paths (for partial paths).
However, we don't use it for partial path list and sort path due to which
path target for Sort path is different from what we have expected.  Is
there a problem in applying  partial_grouping_path for partial pathlist?
Attached patch just does that and I don't see error with patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


apply_partial_pathtarget_partial_pathlist_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-13 Thread Michael Paquier
On Mon, Jun 13, 2016 at 2:42 PM, Tatsuro Yamada
 wrote:
> I got mistake to write an e-mail to -hackers on last week. :-<
> I should have written this.
>
>   The bug has not fixed by Tom Lane's patch: commit aeb9ae6.
>   Because I got the same error using tpc-h.

This looks like a different regression..

>>> Today, I try it again by changing max_parallel_workers_per_gather
>>> parameter.
>>> The result of Q1 is bellow. Is this bug in the Open items on wiki?
>>
>> I don't see it on the Open Issues list.
>
> I checked the list, but the bug is not listed.
>   https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items

And the winner is:

04ae11f62e643e07c411c4935ea6af46cb112aa9 is the first bad commit
commit 04ae11f62e643e07c411c4935ea6af46cb112aa9
Author: Robert Haas 
Date:   Fri Jun 3 14:27:33 2016 -0400

I am adding that to the list of open items.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-12 Thread Tatsuro Yamada

Hi,


Subject: Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist
Date: Thu, 09 Jun 2016 12:08:01 +0900


Right, I saw that thread which involved the same error message:

https://www.postgresql.org/message-id/flat/20160526021235.w4nq7k3gnheg7vit%40alap3.anarazel.de#20160526021235.w4nq7k3gnheg7...@alap3.anarazel.de

... but that seems to be a different problem than the one you and I
have seen, it involved repeated references to columns in the tlist.
It was fixed with this commit:

commit aeb9ae6457865c8949641d71a9523374d843a418
Author: Tom Lane 
Date:   Thu May 26 14:52:24 2016 -0400

 Disable physical tlist if any Var would need multiple sortgroupref labels.


I use this version:f721e94 to run tpc-h on last week.
This patch is commited at Jun 8. If it fixed, I didn't get the error.

>PG96beta1
>  commit: f721e94b5f360391fc3ffe183bf697a0441e9184

-
commit f721e94b5f360391fc3ffe183bf697a0441e9184
Author: Robert Haas 
Date:   Wed Jun 8 08:37:06 2016 -0400

Fix typo.

Amit Langote
-

I got mistake to write an e-mail to -hackers on last week. :-<
I should have written this.

  The bug has not fixed by Tom Lane's patch: commit aeb9ae6.
  Because I got the same error using tpc-h.



Today, I try it again by changing max_parallel_workers_per_gather parameter.
The result of Q1 is bellow. Is this bug in the Open items on wiki?


I don't see it on the Open Issues list.


I checked the list, but the bug is not listed.
  https://wiki.postgresql.org/wiki/PostgreSQL_9.6_Open_Items


Regards,
Tatsuro Yamada
NTT OSS Center






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-12 Thread David Rowley
On 13 June 2016 at 15:39, Thomas Munro  wrote:
> What is going on here?

...

>
> postgres=# set max_parallel_workers_per_gather = 2;
> SET
> postgres=# explain select length(data) from logs group by length(data);
> ERROR:  ORDER/GROUP BY expression not found in targetlist

Seems like this was caused by 04ae11f62e643e07c411c4935ea6af46cb112aa9

I missed the discussion on this commit, so I'll go look for that now.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-12 Thread Thomas Munro
On Mon, Jun 13, 2016 at 4:16 PM, Tatsuro Yamada
 wrote:
> I tried to run tpc-h queries, but some queries failed by the error on last
> week.
>
>>Subject: Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist
>>Date: Thu, 09 Jun 2016 12:08:01 +0900

Right, I saw that thread which involved the same error message:

https://www.postgresql.org/message-id/flat/20160526021235.w4nq7k3gnheg7vit%40alap3.anarazel.de#20160526021235.w4nq7k3gnheg7...@alap3.anarazel.de

... but that seems to be a different problem than the one you and I
have seen, it involved repeated references to columns in the tlist.
It was fixed with this commit:

commit aeb9ae6457865c8949641d71a9523374d843a418
Author: Tom Lane 
Date:   Thu May 26 14:52:24 2016 -0400

Disable physical tlist if any Var would need multiple sortgroupref labels.

> Today, I try it again by changing max_parallel_workers_per_gather parameter.
> The result of Q1 is bellow. Is this bug in the Open items on wiki?

I don't see it on the Open Issues list.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-12 Thread Tatsuro Yamada

Hi,

I tried to run tpc-h queries, but some queries failed by the error on last week.

>Subject: Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist
>Date: Thu, 09 Jun 2016 12:08:01 +0900

Today, I try it again by changing max_parallel_workers_per_gather parameter.
The result of Q1 is bellow. Is this bug in the Open items on wiki?

-
postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# \i queries/1.explain.sql
 QUERY PLAN
-
 Limit  (cost=43474.03..43474.03 rows=1 width=236) (actual 
time=1039.583..1039.583 rows=1 loops=1)
   ->  Sort  (cost=43474.03..43474.04 rows=6 width=236) (actual 
time=1039.583..1039.583 rows=1 loops=1)
 Sort Key: l_returnflag, l_linestatus
 Sort Method: top-N heapsort  Memory: 25kB
 ->  HashAggregate  (cost=43473.83..43474.00 rows=6 width=236) (actual 
time=1039.529..1039.534 rows=4 loops=1)
   Group Key: l_returnflag, l_linestatus
   ->  Seq Scan on lineitem  (cost=0.00..19668.15 rows=595142 
width=25) (actual time=0.048..125.332 rows=595224 loops=1)
 Filter: (l_shipdate <= '1998-09-22 00:00:00'::timestamp 
without time zone)
 Rows Removed by Filter: 5348
 Planning time: 0.180 ms
 Execution time: 1039.758 ms
(11 rows)

postgres=# set max_parallel_workers_per_gather = default;
SET
postgres=# \i queries/1.explain.sql
ERROR:  ORDER/GROUP BY expression not found in targetlist

-

Regards,
Tatsuro Yamada
NTT OSS Center


On 2016/06/13 12:39, Thomas Munro wrote:

Hi,

What is going on here?

postgres=# create table logs as select generate_series(1,
100)::text as data;
SELECT 100
postgres=# insert into logs select * from logs;
INSERT 0 100
postgres=# insert into logs select * from logs;
INSERT 0 200
postgres=# insert into logs select * from logs;
INSERT 0 400
postgres=# insert into logs select * from logs;
INSERT 0 800
postgres=# insert into logs select * from logs;
INSERT 0 1600
postgres=# analyze logs;
ANALYZE
postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain select length(data) from logs group by length(data);
┌┐
│ QUERY PLAN │
├┤
│ Group  (cost=5843157.07..6005642.13 rows=993989 width=4)   │
│   Group Key: (length(data))│
│   ->  Sort  (cost=5843157.07..5923157.11 rows=3218 width=4)│
│ Sort Key: (length(data))   │
│ ->  Seq Scan on logs  (cost=0.00..541593.22 rows=3218 width=4) │
└┘
(5 rows)

postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR:  ORDER/GROUP BY expression not found in targetlist






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist

2016-06-12 Thread Thomas Munro
Hi,

What is going on here?

postgres=# create table logs as select generate_series(1,
100)::text as data;
SELECT 100
postgres=# insert into logs select * from logs;
INSERT 0 100
postgres=# insert into logs select * from logs;
INSERT 0 200
postgres=# insert into logs select * from logs;
INSERT 0 400
postgres=# insert into logs select * from logs;
INSERT 0 800
postgres=# insert into logs select * from logs;
INSERT 0 1600
postgres=# analyze logs;
ANALYZE
postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain select length(data) from logs group by length(data);
┌┐
│ QUERY PLAN │
├┤
│ Group  (cost=5843157.07..6005642.13 rows=993989 width=4)   │
│   Group Key: (length(data))│
│   ->  Sort  (cost=5843157.07..5923157.11 rows=3218 width=4)│
│ Sort Key: (length(data))   │
│ ->  Seq Scan on logs  (cost=0.00..541593.22 rows=3218 width=4) │
└┘
(5 rows)

postgres=# set max_parallel_workers_per_gather = 2;
SET
postgres=# explain select length(data) from logs group by length(data);
ERROR:  ORDER/GROUP BY expression not found in targetlist

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers