Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Tom Lane
Andrew Gierth  writes:
> "Tom" == Tom Lane  writes:
>  Tom> I think it's a bit too stupid as-is, though.  We don't need to
>  Tom> recalculate for Params in aggdirectargs, do we?

> In theory we would need to.

How come?  Those are only passed to the final function, no?  They
shouldn't affect what is in the hash table.

> But in practice we don't, because we don't
> allow ordered aggs in AGG_HASHED mode anyway.

True, it's moot at the moment.

> We could skip filling in aggParam at all if not in AGG_HASHED mode I
> guess.

Yeah, I'm planning to make it do that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> Hm, I was just working on inserting something of the sort into
 Tom> ExecInitAgg.  But I guess we could do it in the planner too.  Will
 Tom> run with your approach.

 Tom> I think it's a bit too stupid as-is, though.  We don't need to
 Tom> recalculate for Params in aggdirectargs, do we?

In theory we would need to. But in practice we don't, because we don't
allow ordered aggs in AGG_HASHED mode anyway.

We could skip filling in aggParam at all if not in AGG_HASHED mode I
guess.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Tom Lane
Andrew Gierth  writes:
> How about:

Hm, I was just working on inserting something of the sort into
ExecInitAgg.  But I guess we could do it in the planner too.
Will run with your approach.

I think it's a bit too stupid as-is, though.  We don't need to
recalculate for Params in aggdirectargs, do we?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Andrew Gierth
> "Pavel" == Pavel Stehule  writes:

 Pavel> The result should not depend on GUC - hashagg on/off changing
 Pavel> output - it is error.

I don't think anyone's suggesting leaving it unfixed, just whether the
fix should introduce unnecessary rescans of the aggregate input.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:
> "Tom" == Tom Lane  writes:

 Tom> I'm not sure if it's worth trying to distinguish whether the Param
 Tom> is inside any aggregate calls or not.

How about:

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 1ec2515..4a9fefb 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -3426,10 +3426,11 @@ ExecReScanAgg(AggState *node)
 
 		/*
 		 * If we do have the hash table and the subplan does not have any
-		 * parameter changes, then we can just rescan the existing hash table;
-		 * no need to build it again.
+		 * parameter changes, we might still need to rebuild the hash if any of
+		 * the parameter changes affect the input to aggregate functions.
 		 */
-		if (outerPlan->chgParam == NULL)
+		if (outerPlan->chgParam == NULL
+			&& !bms_overlap(node->ss.ps.chgParam, aggnode->aggParam))
 		{
 			ResetTupleHashIterator(node->hashtable, >hashiter);
 			return;
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index c7a0644..7b5eb86 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -871,6 +871,7 @@ _copyAgg(const Agg *from)
 	COPY_SCALAR_FIELD(aggstrategy);
 	COPY_SCALAR_FIELD(aggsplit);
 	COPY_SCALAR_FIELD(numCols);
+	COPY_BITMAPSET_FIELD(aggParam);
 	if (from->numCols > 0)
 	{
 		COPY_POINTER_FIELD(grpColIdx, from->numCols * sizeof(AttrNumber));
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 1fab807..5e48edd 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -706,6 +706,7 @@ _outAgg(StringInfo str, const Agg *node)
 	WRITE_ENUM_FIELD(aggstrategy, AggStrategy);
 	WRITE_ENUM_FIELD(aggsplit, AggSplit);
 	WRITE_INT_FIELD(numCols);
+	WRITE_BITMAPSET_FIELD(aggParam);
 
 	appendStringInfoString(str, " :grpColIdx");
 	for (i = 0; i < node->numCols; i++)
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index c83063e..67dcf17 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -2004,6 +2004,7 @@ _readAgg(void)
 	READ_ENUM_FIELD(aggstrategy, AggStrategy);
 	READ_ENUM_FIELD(aggsplit, AggSplit);
 	READ_INT_FIELD(numCols);
+	READ_BITMAPSET_FIELD(aggParam);
 	READ_ATTRNUMBER_ARRAY(grpColIdx, local_node->numCols);
 	READ_OID_ARRAY(grpOperators, local_node->numCols);
 	READ_LONG_FIELD(numGroups);
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index a46cc10..2e56484 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -82,6 +82,7 @@ static Bitmapset *finalize_plan(PlannerInfo *root,
 			  Bitmapset *valid_params,
 			  Bitmapset *scan_params);
 static bool finalize_primnode(Node *node, finalize_primnode_context *context);
+static bool finalize_agg_primnode(Node *node, finalize_primnode_context *context);
 
 
 /*
@@ -2659,8 +2660,30 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
 			  );
 			break;
 
-		case T_Hash:
 		case T_Agg:
+			{
+Agg		   *agg = (Agg *) plan;
+finalize_primnode_context aggcontext;
+
+/*
+ * We need to know which params are referenced in inputs to
+ * aggregate calls, so that we know whether we need to rebuild
+ * the hashtable in the AGG_HASHED case. Rescan the tlist and
+ * qual for this purpose.
+ */
+
+aggcontext = context;
+aggcontext.paramids = NULL;
+
+finalize_agg_primnode((Node *) agg->plan.targetlist, );
+finalize_agg_primnode((Node *) agg->plan.qual, );
+
+/* remember results for execution */
+agg->aggParam = aggcontext.paramids;
+			}
+			break;
+
+		case T_Hash:
 		case T_Material:
 		case T_Sort:
 		case T_Unique:
@@ -2812,6 +2835,24 @@ finalize_primnode(Node *node, finalize_primnode_context *context)
 }
 
 /*
+ * finalize_agg_primnode: add IDs of all PARAM_EXEC params appearing inside
+ * Aggref nodes in the given expression tree to the result set.
+ */
+static bool
+finalize_agg_primnode(Node *node, finalize_primnode_context *context)
+{
+	if (node == NULL)
+		return false;
+	if (IsA(node, Aggref))
+	{
+		finalize_primnode(node, context);
+		return false;			/* no more to do here */
+	}
+	return expression_tree_walker(node, finalize_agg_primnode,
+  (void *) context);
+}
+
+/*
  * SS_make_initplan_output_param - make a Param for an initPlan's output
  *
  * The plan is expected to return a scalar value of the given type/collation.
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 369179f..3d5e4df 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -712,6 +712,7 @@ typedef struct Agg
 	AggStrategy aggstrategy;	/* basic strategy, see nodes.h */
 	AggSplit	aggsplit;		/* agg-splitting mode, see nodes.h */
 	int			numCols;		/* number of grouping columns */
+	Bitmapset  *aggParam;		/* params 

Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> I'm not sure if it's worth trying to distinguish whether the Param
 Tom> is inside any aggregate calls or not.  The existing code gets the
 Tom> right answer for

 Tom> select array(select x+sum(y) from generate_series(1,3) y group by y) 
 Tom>   from generate_series(1,3) x;

 Tom> and we'd be losing some efficiency for cases like that if we fix
 Tom> it as above.  But is it worth the trouble?

The loss of efficiency could be significant, since it's forcing a rescan
of what could be an expensive plan.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Pavel Stehule
2016-08-24 17:08 GMT+02:00 Tom Lane :

> Andrew Gierth  writes:
> > Something is wrong with the way chgParam is being handled in Agg nodes.
> > The code in ExecReScanAgg seems to assume that if the lefttree doesn't
> > have any parameter changes then it suffices to re-project the data from
> > the existing hashtable; but of course this is nonsense if the parameter
> > is in an input to an aggregate function.
>
> It looks like it's sufficient to do this:
>
> diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/
> nodeAgg.c
> index 1ec2515..f468fad 100644
> *** a/src/backend/executor/nodeAgg.c
> --- b/src/backend/executor/nodeAgg.c
> *** ExecReScanAgg(AggState *node)
> *** 3425,3435 
> return;
>
> /*
> !* If we do have the hash table and the subplan does not
> have any
> !* parameter changes, then we can just rescan the existing
> hash table;
> !* no need to build it again.
>  */
> !   if (outerPlan->chgParam == NULL)
> {
> ResetTupleHashIterator(node->hashtable,
> >hashiter);
> return;
> --- 3425,3436 
> return;
>
> /*
> !* If we do have the hash table and there are no relevant
> parameter
> !* changes, then we can just rescan the existing hash
> table; no need
> !* to build it again.
>  */
> !   if (node->ss.ps.chgParam == NULL &&
> !   outerPlan->chgParam == NULL)
> {
> ResetTupleHashIterator(node->hashtable,
> >hashiter);
> return;
>
>
> I'm not sure if it's worth trying to distinguish whether the Param is
> inside any aggregate calls or not.  The existing code gets the right
> answer for
>
> select array(select x+sum(y) from generate_series(1,3) y group by y)
>   from generate_series(1,3) x;
>
> and we'd be losing some efficiency for cases like that if we fix
> it as above.  But is it worth the trouble?
>
>
The result should not depend on GUC - hashagg on/off changing output - it
is error.

Regards

Pavel


> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Tom Lane
Andrew Gierth  writes:
> Something is wrong with the way chgParam is being handled in Agg nodes.
> The code in ExecReScanAgg seems to assume that if the lefttree doesn't
> have any parameter changes then it suffices to re-project the data from
> the existing hashtable; but of course this is nonsense if the parameter
> is in an input to an aggregate function.

It looks like it's sufficient to do this:

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 1ec2515..f468fad 100644
*** a/src/backend/executor/nodeAgg.c
--- b/src/backend/executor/nodeAgg.c
*** ExecReScanAgg(AggState *node)
*** 3425,3435 
return;
  
/*
!* If we do have the hash table and the subplan does not have 
any
!* parameter changes, then we can just rescan the existing hash 
table;
!* no need to build it again.
 */
!   if (outerPlan->chgParam == NULL)
{
ResetTupleHashIterator(node->hashtable, 
>hashiter);
return;
--- 3425,3436 
return;
  
/*
!* If we do have the hash table and there are no relevant 
parameter
!* changes, then we can just rescan the existing hash table; no 
need
!* to build it again.
 */
!   if (node->ss.ps.chgParam == NULL &&
!   outerPlan->chgParam == NULL)
{
ResetTupleHashIterator(node->hashtable, 
>hashiter);
return;


I'm not sure if it's worth trying to distinguish whether the Param is
inside any aggregate calls or not.  The existing code gets the right
answer for

select array(select x+sum(y) from generate_series(1,3) y group by y) 
  from generate_series(1,3) x;

and we'd be losing some efficiency for cases like that if we fix
it as above.  But is it worth the trouble?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:
> "Jeevan" == Jeevan Chalke  writes:

 Jeevan> Hi,
 Jeevan> While playing with LATERAL along with some aggregates in
 Jeevan> sub-query, I have observed somewhat unusual behavior.

 Andrew> Simpler example not needing LATERAL:

 Andrew> select array(select sum(x+y) from generate_series(1,3) y group by y)
 Andrew>   from generate_series(1,3) x;

Oh, and another way to verify that it's a bug and not expected behavior
is that it goes away with set enable_hashagg=false;

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange result with LATERAL query

2016-08-24 Thread Andrew Gierth
> "Jeevan" == Jeevan Chalke  writes:

 Jeevan> Hi,
 Jeevan> While playing with LATERAL along with some aggregates in
 Jeevan> sub-query, I have observed somewhat unusual behavior.

Simpler example not needing LATERAL:

select array(select sum(x+y) from generate_series(1,3) y group by y)
  from generate_series(1,3) x;
 ?column? 
--
 {2,4,3}
 {2,4,3}
 {2,4,3}
(3 rows)

This is broken all the way back, it's not a recent bug.

Something is wrong with the way chgParam is being handled in Agg nodes.
The code in ExecReScanAgg seems to assume that if the lefttree doesn't
have any parameter changes then it suffices to re-project the data from
the existing hashtable; but of course this is nonsense if the parameter
is in an input to an aggregate function.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Strange result with LATERAL query

2016-08-23 Thread Jeevan Chalke
Hi,

While playing with LATERAL along with some aggregates in sub-query, I have
observed somewhat unusual behavior.

Consider following steps:

create table tab1(c1 int, c2 int);
insert into tab1 select id, 1 from generate_series(1, 3) id;
create function sum_tab1(extra int) returns setof bigint as $$
  select sum(c1 + extra) sum from tab1 group by c1
$$ language sql;

-- This gives wrong output
select c1, c2, sum from tab1 t1, lateral
  (select sum(t2.c1 + t1.c1) sum from tab1 t2 group by t2.c1) qry
  order by 1, 2, 3;
-- This gives correct output
select c1, c2, sum from tab1 t1, lateral
  (select sum_tab1 sum from sum_tab1(c1)) qry
  order by 1, 2, 3;

I would expect same result from both these queries, but unfortunately not.
Per my understanding, second query involving function gives me correct
result
where as first query's output seems wrong.

Is this an expected behavior OR we are giving wrong result in case of first
query?

Thanks
-- 
Jeevan B Chalke