Re: [HACKERS] LATERAL quals revisited

2013-08-19 Thread Tom Lane
Some time ago, I wrote:
 I've been studying the bug reported at
 http://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local
 ...
 After some contemplation, I think that the most practical way to fix
 this is for deconstruct_recurse and distribute_qual_to_rels to
 effectively move such a qual to the place where it logically belongs;
 that is, rather than processing it when we look at the lower WHERE
 clause, set it aside for a moment and then add it back when looking at
 the ON clause of the appropriate outer join.  This should be reasonably
 easy to do by keeping a list of postponed lateral clauses while we're
 scanning the join tree.

Here's a draft patch for this.  The comments need a bit more work
probably, but barring objection I want to push this in before this
afternoon's 9.3rc1 wrap.

regards, tom lane

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 98f601c..e055088 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*** int			from_collapse_limit;
*** 36,47 
  int			join_collapse_limit;
  
  
  static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
  		   Index rtindex);
  static void add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs);
  static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
  	bool below_outer_join,
! 	Relids *qualscope, Relids *inner_join_rels);
  static SpecialJoinInfo *make_outerjoininfo(PlannerInfo *root,
     Relids left_rels, Relids right_rels,
     Relids inner_join_rels,
--- 36,56 
  int			join_collapse_limit;
  
  
+ /* Elements of the postponed_qual_list used during deconstruct_recurse */
+ typedef struct PostponedQual
+ {
+ 	Node	   *qual;			/* a qual clause waiting to be processed */
+ 	Relids		relids;			/* the set of baserels it references */
+ } PostponedQual;
+ 
+ 
  static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
  		   Index rtindex);
  static void add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs);
  static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
  	bool below_outer_join,
! 	Relids *qualscope, Relids *inner_join_rels,
! 	List **postponed_qual_list);
  static SpecialJoinInfo *make_outerjoininfo(PlannerInfo *root,
     Relids left_rels, Relids right_rels,
     Relids inner_join_rels,
*** static void distribute_qual_to_rels(Plan
*** 53,59 
  		Relids qualscope,
  		Relids ojscope,
  		Relids outerjoin_nonnullable,
! 		Relids deduced_nullable_relids);
  static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
  	  Relids *nullable_relids_p, bool is_pushed_down);
  static bool check_equivalence_delay(PlannerInfo *root,
--- 62,69 
  		Relids qualscope,
  		Relids ojscope,
  		Relids outerjoin_nonnullable,
! 		Relids deduced_nullable_relids,
! 		List **postponed_qual_list);
  static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
  	  Relids *nullable_relids_p, bool is_pushed_down);
  static bool check_equivalence_delay(PlannerInfo *root,
*** add_lateral_info(PlannerInfo *root, Reli
*** 630,644 
  List *
  deconstruct_jointree(PlannerInfo *root)
  {
  	Relids		qualscope;
  	Relids		inner_join_rels;
  
  	/* Start recursion at top of jointree */
  	Assert(root-parse-jointree != NULL 
  		   IsA(root-parse-jointree, FromExpr));
  
! 	return deconstruct_recurse(root, (Node *) root-parse-jointree, false,
! 			   qualscope, inner_join_rels);
  }
  
  /*
--- 640,662 
  List *
  deconstruct_jointree(PlannerInfo *root)
  {
+ 	List	   *result;
  	Relids		qualscope;
  	Relids		inner_join_rels;
+ 	List	   *postponed_qual_list = NIL;
  
  	/* Start recursion at top of jointree */
  	Assert(root-parse-jointree != NULL 
  		   IsA(root-parse-jointree, FromExpr));
  
! 	result = deconstruct_recurse(root, (Node *) root-parse-jointree, false,
!  qualscope, inner_join_rels,
!  postponed_qual_list);
! 
! 	/* Shouldn't be any leftover quals */
! 	Assert(postponed_qual_list == NIL);
! 
! 	return result;
  }
  
  /*
*** deconstruct_jointree(PlannerInfo *root)
*** 656,668 
   *	*inner_join_rels gets the set of base Relids syntactically included in
   *		inner joins appearing at or below this jointree node (do not modify
   *		or free this, either)
   *	Return value is the appropriate joinlist for this jointree node
   *
   * In addition, entries will be added to root-join_info_list for outer joins.
   */
  static List *
  deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
! 	Relids *qualscope, Relids *inner_join_rels)
  {
  	List	   *joinlist;
  
--- 674,689 
   *	*inner_join_rels gets the set of base Relids syntactically included in
   *		inner joins appearing at or below this jointree node (do not modify
   *		or 

Re: [HACKERS] LATERAL quals revisited

2013-08-14 Thread Tom Lane
Antonin Houska antonin.hou...@gmail.com writes:
 On 07/04/2013 06:11 PM, Antonin Houska wrote:
 On 07/03/2013 08:32 PM, Tom Lane wrote:
 Another possibility would be to keep the optimization, but disable it in
 queries that use LATERAL.  I don't much care for that though --- seems
 too Rube Goldbergish, and in any case I have a lot less faith in the
 whole concept now than I had before I started digging into this issue.

 I constructed a query that triggers the optimization - see attachment 
 with comments.

Thanks for poking at this.

 EXPLAIN shows the same plan with or without the ph_may_need 
 optimization, but that might be data problem (my tables are empty).

Yeah, I didn't have much luck getting a different plan even with data in
the tables.  What you'd need for this to be important would be for a join
order that's precluded without the ph_may_need logic to be significantly
better than the join orders that are still allowed.  While that's
certainly within the realm of possibility, the difficulty of triggering
the case at all reinforces my feeling that this optimization isn't worth
bothering with.  For the moment I'm just going to take it out.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-07-19 Thread Ashutosh Bapat
I have couple of questions.

On Wed, Jun 26, 2013 at 1:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 I've been studying the bug reported at

 http://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local
 that the planner can do the wrong thing with queries like

 SELECT * FROM
   i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true;

 I think the fundamental problem is that, because the i.n = j.n clause
 appears syntactically in WHERE, the planner is treating it as if it were
 an inner-join clause; but really it ought to be considered a clause of
 the upper LEFT JOIN.  That is, semantically this query ought to be
 equivalent to

 SELECT * FROM
   i LEFT JOIN LATERAL (SELECT * FROM j) j ON i.n = j.n;

 However, because distribute_qual_to_rels doesn't see the clause as being
 attached to the outer join, it's not marked with the correct properties
 and ends up getting evaluated in the wrong place (as a filter clause
 not a join filter clause).  The bug is masked in the test cases we've
 used so far because those cases are designed to let the clause get
 pushed down into the scan of the inner relation --- but if it doesn't
 get pushed down, it's evaluated the wrong way.

 After some contemplation, I think that the most practical way to fix
 this is for deconstruct_recurse and distribute_qual_to_rels to
 effectively move such a qual to the place where it logically belongs;
 that is, rather than processing it when we look at the lower WHERE
 clause, set it aside for a moment and then add it back when looking at
 the ON clause of the appropriate outer join.  This should be reasonably
 easy to do by keeping a list of postponed lateral clauses while we're
 scanning the join tree.

 For there to *be* a unique appropriate outer join, we need to require
 that a LATERAL-using qual clause that's under an outer join contain
 lateral references only to the outer side of the nearest enclosing outer
 join.  There's no such restriction in the spec of course, but we can
 make it so by refusing to flatten a sub-select if pulling it up would
 result in having a clause in the outer query that violates this rule.
 There's already some code in prepjointree.c (around line 1300) that
 attempts to enforce this, though now that I look at it again I'm not
 sure it's covering all the bases.  We may need to extend that check.


Why do we need this restriction? Wouldn't a place (specifically join qual
at such a place) in join tree where all the participating relations are
present, serve as a place where the clause can be applied. E.g. in the query

select * from tab1 left join tab2 t2 using (val) left join lateral (select
val from tab2 where val2 = tab1.val * t2.val) t3 using (val);

Can't we apply (as a join qual) the qual val2 = tab1.val * t2.val at a
place where we are computing join between tab1, t2 and t3?


 I'm inclined to process all LATERAL-using qual clauses this way, ie
 postpone them till we recurse back up to a place where they can
 logically be evaluated.  That won't make any real difference when no
 outer joins are present, but it will eliminate the ugliness that right
 now distribute_qual_to_rels is prevented from sanity-checking the scope
 of the references in a qual when LATERAL is present.  If we do it like
 this, we can resurrect full enforcement of that sanity check, and then
 throw an error if any postponed quals are left over when we're done
 recursing.


Parameterized nested loop join would always be able to evaluate a LATERAL
query. Instead of throwing error, why can't we choose that as the default
strategy whenever we fail to flatten subquery?

Can we put the clause with lateral references at its appropriate place
while flattening the subquery? IMO, that will be cleaner and lesser work
than first pulling the clause and then putting it back again? Right, now,
we do not have that capability in pull_up_subqueries() but given its
recursive structure, it might be easier to do it there.


 Thoughts, better ideas?

 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Postgres Database Company


Re: [HACKERS] LATERAL quals revisited

2013-07-19 Thread Tom Lane
Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
 On Wed, Jun 26, 2013 at 1:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 For there to *be* a unique appropriate outer join, we need to require
 that a LATERAL-using qual clause that's under an outer join contain
 lateral references only to the outer side of the nearest enclosing outer
 join.  There's no such restriction in the spec of course, but we can
 make it so by refusing to flatten a sub-select if pulling it up would
 result in having a clause in the outer query that violates this rule.
 There's already some code in prepjointree.c (around line 1300) that
 attempts to enforce this, though now that I look at it again I'm not
 sure it's covering all the bases.  We may need to extend that check.

 Why do we need this restriction? Wouldn't a place (specifically join qual
 at such a place) in join tree where all the participating relations are
 present, serve as a place where the clause can be applied.

No.  If you hoist a qual that appears below an outer join to above the
outer join, you get wrong results in general: you might eliminate rows
from the outer side of the join, which a qual from within the inner side
should never be able to do.

 select * from tab1 left join tab2 t2 using (val) left join lateral (select
 val from tab2 where val2 = tab1.val * t2.val) t3 using (val);
 Can't we apply (as a join qual) the qual val2 = tab1.val * t2.val at a
 place where we are computing join between tab1, t2 and t3?

This particular example doesn't violate the rule I gave above, since
both tab1 and t2 are on the left side of the join to the lateral
subquery, and the qual doesn't have to get hoisted *past* an outer join,
only to the outer join of {tab1,t2} with {t3}.

 I'm inclined to process all LATERAL-using qual clauses this way, ie
 postpone them till we recurse back up to a place where they can
 logically be evaluated.  That won't make any real difference when no
 outer joins are present, but it will eliminate the ugliness that right
 now distribute_qual_to_rels is prevented from sanity-checking the scope
 of the references in a qual when LATERAL is present.  If we do it like
 this, we can resurrect full enforcement of that sanity check, and then
 throw an error if any postponed quals are left over when we're done
 recursing.

 Parameterized nested loop join would always be able to evaluate a LATERAL
 query. Instead of throwing error, why can't we choose that as the default
 strategy whenever we fail to flatten subquery?

I think you misunderstood.  That error would only be a sanity check that
we'd accounted for all qual clauses, it's not something a user should
ever see.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-07-12 Thread Antonin Houska

On 07/04/2013 06:11 PM, Antonin Houska wrote:

On 07/03/2013 08:32 PM, Tom Lane wrote:

Another possibility would be to keep the optimization, but disable it in
queries that use LATERAL.  I don't much care for that though --- seems
too Rube Goldbergish, and in any case I have a lot less faith in the
whole concept now than I had before I started digging into this issue.

Thoughts?

I noticed EXPLAIN in some regression tests. So if they all pass after 
removal of this optimization, it might indicate that it was really 
insignificant. But alternatively it may just be a lack of focus on 
this feature in the test queries. Digging for (non-LATERAL) queries or 
rather patterns where the ph_may_need optimization clearly appears to 
be important sounds to me like a good SQL exercise, but I'm afraid I 
won't have time for it in the next few days.




I constructed a query that triggers the optimization - see attachment 
with comments. (Note that the relid sets are derived from my current 
knowledge of the logic. I haven't figured out how to check them easily 
in gdb session.)


The intention was that the top-level OJ references LHS of the join below 
rather than the RHS. That should increase the likelihood that the PHV 
becomes the only obstacle for join commuting. And therefore the 
ph_may_need optimization should unblock some combinations that would be 
impossible otherwise.


However I could not see the condition

  if (bms_is_subset(phinfo-ph_may_need, min_righthand))
  continue;

met for the top-level join even though the supposed ph_may_need did not 
contain tab1. Then it struck me that min_righthand can be the problem. 
So I changed the join clause to reference RHS of j1, hoping that it 
should make min_righthand bigger. And that really triggered the condition.


EXPLAIN shows the same plan with or without the ph_may_need 
optimization, but that might be data problem (my tables are empty).


More important is the fact that I could only avoid addition of the PHV's 
eval_at to min_righthand at the cost of adding the whole j1 join (i.e. 
more than just eval_at).


Although the idea behind ph_may_need is clever, I can now imagine that 
other techniques of the planner can substitute for it. There might be 
examples showing the opposite but such are beyond my imagination.


// Antonin Houska (Tony)

SELECT  tab1.i
FROMtab1
-- The ph_may_need optimization should be effective for the
	-- top-level LEFT JOIN. The PHV in sub1 is only referenced below it.
LEFT JOIN
	(  tab2
   LEFT JOIN
   ( tab3
 LEFT JOIN
 ( SELECT
   -- This expression should be wrapped in the PHV
   -- That PHV should have eval_at = {tab4, tab5}.
   -- 
   -- Join clause of j1 is the highest reference to the PHV.
   -- Thus ph_may_need should be {tab2, tab3, tab4, tab5}. Therefore
   -- the ph_may_need optimization should avoid addition of eval_at
   -- to min_righthand of the top-level join's SpecialJoinInfo.
		   COALESCE(tab4.l, tab5.m, 1) AS x
 FROM  tab4
   LEFT JOIN
   tab5
   ON l = m
 ) AS sub1(x)
 ON tab3.k = sub1.x
   ) AS j2(k, x) 
	   ON tab2.j = j2.x
 ) AS j1(j, k)
 -- This clause references j.k (RHS of the lower join) to keep min_righthand
 -- of the top-level join bigger (ph_may_need needs to be its subset).
	 ON tab1.i = j1.k;
CREATE TABLE tab1(i int);
CREATE TABLE tab2(j int);
CREATE TABLE tab3(k int);
CREATE TABLE tab4(l int);
CREATE TABLE tab5(m int);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-07-04 Thread Antonin Houska

On 07/03/2013 08:32 PM, Tom Lane wrote:

Another possibility would be to keep the optimization, but disable it in
queries that use LATERAL.  I don't much care for that though --- seems
too Rube Goldbergish, and in any case I have a lot less faith in the
whole concept now than I had before I started digging into this issue.

Thoughts?

I noticed EXPLAIN in some regression tests. So if they all pass after 
removal of this optimization, it might indicate that it was really 
insignificant. But alternatively it may just be a lack of focus on this 
feature in the test queries. Digging for (non-LATERAL) queries or rather 
patterns where the ph_may_need optimization clearly appears to be 
important sounds to me like a good SQL exercise, but I'm afraid I won't 
have time for it in the next few days.



//Antonin Houska (Tony)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-07-03 Thread Tom Lane
I wrote:
 So attached is a draft patch for this.  It's not complete yet because
 there are various comments that are now wrong and need to be updated;
 but I think the code is functioning correctly.

Hm, spoke too soon :-(.  This query causes an assertion failure, with or
without my draft patch:

select c.*,a.*,ss1.q1,ss2.q1,ss3.* from
  int8_tbl c left join (
int8_tbl a left join
  (select q1, coalesce(q2,f1) as x from int8_tbl b, int4_tbl b2) ss1
  on a.q2 = ss1.q1
cross join
lateral (select q1, coalesce(ss1.x,q2) as y from int8_tbl d) ss2
  ) on c.q2 = ss2.q1,
  lateral (select * from int4_tbl i where ss2.y  f1) ss3;

TRAP: FailedAssertion(!(bms_is_subset(phinfo-ph_needed, 
phinfo-ph_may_need)), File: initsplan.c, Line: 213)

What's happening is that distribute_qual_to_rels concludes (correctly)
that the ss2.y  f1 clause must be postponed until after the nest of
left joins, since those could null ss2.y.  So the PlaceHolderVar for
ss2.y is marked as being needed at the topmost join level.  However,
find_placeholders_in_jointree had only marked the PHV as being maybe
needed to scan the i relation, since that's what the syntactic
location of the reference implies.  Since we depend on the assumption
that ph_needed is always a subset of ph_may_need, there's an assertion
that fires if that stops being true, and that's what's crashing.

After some thought about this, I'm coming to the conclusion that lateral
references destroy the ph_maybe_needed optimization altogether: we
cannot derive an accurate estimate of where a placeholder will end up in
the final qual distribution, short of essentially doing all the work in
deconstruct_jointree over again.  I guess in principle we could repeat
deconstruct_jointree until we had stable estimates of the ph_needed
locations, but that would be expensive and probably would induce a lot
of new planner bugs (since the data structure changes performed during
deconstruct_jointree aren't designed to be backed out easily).

The only place where ph_may_need is actually used is in this bit in
make_outerjoininfo():

/*
 * Examine PlaceHolderVars.  If a PHV is supposed to be evaluated within
 * this join's nullable side, and it may get used above this join, then
 * ensure that min_righthand contains the full eval_at set of the PHV.
 * This ensures that the PHV actually can be evaluated within the RHS.
 * Note that this works only because we should already have determined the
 * final eval_at level for any PHV syntactically within this join.
 */
foreach(l, root-placeholder_list)
{
PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
Relidsph_syn_level = phinfo-ph_var-phrels;

/* Ignore placeholder if it didn't syntactically come from RHS */
if (!bms_is_subset(ph_syn_level, right_rels))
continue;

/* We can also ignore it if it's certainly not used above this join */
/* XXX this test is probably overly conservative */
if (bms_is_subset(phinfo-ph_may_need, min_righthand))
continue;

/* Else, prevent join from being formed before we eval the PHV */
min_righthand = bms_add_members(min_righthand, phinfo-ph_eval_at);
}

Looking at it again, it's not really clear that skipping placeholders in
this way results in very much optimization --- sometimes we can avoid
constraining join order, but how often?  I tried diking out the check
on ph_may_need from this loop, and saw no changes in the regression test
results (not that that proves a whole lot about optimization of complex
queries).  So I'm pretty tempted to just remove ph_may_need, along with
the machinery that computes it.

Another possibility would be to keep the optimization, but disable it in
queries that use LATERAL.  I don't much care for that though --- seems
too Rube Goldbergish, and in any case I have a lot less faith in the
whole concept now than I had before I started digging into this issue.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-06-26 Thread Antonin Houska

On 06/26/2013 12:52 AM, Tom Lane wrote:

Instead of setting it aside, can we (mis)use placeholder var (PHV), to
ensure that the WHERE clause is evaluated below the OJ; instead of
combining it with the ON clause?

No, that doesn't help; it has to be part of the joinquals at the join
node, or you don't get the right execution semantics.
When I wrote 'below the OJ' I meant to retain something of the original 
semantics (just like the subquery applies the WHERE clause below the OJ).

However that's probably too restrictive and your next arguments

Plus you could
lose some optimization opportunities, for example if we fail to see
that there's a strict join clause associated with the outer join
(cf lhs_strict).  Worse, I think wrapping a PHV around an otherwise
indexable clause would prevent using it for an indexscan.


also confirm the restrictiveness. So I can forget.

One more concern anyway: doesn't your proposal make subquery pull-up a 
little bit risky in terms of cost of the resulting plan?


IMO the subquery in the original query may filter out many rows and thus 
decrease the number of pairs to be evaluated by the join the ON clause 
belongs to.
If the WHERE clause moves up, then the resulting plan might be less 
efficient than the one we'd get if the subquery hadn't been pulled-up at 
all.


However at the time of cost evaluation there's no way to get back (not 
even to notice the higher cost) because the original subquery has gone 
at earlier stage of the planning.


Regards,
Antonin Houska (Tony)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-06-26 Thread Tom Lane
Antonin Houska antonin.hou...@gmail.com writes:
 If the WHERE clause moves up, then the resulting plan might be less 
 efficient than the one we'd get if the subquery hadn't been pulled-up at 
 all.

No, because we can push the qual back down again (using a parameterized
path) if that's appropriate.  The problem at this stage is to understand
the semantics of the outer join correctly, not to make a choice of what
the plan will be.

In fact, the reason we'd not noticed this bug before is exactly that
all the test cases in the regression tests do end up pushing the qual
back down.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] LATERAL quals revisited

2013-06-25 Thread Tom Lane
I've been studying the bug reported at
http://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local
that the planner can do the wrong thing with queries like

SELECT * FROM
  i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true;

I think the fundamental problem is that, because the i.n = j.n clause
appears syntactically in WHERE, the planner is treating it as if it were
an inner-join clause; but really it ought to be considered a clause of
the upper LEFT JOIN.  That is, semantically this query ought to be
equivalent to

SELECT * FROM
  i LEFT JOIN LATERAL (SELECT * FROM j) j ON i.n = j.n;

However, because distribute_qual_to_rels doesn't see the clause as being
attached to the outer join, it's not marked with the correct properties
and ends up getting evaluated in the wrong place (as a filter clause
not a join filter clause).  The bug is masked in the test cases we've
used so far because those cases are designed to let the clause get
pushed down into the scan of the inner relation --- but if it doesn't
get pushed down, it's evaluated the wrong way.

After some contemplation, I think that the most practical way to fix
this is for deconstruct_recurse and distribute_qual_to_rels to
effectively move such a qual to the place where it logically belongs;
that is, rather than processing it when we look at the lower WHERE
clause, set it aside for a moment and then add it back when looking at
the ON clause of the appropriate outer join.  This should be reasonably
easy to do by keeping a list of postponed lateral clauses while we're
scanning the join tree.

For there to *be* a unique appropriate outer join, we need to require
that a LATERAL-using qual clause that's under an outer join contain
lateral references only to the outer side of the nearest enclosing outer
join.  There's no such restriction in the spec of course, but we can
make it so by refusing to flatten a sub-select if pulling it up would
result in having a clause in the outer query that violates this rule.
There's already some code in prepjointree.c (around line 1300) that
attempts to enforce this, though now that I look at it again I'm not
sure it's covering all the bases.  We may need to extend that check.

I'm inclined to process all LATERAL-using qual clauses this way, ie
postpone them till we recurse back up to a place where they can
logically be evaluated.  That won't make any real difference when no
outer joins are present, but it will eliminate the ugliness that right
now distribute_qual_to_rels is prevented from sanity-checking the scope
of the references in a qual when LATERAL is present.  If we do it like
this, we can resurrect full enforcement of that sanity check, and then
throw an error if any postponed quals are left over when we're done
recursing.

Thoughts, better ideas?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-06-25 Thread Antonin Houska
(Please excuse me if my proposal sounds silly, i'm still not too 
advanced in this area...)


On 06/25/2013 10:00 PM, Tom Lane wrote:

After some contemplation, I think that the most practical way to fix
this is for deconstruct_recurse and distribute_qual_to_rels to
effectively move such a qual to the place where it logically belongs;
that is, rather than processing it when we look at the lower WHERE
clause, set it aside for a moment and then add it back when looking at
the ON clause of the appropriate outer join.  This should be reasonably
easy to do by keeping a list of postponed lateral clauses while we're
scanning the join tree.



Instead of setting it aside, can we (mis)use placeholder var (PHV), to 
ensure
that the WHERE clause is evaluated below the OJ; instead of combining it 
with

the ON clause?  That would be a special PHV(s) in that it's not actually
referenced from outside the subquery.

Whether I'm right or not, I seem to have found another problem while
trying to enforce such a PHV:

postgres=# SELECT i.*, j.* FROM i LEFT JOIN LATERAL (SELECT COALESCE(i) 
FROM j WHERE (i.n = j.n)) j ON true;

The connection to the server was lost. Attempting reset: Failed.

TRAP: FailedAssertion(!(!bms_overlap(min_lefthand, min_righthand)), 
File: initsplan.c, Line: 1043)

LOG:  server process (PID 24938) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: SELECT i.*, j.* FROM i LEFT JOIN 
LATERAL (SELECT COALESCE(i) FROM j WHERE (i.n = j.n)) j ON true;

LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back 
the current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and 
repeat your command.

FATAL:  the database system is in recovery mode

I'm not able to judge right now whether the Assert() statement is the 
problem itself or anything

else. You'll probably know better.

(4f14c86d7434376b95477aeeb07fcc7272f4c47d is the last commit in my 
environment)


Regards,
Antonin Houska (Tony)





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LATERAL quals revisited

2013-06-25 Thread Tom Lane
Antonin Houska antonin.hou...@gmail.com writes:
 On 06/25/2013 10:00 PM, Tom Lane wrote:
 After some contemplation, I think that the most practical way to fix
 this is for deconstruct_recurse and distribute_qual_to_rels to
 effectively move such a qual to the place where it logically belongs;
 that is, rather than processing it when we look at the lower WHERE
 clause, set it aside for a moment and then add it back when looking at
 the ON clause of the appropriate outer join.

 Instead of setting it aside, can we (mis)use placeholder var (PHV), to
 ensure that the WHERE clause is evaluated below the OJ; instead of
 combining it with the ON clause?

No, that doesn't help; it has to be part of the joinquals at the join
node, or you don't get the right execution semantics.  Plus you could
lose some optimization opportunities, for example if we fail to see
that there's a strict join clause associated with the outer join
(cf lhs_strict).  Worse, I think wrapping a PHV around an otherwise
indexable clause would prevent using it for an indexscan.

 Whether I'm right or not, I seem to have found another problem while
 trying to enforce such a PHV:

 postgres=# SELECT i.*, j.* FROM i LEFT JOIN LATERAL (SELECT COALESCE(i) 
 FROM j WHERE (i.n = j.n)) j ON true;
 The connection to the server was lost. Attempting reset: Failed.

[ pokes at that ... ]  Hm, right offhand this seems like an independent
issue --- the ph_eval_at for the PHV is wrong AFAICS.  Thanks for
reporting it!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers