Re: Bogus use of canonicalize_qual

2018-03-11 Thread Tom Lane
Dean Rasheed  writes:
> On 10 March 2018 at 20:21, Tom Lane  wrote:
>> If we suppose that we only need to fix it in HEAD, the most attractive
>> answer is to add a parameter distinguishing WHERE and CHECK arguments
>> to canonicalize_qual.

> I agree that this looks like the best choice, but it feels a little
> unsatisfactory to not back-patch a fix for such a glaring bug. You
> could perhaps leave the signature of canonicalize_qual() the same, but
> add a new canonicalize_check() function, and make both thin wrappers
> on top of a local function accepting the is_check parameter.

Hm.  I'd be inclined to create canonicalize_qual_extended(qual, is_check)
and then make canonicalize_qual() call that with is_check = false.
But either way would avoid breaking API compatibility for the back
branches.

I guess the next question is whether we should do it the same way
in HEAD, avoiding a cross-branch difference.  But I don't like that,
because part of the point here IMO is to force any external callers
of canonicalize_qual() to reconsider what they're doing.

regards, tom lane



Re: Bogus use of canonicalize_qual

2018-03-11 Thread Dean Rasheed
On 10 March 2018 at 20:21, Tom Lane  wrote:
> I wrote:
>> Whilst fooling about with predtest.c, I noticed a rather embarrassing
>> error.  Consider the following, rather silly, CHECK constraint:
>> ...
>> So, what to do?  We have a few choices, none ideal:
>
> I'd been assuming that we need to back-patch a fix for this, but after
> further reflection, I'm not so sure.  The bug is only triggered by fairly
> silly CHECK constraints, and given that it's been there a long time (at
> least since 9.2 according to my tests) without any field reports, it seems
> likely that nobody is writing such silly CHECK constraints.
>
> If we suppose that we only need to fix it in HEAD, the most attractive
> answer is to add a parameter distinguishing WHERE and CHECK arguments
> to canonicalize_qual.  That allows symmetrical simplification of constant-
> NULL subexpressions in the two cases, and the fact that the caller now
> has to make an explicit choice of WHERE vs CHECK semantics might help
> discourage people from applying the function in cases where it's not
> clear which one applies.  PFA a patch that does it like that.
>

I agree that this looks like the best choice, but it feels a little
unsatisfactory to not back-patch a fix for such a glaring bug. You
could perhaps leave the signature of canonicalize_qual() the same, but
add a new canonicalize_check() function, and make both thin wrappers
on top of a local function accepting the is_check parameter.

Regards,
Dean



Re: Bogus use of canonicalize_qual

2018-03-10 Thread Tom Lane
I wrote:
> Whilst fooling about with predtest.c, I noticed a rather embarrassing
> error.  Consider the following, rather silly, CHECK constraint:
> ...
> So, what to do?  We have a few choices, none ideal:

I'd been assuming that we need to back-patch a fix for this, but after
further reflection, I'm not so sure.  The bug is only triggered by fairly
silly CHECK constraints, and given that it's been there a long time (at
least since 9.2 according to my tests) without any field reports, it seems
likely that nobody is writing such silly CHECK constraints.

If we suppose that we only need to fix it in HEAD, the most attractive
answer is to add a parameter distinguishing WHERE and CHECK arguments
to canonicalize_qual.  That allows symmetrical simplification of constant-
NULL subexpressions in the two cases, and the fact that the caller now
has to make an explicit choice of WHERE vs CHECK semantics might help
discourage people from applying the function in cases where it's not
clear which one applies.  PFA a patch that does it like that.

I'm a little unhappy with what I learned about the PARTITION code while
doing this :-(.  It's pretty schizophrenic about whether partition
constraints are implicit-AND or explicit-AND format, and I do not think
that the construction of default-partition constraints is done in a
desirable fashion either.  But I mostly resisted the temptation to touch
that logic in this patch.

Comments, objections?

regards, tom lane

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index fcf7655..786c05d 100644
*** a/src/backend/catalog/partition.c
--- b/src/backend/catalog/partition.c
*** get_proposed_default_constraint(List *ne
*** 3204,3215 
  	defPartConstraint = makeBoolExpr(NOT_EXPR,
  	 list_make1(defPartConstraint),
  	 -1);
  	defPartConstraint =
  		(Expr *) eval_const_expressions(NULL,
  		(Node *) defPartConstraint);
! 	defPartConstraint = canonicalize_qual(defPartConstraint);
  
! 	return list_make1(defPartConstraint);
  }
  
  /*
--- 3204,3217 
  	defPartConstraint = makeBoolExpr(NOT_EXPR,
  	 list_make1(defPartConstraint),
  	 -1);
+ 
+ 	/* Simplify, to put the negated expression into canonical form */
  	defPartConstraint =
  		(Expr *) eval_const_expressions(NULL,
  		(Node *) defPartConstraint);
! 	defPartConstraint = canonicalize_qual(defPartConstraint, true);
  
! 	return make_ands_implicit(defPartConstraint);
  }
  
  /*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index e559afb..46a648a 100644
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*** PartConstraintImpliedByRelConstraint(Rel
*** 13719,13725 
  		 * fail to detect valid matches without this.
  		 */
  		cexpr = eval_const_expressions(NULL, cexpr);
! 		cexpr = (Node *) canonicalize_qual((Expr *) cexpr);
  
  		existConstraint = list_concat(existConstraint,
  	  make_ands_implicit((Expr *) cexpr));
--- 13719,13725 
  		 * fail to detect valid matches without this.
  		 */
  		cexpr = eval_const_expressions(NULL, cexpr);
! 		cexpr = (Node *) canonicalize_qual((Expr *) cexpr, true);
  
  		existConstraint = list_concat(existConstraint,
  	  make_ands_implicit((Expr *) cexpr));
*** ATExecAttachPartition(List **wqueue, Rel
*** 14058,14067 
  	/* Skip validation if there are no constraints to validate. */
  	if (partConstraint)
  	{
  		partConstraint =
  			(List *) eval_const_expressions(NULL,
  			(Node *) partConstraint);
! 		partConstraint = (List *) canonicalize_qual((Expr *) partConstraint);
  		partConstraint = list_make1(make_ands_explicit(partConstraint));
  
  		/*
--- 14058,14075 
  	/* Skip validation if there are no constraints to validate. */
  	if (partConstraint)
  	{
+ 		/*
+ 		 * Run the partition quals through const-simplification similar to
+ 		 * check constraints.  We skip canonicalize_qual, though, because
+ 		 * partition quals should be in canonical form already; also, since
+ 		 * the qual is in implicit-AND format, we'd have to explicitly convert
+ 		 * it to explicit-AND format and back again.
+ 		 */
  		partConstraint =
  			(List *) eval_const_expressions(NULL,
  			(Node *) partConstraint);
! 
! 		/* XXX this sure looks wrong */
  		partConstraint = list_make1(make_ands_explicit(partConstraint));
  
  		/*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 14b7bec..24e6c46 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** preprocess_expression(PlannerInfo *root,
*** 988,994 
  	 */
  	if (kind == EXPRKIND_QUAL)
  	{
! 		expr = (Node *) canonicalize_qual((Expr *) expr);
  
  #ifdef OPTIMIZER_DEBUG
  		printf("After canonicalize_qual()\n");
--- 988,994 
  	 */
  	if (kind == EXPRKIND_QUAL)
  	{
! 		expr = (Node *) canonicaliz

Bogus use of canonicalize_qual

2018-03-09 Thread Tom Lane
Whilst fooling about with predtest.c, I noticed a rather embarrassing
error.  Consider the following, rather silly, CHECK constraint:

regression=# create table pp (f1 int);
CREATE TABLE
regression=# create table cc (check (f1 = 1 or f1 = null)) inherits(pp);
CREATE TABLE

Because "f1 = null" reduces to constant NULL, this check constraint is
actually a no-op, because it can never evaluate to FALSE:

regression=# insert into cc values(1);
INSERT 0 1
regression=# insert into cc values(2);
INSERT 0 1
regression=# select * from pp;
 f1 

  1
  2
(2 rows)

But:

regression=# select * from pp where f1 = 2;
 f1 

(0 rows)

Huh?  The reason is that the planner is deciding that it can exclude
cc from the plan:

regression=# explain select * from pp where f1 = 2;
   QUERY PLAN   

 Append  (cost=0.00..0.01 rows=1 width=4)
   ->  Seq Scan on pp  (cost=0.00..0.00 rows=1 width=4)
 Filter: (f1 = 2)
(3 rows)

and that ultimately traces to the part of canonicalize_qual() that throws
away constant-NULL subexpressions of AND/OR clauses.  It's clearly
documented in canonicalize_qual() that it should only be applied to actual
WHERE clauses, where that's a valid optimization.  But there is lots of
code that didn't get that memo and is calling it on CHECK constraints,
allowing the NULL to be thrown away when it should not be.  The particular
culprit here, I think, is get_relation_constraints(), but there's a lot of
similar code elsewhere.

So, what to do?  We have a few choices, none ideal:

1. Just remove that optimization from canonicalize_qual().  This would
result in some inefficiency in badly-written queries, but it might be
acceptable.

2. Run around and remove all the bogus canonicalize_qual() calls.  The
main demerit here is that this'd mean CHECK constraints also don't get the
other effect of canonicalize_qual(), which is:

 * The following code attempts to apply the inverse OR distributive law:
 *  ((A AND B) OR (A AND C))  =>  (A AND (B OR C))
 * That is, locate OR clauses in which every subclause contains an
 * identical term, and pull out the duplicated terms.

This'd possibly make it harder to match WHERE clauses, which do get that
processing, to CHECK clauses which wouldn't.  I think that possibly
predtest.c is smart enough to make proofs even in the face of that, but
I'm not sure.  Another concern is whether external code might still
contain incorrect canonicalize_qual() calls, or whether we might not
accidentally put some back in future.

3. Change canonicalize_qual() to take an additional parameter indicating
whether it's working on a true qual or not.  This might be the best fix
for HEAD, but it doesn't seem very back-patchable.

4. Split canonicalize_qual() into two functions, one for the inverse OR
business and one for NULL removal.  This would result in an additional
tree traversal (and reconstruction) for every WHERE clause, slowing
planning somewhat.

5. Remove NULL-simplification from canonicalize_qual(), but put it
back somewhere later in the planner where we're traversing qual trees
anyway.  I think that it might work to charge the RestrictInfo-building
code with this, though I'm not sure about it.  It seems kind of high
risk for a back-patchable change in any case.

Thoughts?

regards, tom lane