On Mon, Nov 26, 2012 at 09:29:19PM +0100, Hannu Krosing wrote:
> On 11/26/2012 09:05 PM, Tom Lane wrote:
> >Hannu Krosing <ha...@2ndquadrant.com> writes:
> >>In some previous mail Tom Lane claimed that by SQL standard
> >>either an array of all NULLs or a record with all fields NULLs (I
> >>don't remember which) is also considered NULL. If this is true,
> >>then an empty array - which can be said to consist of nothing
> >>but NULLs - should itself be NULL.
> >What I think you're referring to is that the spec says that "foo IS
> >NULL" should return true if foo is a record containing only null fields.
> Is this requirement recursive ?
> 
> That is , should
> 
> ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL
> also be true ?
> 
> Currently PostgreSQL does this kind of IS NULL for "simple" rows
> 
> hannu=# SELECT ROW(NULL, NULL) IS NULL;
>  ?column?
> ----------
>  t
> (1 row)
> 
> and also for first level row types
> 
> hannu=# SELECT ROW(NULL, ROW(NULL, NULL)) IS NULL;
>  ?column?
> ----------
>  t
> (1 row)
> 
> but then mysteriously stops working at third level
> 
> hannu=# SELECT ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL;
>  ?column?
> ----------
>  f
> (1 row)

I finally had time to look at this, and it is surprising.  I used
EXPLAIN VERBOSE to see what the optimizer was outputting:

        EXPLAIN VERBOSE SELECT ROW(null) IS NULL;
                        QUERY PLAN
        ------------------------------------------
         Result  (cost=0.00..0.01 rows=1 width=0)
-->        Output: true
        
        EXPLAIN VERBOSE SELECT ROW(ROW(null)) IS NULL;
                        QUERY PLAN
        ------------------------------------------
         Result  (cost=0.00..0.01 rows=1 width=0)
-->        Output: (ROW(NULL::unknown) IS NULL)
        
        EXPLAIN VERBOSE SELECT ROW(ROW(ROW(null))) IS NULL;
                         QUERY PLAN
        ---------------------------------------------
         Result  (cost=0.00..0.01 rows=1 width=0)
-->        Output: (ROW(ROW(NULL::unknown)) IS NULL)

The first test outputs a constant, 'true'.  The second test is
ROW(NULL::unknown) because the inner ROW(NULL) was converted to
NULL:unknown.  The third one, which returns false (wrong), happens
because you have ROW embedded in ROW, which the optimizer can't process,
and the executor can't either.

I developed the attached patch which properly recurses into ROW()
records checking for NULLs;  you can see it returns the right answer in
all cases (and constant folds too):

        test=> EXPLAIN VERBOSE SELECT ROW(null) IS NULL;
                        QUERY PLAN
        ------------------------------------------
         Result  (cost=0.00..0.01 rows=1 width=0)
-->        Output: true
        (2 rows)
        
        test=> EXPLAIN VERBOSE SELECT ROW(ROW(null)) IS NULL;
                        QUERY PLAN
        ------------------------------------------
         Result  (cost=0.00..0.01 rows=1 width=0)
-->        Output: true
        (2 rows)
        
        test=> EXPLAIN VERBOSE SELECT ROW(ROW(ROW(null))) IS NULL;
                        QUERY PLAN
        ------------------------------------------
         Result  (cost=0.00..0.01 rows=1 width=0)
-->        Output: true
        (2 rows)

You might think the problem is only with constants, but it extends to
column values too (non-patched server):

        CREATE TABLE test (x INT);
        CREATE TABLE

        INSERT INTO test VALUES (1), (NULL);
        INSERT 0 2

        SELECT ROW(x) IS NULL FROM test;
         ?column?
        ----------
         f
         t
        
        SELECT ROW(ROW(x)) IS NULL FROM test;
         ?column?
        ----------
         f
         t
        
        SELECT ROW(ROW(ROW(x))) IS NULL FROM test;
         ?column?
        ----------
-->      f
-->      f

With the patch, that works too:

        SELECT ROW(ROW(ROW(x))) IS NULL FROM test;
         ?column?
        ----------
         f
         t

The optimizer seems like the right place to fix this, per my patch.  It
already flattens IS NULL tests into a series of AND clauses, and now by
recursing, it handles nested ROW values properly too.

This fix would be for head only.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
new file mode 100644
index 6d5b204..0abf789
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** eval_const_expressions_mutator(Node *nod
*** 3149,3155 ****
  						newntest->arg = (Expr *) relem;
  						newntest->nulltesttype = ntest->nulltesttype;
  						newntest->argisrow = type_is_rowtype(exprType(relem));
! 						newargs = lappend(newargs, newntest);
  					}
  					/* If all the inputs were constants, result is TRUE */
  					if (newargs == NIL)
--- 3149,3160 ----
  						newntest->arg = (Expr *) relem;
  						newntest->nulltesttype = ntest->nulltesttype;
  						newntest->argisrow = type_is_rowtype(exprType(relem));
! 						/* flatten nested ROW(), e.g.  ROW(ROW(ROW(NULL))) IS NULL */
! 						if (IsA(relem, RowExpr))
! 							newargs = lappend(newargs,
! 								eval_const_expressions_mutator((Node *)newntest, context));
! 						else
! 							newargs = lappend(newargs, newntest);
  					}
  					/* If all the inputs were constants, result is TRUE */
  					if (newargs == NIL)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to