Re: [HACKERS] When is a record NULL?

2009-07-28 Thread David E . Wheeler

On Jul 27, 2009, at 5:19 PM, David E. Wheeler wrote:

Yep, that's just what I needed, thanks. I think I'll send a patch  
for the Cursors section of the PL/pgSQL documentation that  
mentions this. Would have saved me a bunch of hassle.


So would have reading two more sentences of the docs, which state, to  
my surprise (that I missed it!), “As with `SELECT INTO`, the special  
variable codeFOUND/code can be checked to see whether a row was  
obtained or not.” D’oh!


So now I've got things working the way I need, and I've blogged about  
these SQL pathologies so that I don't forget.


  
http://www.justatheory.com/computers/databases/postgresql/neither-null-nor-not-null.html

Thanks for the help, everyone.

Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-27 Thread Kevin Grittner
Greg Stark gsst...@mit.edu wrote:
 Kevin Grittnerkevin.gritt...@wicourts.gov wrote:
 
 impossible to write two queries which can be shown to be logically
 equivalent but which optimize to different access plans
 
 Personally I think that's a fine goal to aim for.
 
Sure, but from my experience, there aren't any database products which
come close to having an optimizer which can do a good enough job that
it is yet feasible.  I'm not sure I've even seen any which reliably
treat '(a AND b) OR (c and d)' as equivalent to 'NOT ((a OR b) AND (c
OR d))', much less the fairly common 'a  b OR (a = b AND c  d)' into
'a = b AND (a  b OR c  d)'.  Two commonly heard arguments on this
list are that:
 
(1) a particular improvement in this area is not worth it because it
would cost more in CPU time to recognize the equivalence that it would
save in run time from the better plan, and
 
(2) someone sometimes finds it useful to not recognize the equivalence
so that they can coerce a certain type of plan.
 
The latter really *is* a form of optimizer hint, it's just an
undocumented, arcane hint for the Illuminati.
 
But anyway, I didn't say that it was a bad thing toward which to
strive, just that it's so far from realization that as an absolute
requirement to be designated an RDBMS, it's a bit ivory tower.
 
 I'm not sure what to be considered a relational database means
 
In that context I was talking about Codd's book where he provides a
list of properties which a DBMS should have to be considered, in his
opinion (as the inventor of the relational model for database
management) a relational database management system.  It just occurs
to me that I think some of these entries were *required* for his
blessing, and others were just *desirable*.  I don't remember which
designation he gave this point.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-27 Thread Kevin Grittner
Sam Mason s...@samason.me.uk wrote: 
 
 I've heard lots and read a few smaller articles but don't think
 I've got around to any of his books.
 
Having just poked around on the Internet, I think perhaps this was his
only full-fledge book, per se.  The rest of his work appears to have
been papers published in academia or with the ACM.
 
 to be considered a relational database, it must be
 impossible to write two queries which can be shown to be logically
 equivalent but which optimize to different access plans
 
 Sounds as though he's using a different definition than what I would
 use, but I'm sure I'll find out.
 
I think that as the inventor of the relational model for database
management, he felt that things were being done using the name of the
technology which didn't match his vision of it.  This book, and some
of his papers, seem to have been geared toward preserving the
integrity of his vision of RDBMS.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-27 Thread Robert Haas
On Mon, Jul 27, 2009 at 9:48 AM, Kevin
Grittnerkevin.gritt...@wicourts.gov wrote:
 The latter really *is* a form of optimizer hint, it's just an
 undocumented, arcane hint for the Illuminati.

Well said.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-27 Thread David E. Wheeler

On Jul 26, 2009, at 4:02 PM, Eric B. Ridge wrote:

I'm just a random lurker, but FOUND seems to work just fine (I  
suppose it's PG-specific?).


http://www.postgresql.org/docs/8.1/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-DIAGNOSTICS

BEGIN
  OPEN stuff;
  FETCH stuff INTO rec;
  WHILE FOUND LOOP
 RETURN NEXT rec;
 FETCH stuff INTO rec;
  END LOOP;
END;


Yep, that's just what I needed, thanks. I think I'll send a patch for  
the Cursors section of the PL/pgSQL documentation that mentions  
this. Would have saved me a bunch of hassle.


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Kevin Grittner
Sam Mason s...@samason.me.uk wrote: 
 
 I've not read much of his writings, any canonical references for
 this sort of discussion?
 
I think this is the one, although it's been a while since I read it,
and I might be getting it confused with something else he wrote:
 
Codd, E.F. (1990). The Relational Model for Database Management
(Version 2 ed.). Addison Wesley Publishing Company.
ISBN 0-201-14192-2.
 
I believe that he puts forward a list of about 200 things he feels
should be true of a database in order for him to consider it a
relational database.  Since he was first and foremost a mathematician,
and was something of a perfectionist, I don't think some of these are
achievable (at least in the foreseeable future) without tanking
performance, but it makes for an interesting read.  I find most of it
to be on target, and it gives a unique chance to see things from the
perspective of the inventor of relational model for database
management.
 
I don't, of course, agree with him on everything.  If you think that
the SQL standard date handling is weird, wait until you see how a
perfectionist mathematician attempts to deal with it.  :-)  Also, the
requirement that, to be considered a relational database, it must be
impossible to write two queries which can be shown to be logically
equivalent but which optimize to different access plans to be, well, a
bit ivory tower.
 
It appears that the no duplicate rows in a relation rule is to
Codd's relational theory what the speed of light is to relativity.  I
think it is basically a corollary to the rule that each datum must be
addressable by specifying its table name, column name, and some set of
key values which uniquely identify the row.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Greg Stark
On Sun, Jul 26, 2009 at 6:49 PM, Kevin
Grittnerkevin.gritt...@wicourts.gov wrote:
 Also, the
 requirement that, to be considered a relational database, it must be
 impossible to write two queries which can be shown to be logically
 equivalent but which optimize to different access plans to be, well, a
 bit ivory tower.

Personally I think that's a fine goal to aim for. I'm not sure what
to be considered a relational database means but I consider a bug
whenever there's a case where this isn't true. It may be a bug that we
don't have a good solution for or a bug that's too minor for the
amount of effort it would require but it's still not right and if we
found a solution that we were happy with we would definitely want to
fix it.

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Sam Mason
On Sun, Jul 26, 2009 at 12:49:32PM -0500, Kevin Grittner wrote:
 Codd, E.F. (1990). The Relational Model for Database Management
 (Version 2 ed.). Addison Wesley Publishing Company.
 ISBN 0-201-14192-2.

Looks as though I've got some reading to do then--somewhat annoying that
only second hand copies available from the US, but never mind!

 I believe that he puts forward a list of about 200 things he feels
 should be true of a database in order for him to consider it a
 relational database.  Since he was first and foremost a mathematician,
 and was something of a perfectionist, I don't think some of these are
 achievable (at least in the foreseeable future) without tanking
 performance, but it makes for an interesting read.  I find most of it
 to be on target, and it gives a unique chance to see things from the
 perspective of the inventor of relational model for database
 management.

Yup, I've heard lots and read a few smaller articles but don't think
I've got around to any of his books.

 I don't, of course, agree with him on everything.  If you think that
 the SQL standard date handling is weird, wait until you see how a
 perfectionist mathematician attempts to deal with it.  :-)  Also, the
 requirement that, to be considered a relational database, it must be
 impossible to write two queries which can be shown to be logically
 equivalent but which optimize to different access plans to be, well, a
 bit ivory tower.

Sounds as though he's using a different definition than what I would
use, but I'm sure I'll find out.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-26 Thread David E. Wheeler

On Jul 25, 2009, at 4:41 PM, David E. Wheeler wrote:

Useless perhaps, but it's gonna happen, and someone may even have a  
reason for it. Until such time as NULLs are killed off, we need to  
be able to deal with SQL's pathologies.


And something I'd like to be able to handle in a while loop, as I'm  
actually fetching one row at a time from two cursors and need to be  
able to tell when I've reached the end of a cursor. This example  
highlights the issue:


\set QUIET 1
SET client_min_messages = warning;
BEGIN;

CREATE TABLE peeps (
name TEXT NOT NULL,
dob date,
ssn text,
active boolean NOT NULL DEFAULT true
);

INSERT INTO peeps
VALUES ('Tom', '1963-03-23', '123-45-6789', true),
   ('Damian', NULL, NULL, true),
   ('Larry',  NULL, '932-45-3456', true),
   ('Bruce',  '1965-12-31', NULL, true);

CREATE TYPE dobssn AS ( dob date, ssn text );

CREATE FUNCTION using_loop() RETURNS SETOF dobssn LANGUAGE  
plpgsql AS $$

DECLARE
stuff CURSOR FOR SELECT dob, ssn from peeps where active  
ORDER BY name;

BEGIN
FOR rec IN stuff LOOP
RETURN NEXT rec;
END LOOP;
END;
$$;


CREATE FUNCTION using_while() RETURNS SETOF dobssn LANGUAGE  
plpgsql AS $$

DECLARE
stuff CURSOR FOR SELECT dob, ssn from peeps where active  
ORDER BY name;

rec dobssn;
BEGIN
open stuff;
FETCH stuff INTO rec;
WHILE NOT rec IS NULL LOOP
RETURN NEXT rec;
FETCH stuff INTO rec;
END LOOP;
END;
$$;

SELECT * FROM using_loop();
SELECT * FROM using_while();

ROLLBACK;

Output:

dob | ssn
+-
 1965-12-31 |
|
| 932-45-3456
 1963-03-23 | 123-45-6789
(4 rows)

dob | ssn
+-
 1965-12-31 |
(1 row)

So somehow the use of the loop to go right through the cursor can tell  
the difference between a record that's all nulls and the when the end  
of the cursor has been reached. My use of the while loop, however,  
cannot tell the difference, and AFAICT, there is no way to detect the  
difference in SQL. Is that correct? Is there some way to get  
using_while() to properly return all the records?


FYI, using:

WHILE rec IS DISTINCT FROM NULL LOOP

Results in an infinite loop. So does:

WHILE NOT rec IS NOT DISTINCT FROM NULL LOOP

And this, of course:

WHILE rec IS NOT NULL LOOP

Returns no rows at all.

Surely someone has run into this before, no?

Thanks,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Sam Mason
On Sun, Jul 26, 2009 at 03:46:19PM -0700, David E. Wheeler wrote:
 And something I'd like to be able to handle in a while loop, as I'm  
 actually fetching one row at a time from two cursors and need to be  
 able to tell when I've reached the end of a cursor.

I'm sure I'm missing something obvious, but why doesn't the FOUND
magic variable tell you what you want?

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-26 Thread Eric B. Ridge

On Jul 26, 2009, at 6:46 PM, David E. Wheeler wrote:

Is there some way to get using_while() to properly return all the  
records?


I'm just a random lurker, but FOUND seems to work just fine (I suppose  
it's PG-specific?).


http://www.postgresql.org/docs/8.1/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-DIAGNOSTICS

BEGIN
   OPEN stuff;
   FETCH stuff INTO rec;
   WHILE FOUND LOOP
  RETURN NEXT rec;
  FETCH stuff INTO rec;
   END LOOP;
END;

HTH,

eric



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-25 Thread Kevin Grittner
David E. Wheeler da...@kineticode.com wrote: 
 
 So when it gets to that second row in the first cursor, it doesn't  
 know it's a row with NULLs as opposed to an empty row.
 
 there ought to be an easy way to tell the difference. :-(
 
I would have thought that the correct thing is to check SQLSTATE for
'02000'.  I can't see how PostgreSQL allows this, however.  :-(
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-25 Thread Jeff Davis
On Fri, 2009-07-24 at 15:47 -0700, David E. Wheeler wrote:
  WHILE (NOT rec_have IS NULL AND rec_have IS DISTINCT FROM NULL)
 OR (NOT rec_want IS NULL AND rec_want IS DISTINCT FROM NULL)
 

I don't think you want the NOT x IS NULL part at all -- that will
evaluate to false when x = rec(NULL,NULL). I think you just want the x
IS DISTINCT FROM NULL part, right? Will that work?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-25 Thread Kevin Grittner
Sam Mason s...@samason.me.uk wrote: 
 On Fri, Jul 24, 2009 at 12:30:39PM -0500, Kevin Grittner wrote:
 
 In this dichotomy a NULL is most definitely a value and with my
 current experience I don't understand the distinction you're trying
 to draw.
 
There can be a place where a value *could* go which does not contain a
value.  Codd considered it crucial, from a mathematical correctness
point of view, that the absence of a value not be indicated by some
special magic value, but rather by some other technique which
indicates that there *is* no value there.  In SQL this is done with
NULL.  Based on reading his books, it seems to me that Codd always
seemed uncomfortable with this, since it made it appear to be some
special value, which he was adamant that it is *not*.  It seems he
would have preferred a relational language use a term like FLAGGED AS
MISSING rather than IS NULL.  It also would have allowed the
flexibility to differentiate various types of missing values, such as
FLAGGED AS UNKNOWN or FLAGGED AS NOT APPLICABLE.
 
 The distinction between not having a tuple and having a tuple for
 which you don't know any applicable values seems thin.  I'm not
 sure what that would really mean.
 
 Other languages/type systems do define this precisely.
 
Yeah, I've made my living programming for decades, and worked in
dozens of languages, so I know how this is usually done.  I do think
that set logic in relational data involves some slightly different
twists on things than most language have.  I tend, for bettor or
worse, to come down in agreement with the positions Codd espoused on
most of these things.
 
 [PG] ... internally knows there is a distinction
 between the two but it doesn't like to expose this.
 
Well, to some extent I think it's a tough problem, since the set logic
of a relational database is implemented in C, which doesn't have the
same concepts.  There's got to be a little slight of hand in there
somewhere.
 
 If your model is correct then when the IS DISTINCT FROM operator
 works on RECORDs the following should return FALSE for all of the
 following:
 
   SELECT NULL  IS DISTINCT FROM ROW(NULL);
   SELECT NULL  IS DISTINCT FROM ROW(NULL,NULL);
   SELECT NULL  IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));
   SELECT ROW(NULL) IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));
 
 i.e. there is *no* difference between a NULL record and a record
 consisting entirely of NULLs.
 
Well, on that I would go with whatever the SQL standard says, and hope
it's not too ambiguous.  (I haven't tried to sort though this one in
the standard, so far.)  I was going into the theory both because it is
the basis for some of the seemingly odd aspects of SQL, and because at
least half the time I see someone put the word NULL immediately in
front of the word VALUE, they are wandering into confusion on these
issues.  (I will admit that using such technically incorrect language
is sometimes hard to avoid without sounding stilted, even if all
parties to the conversation know that NULL is *not* a value.)
 
I know that Codd was insistent that any relation (which included the
result of any query) which could contain duplicate rows should be
called a corrupted relation.  (In fact, in one of his books I think
he averaged a comment on this point about once every two pages.)  So I
shudder to think what his reaction would be to a relation with a row
which contained no values.  I have a really hard time figuring out
what useful information such a row could represent.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-25 Thread David E. Wheeler

On Jul 25, 2009, at 9:24 AM, Jeff Davis wrote:


I don't think you want the NOT x IS NULL part at all -- that will
evaluate to false when x = rec(NULL,NULL). I think you just want the  
x

IS DISTINCT FROM NULL part, right? Will that work?


Nope, infinite loop when because `ROW(null, null)` and `ROW()` are  
both distinct from null:


try=# select row(null, null) is distinct from null, row() is distinct  
from null;

 ?column? | ?column?
--+--
 t| t
(1 row)

So I still can't tell when I've exhausted a cursor.

Best,

David



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-25 Thread David E. Wheeler

On Jul 25, 2009, at 9:42 AM, Kevin Grittner wrote:


I know that Codd was insistent that any relation (which included the
result of any query) which could contain duplicate rows should be
called a corrupted relation.  (In fact, in one of his books I think
he averaged a comment on this point about once every two pages.)  So I
shudder to think what his reaction would be to a relation with a row
which contained no values.  I have a really hard time figuring out
what useful information such a row could represent.


I agree that it's pathological, but it's clearly allowed by SQL, so we  
need to be able to deal with it effectively. Intuitively would be  
nice, but effectively will do.


Consider:

CREATE TABLE peeps (
name TEXT NOT NULL,
dob date,
ssn text,
active boolean NOT NULL DEFAULT true
);

INSERT INTO peeps
VALUES ('Tom', '1963-03-23', '123-45-6789', true),
   ('Damian', NULL, NULL, true),
   ('Larry',  NULL, '932-45-3456', true),
   ('Bruce',  '1965-12-31', NULL, true);

% SELECT dob, ssn from peeps where active;
dob | ssn
+-
 1963-03-23 | 123-45-6789
 [null] | [null]
 [null] | 932-45-3456
 1965-12-31 | [null]

Useless perhaps, but it's gonna happen, and someone may even have a  
reason for it. Until such time as NULLs are killed off, we need to be  
able to deal with SQL's pathologies.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-25 Thread Sam Mason
On Sat, Jul 25, 2009 at 11:42:04AM -0500, Kevin Grittner wrote:
 Sam Mason s...@samason.me.uk wrote: 
  In this dichotomy a NULL is most definitely a value and with my
  current experience I don't understand the distinction you're trying
  to draw.
  
 There can be a place where a value *could* go which does not contain a
 value.  Codd considered it crucial, from a mathematical correctness
 point of view, that the absence of a value not be indicated by some
 special magic value, but rather by some other technique which
 indicates that there *is* no value there.  In SQL this is done with
 NULL.  Based on reading his books, it seems to me that Codd always
 seemed uncomfortable with this, since it made it appear to be some
 special value, which he was adamant that it is *not*.

Have you used more academic languages like ML or Haskell?  Their
option and Maybe types, respectively, provide the nicest practical
treatment I've seen of this.

 It seems he
 would have preferred a relational language use a term like FLAGGED AS
 MISSING rather than IS NULL.  It also would have allowed the
 flexibility to differentiate various types of missing values, such as
 FLAGGED AS UNKNOWN or FLAGGED AS NOT APPLICABLE.

Those sorts of lists get domain specific *very* quickly.  I think
languages are better off providing a type system of enough complexity to
express maybe types and letting users invent whatever is most useful for
the job at hand.

 I do think
 that set logic in relational data involves some slightly different
 twists on things than most language have.

You must live in a very different world from me then! :)

 I tend, for bettor or
 worse, to come down in agreement with the positions Codd espoused on
 most of these things.

I've not read much of his writings, any canonical references for this
sort of discussion?

  [PG] ... internally knows there is a distinction
  between the two but it doesn't like to expose this.
  
 Well, to some extent I think it's a tough problem, since the set logic
 of a relational database is implemented in C, which doesn't have the
 same concepts.  There's got to be a little slight of hand in there
 somewhere.

That's a pretty bad excuse; everything ends up as machine code
in the end.  Many languages expose very abstract and consistent
views of things, some of them a lot more awkward than that of a
relational database.  PG is admittedly hampered by a desire to follow a
particularly innovative standard and correctness for PG commonly means
keeping data safe.  Being logically consistent normally takes a minor
role, although there is quite a lot of overlap between the two.

  If your model is correct then when the IS DISTINCT FROM operator
  works on RECORDs the following should return FALSE for all of the
  following:
  
SELECT NULL  IS DISTINCT FROM ROW(NULL);
SELECT NULL  IS DISTINCT FROM ROW(NULL,NULL);
SELECT NULL  IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));
SELECT ROW(NULL) IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));
  
  i.e. there is *no* difference between a NULL record and a record
  consisting entirely of NULLs.
  
 Well, on that I would go with whatever the SQL standard says, and hope
 it's not too ambiguous.  (I haven't tried to sort though this one in
 the standard, so far.)

I've had a reasonable browse around an old copy of SQL'08 I found, but
couldn't find much helpful.  The definition of distinct (3.1.6.8 in my
copy) seems particularly vacuous, it would seem to leave it down to how
rows are constructed and I can't find many details of that.

 I was going into the theory both because it is
 the basis for some of the seemingly odd aspects of SQL, and because at
 least half the time I see someone put the word NULL immediately in
 front of the word VALUE, they are wandering into confusion on these
 issues.  (I will admit that using such technically incorrect language
 is sometimes hard to avoid without sounding stilted, even if all
 parties to the conversation know that NULL is *not* a value.)

I think that depends on what definition of value you're using.  If
you're considering it to be a value like a NULL pointer then I'd agree
as this is just a convention to treat pointers with a value of zero
specially.  If you treat values as members of a set with the set defined
as their type and a sub-type relation existing between types then a NULL
value is the only member of an unnamed type (in SQL) that's the subtype
of all other types.  There are other ways of formalizing this, and I've
probably explained it badly here, but it's a rough sketch of how I think
about it.

 I know that Codd was insistent that any relation (which included the
 result of any query) which could contain duplicate rows should be
 called a corrupted relation.  (In fact, in one of his books I think
 he averaged a comment on this point about once every two pages.)  So I
 shudder to think what his reaction would be to a relation with a row
 which contained no values.  I have a really 

Re: [HACKERS] When is a record NULL?

2009-07-24 Thread David E. Wheeler

On Jul 23, 2009, at 9:34 PM, Brendan Jurd wrote:


Well, a ROW is an ordered set of values, each one of which may be
either NULL or NOT NULL.


Right.


It doesn't really make sense to talk about the ROW itself being NULL
or NOT NULL, only its member values (but for extra confusion, contrast
with the treatment of arrays, which can themselves be NULL).


Well then maybe a record (row) should *never* be null.


It does make sense, however, to talk about the ROW's member values
being entirely NULL or entirely NOT NULL, and that's what the IS NULL
and IS NOT NULL tests tell you about.


Ah! So that's where the three-valued logic comes in to play with  
records:


try=#   SELECT ROW(1, NULL) IS NULL, ROW (1, 1) IS NULL, ROW(NULL,  
NULL) IS NULL;

 ?column? | ?column? | ?column?
--+--+--
 f| f| t


I guess the spec authors figured they might as well make IS [NOT] NULL
do something useful when applied to a row rather than throwing an
error.  I tend to agree.


Frankly, I find the state where a record with a NULL and a not-null  
value being neither NULL nor not NULL bizarre.



I hope that provides some clarity.


It's useful to learn that `ROW(NULL, NULL)` is NULL, but I find the  
whole thing totally bizarre. Is it me?


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Richard Huxton

David E. Wheeler wrote:

On Jul 23, 2009, at 9:34 PM, Brendan Jurd wrote:



I guess the spec authors figured they might as well make IS [NOT] NULL
do something useful when applied to a row rather than throwing an
error.  I tend to agree.


Frankly, I find the state where a record with a NULL and a not-null 
value being neither NULL nor not NULL bizarre.


I'm guessing the justification (and presumably this was worked out based 
on the behaviour of one or more of the big DB providers and then 
justified afterwards) is that the composite is partially unknown. Of 
course you should either introduce a new code or throw an error, but 
four-valued logic isn't going to win you any friends.


If the argument *is* that because you know part of the overall value the 
composite isn't null then I'd argue that ('abc' || null) isn't null 
either. After all, the first three characters are perfectly well 
established.



I hope that provides some clarity.


It's useful to learn that `ROW(NULL, NULL)` is NULL, but I find the 
whole thing totally bizarre. Is it me?


Yes, just you. None of the rest of us have any problems with this at all :-)

--
  Richard Huxton
  Archonet Ltd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Brendan Jurd
2009/7/24 David E. Wheeler da...@kineticode.com:
 It's useful to learn that `ROW(NULL, NULL)` is NULL, but I find the whole
 thing totally bizarre. Is it me?


*shrug* The IS [NOT] NULL tests mean something different when applied
to a ROW than they do when applied to a scalar value or an array.

SELECT 1 IS NULL
  = means is this scalar set to the special value NULL?.

SELECT ROW(1, 2) IS NULL
  = means are all the member values of this row set to the special
value NULL?

So it is wrong to talk about ROW(NULL, NULL) being NULL.  It doesn't
have the property of being NULL or not NULL, because it is a composite
value.  ROW(NULL, NULL) IS NULL returns true, but that is not the
same as saying that it actually is NULL, because of the different
semantics above.

It's slightly different semantics from what you get with ordinary
scalar values, but that is cognisant with the fact that composites are
fundamentally different things from ordinary scalar values.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Michael Gould
Does this also apply at the column level?  

  In Postgres is   If column IS NOT NULL or If column IS NULL?  Do all
columns in the record have to be NULL to have IF column IS NULL = true?

Best Regards
--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Tom Lane
Brendan Jurd dire...@gmail.com writes:
 So it is wrong to talk about ROW(NULL, NULL) being NULL.  It doesn't
 have the property of being NULL or not NULL, because it is a composite
 value.  ROW(NULL, NULL) IS NULL returns true, but that is not the
 same as saying that it actually is NULL, because of the different
 semantics above.

It's worse than that, because there actually is also such a thing as
the row value being NULL --- ie, there's no row structure at all.
At least internally, that's a completely different thing from having
a row all of whose fields are null.

SQL doesn't provide a test for this case that's separate from the test
involving null-ness of individual fields.  Not much we can do about
it though.  I'm not entirely sure that exposing the distinction would
be helpful anyway ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Joshua Tolley
On Thu, Jul 23, 2009 at 06:46:25PM -0700, David E. Wheeler wrote:
 Yes, but given that the standard says that `ROW(1, NULL)` is NULL, then I 
 would expect it to be NOT DISTINCT from `ROW(2, NULL)`.

Wait, didn't we decide upthread that the standard said ROW(1, NULL) isn't
NULL?

(From Tom):
 This is per SQL standard.  IS NULL is true if *all* the record's  

 fields are null; IS NOT NULL is true if *none* of them are.   


--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Tom Lane
Joshua Tolley eggyk...@gmail.com writes:
 On Thu, Jul 23, 2009 at 06:46:25PM -0700, David E. Wheeler 
 Yes, but given that the standard says that `ROW(1, NULL)` is NULL, then I 
 would expect it to be NOT DISTINCT from `ROW(2, NULL)`.

 Wait, didn't we decide upthread that the standard said ROW(1, NULL) isn't
 NULL?

David misspoke in the quoted statement, as I believe he figured out soon
thereafter.  For that row value, neither IS NULL nor IS NOT NULL will
return true.  The spec defines them in such a way that they are not inverses
for row values.

SQL2008 points out:

NOTE 219 - For all R, R IS NOT NULL has the same result as
NOT R IS NULL if and only if R is of degree 1. Table 14,
null predicate semantics, specifies this behavior.

That table looks like this:

   R ISR IS NOT  NOT R IS  NOT R IS NOT
 _Expression___NULLNULL__NULL__NULL_

| degree 1: null | true_ | false_  | false_ |  true_   |
||   | ||  |
| degree 1: not  | false_| true_   | true_  |  false_  |
  null

| degree  1:| true_ | false_  | false_ |  true_   |
| all null   |   | ||  |
||   | ||  |
| degree  1:| false_| false_  | true_  |  true_   |
| some null  |   | ||  |
||   | ||  |
| degree  1:| false_| true_   | true_  |  false_  |
|_none_null__|___|_||__|


regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Jeff Davis
On Fri, 2009-07-24 at 09:31 -0400, Tom Lane wrote:
 Brendan Jurd dire...@gmail.com writes:
 It's worse than that, because there actually is also such a thing as
 the row value being NULL --- ie, there's no row structure at all.
 At least internally, that's a completely different thing from having
 a row all of whose fields are null.

Here is an example:
  select * from (values (row(NULL)), (null)) v;

But it's hard to do much useful with row values once you have them. You
can't even count them:
  select count(column1) from (values (row(NULL)), (null)) v;
  ERROR:  record type has not been registered

 SQL doesn't provide a test for this case that's separate from the test
 involving null-ness of individual fields.  Not much we can do about
 it though.  I'm not entirely sure that exposing the distinction would
 be helpful anyway ...

There's no single test, but you can see if it's an empty row with:
  x IS NULL AND x IS DISTINCT FROM NULL
And you can see if it's a real NULL by:
  x IS NULL AND x IS NOT DISTINCT FROM NULL

It's funny until you try it -- then it's just scary.

Regards,
Jeff Davis 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Sam Mason
On Fri, Jul 24, 2009 at 09:31:13AM -0400, Tom Lane wrote:
 Brendan Jurd dire...@gmail.com writes:
  So it is wrong to talk about ROW(NULL, NULL) being NULL.  It doesn't
  have the property of being NULL or not NULL, because it is a composite
  value.  ROW(NULL, NULL) IS NULL returns true, but that is not the
  same as saying that it actually is NULL, because of the different
  semantics above.
 
 It's worse than that, because there actually is also such a thing as
 the row value being NULL --- ie, there's no row structure at all.
 At least internally, that's a completely different thing from having
 a row all of whose fields are null.

Hope nobody minds if I go off on a somewhat pedagogic bent here!

Not quite sure what you mean by the above; but I think maybe something
like:

  SELECT NULL::RECORD AS r;

PG may well treat this internally as a special case, but from a type
level I don't see any difference between the above and, say:

  SELECT ROW(1)::RECORD AS r;

In both cases we get a result that has exactly one column and this
column is of type RECORD (or ROW depending how you spell things).
The fact that it happens to be a NULL *value* in one case shouldn't
affect things at the level of *types*--unless PG has suddenly become
dependently-typed which I don't believe it wants to be.

I'm also aware that PG's handling of types with ROW values is somewhat
inconsistent when compared to other values, for example:

  SELECT (r).a
  FROM (
SELECT a,b
FROM (VALUES
  (1,2),
  (2,3)) x(a,b)) r;

here, we can look inside the RECORD named by r and pull out the value
associated with attribute a, but inside:

  SELECT (r).a
  FROM (VALUES
(ROW(1,2)),
(ROW(2,3))) x(r);

we get a message saying that the record type has not been registered
when I'd expect to get an error saying that it doesn't know which
attribute a is.  We also fail to get an error in the following case:

SELECT r
FROM (VALUES
  (ROW(1,2)),
  (ROW('a','b'))) x(r);

which (to me) seems wrong.  The (allegedly) static types in PG appear
dynamic when it comes to RECORDs.


I'd be interested in fixing this behavior but every time I've started
trying to fix this I've tended to get horribly lost in the code.

 SQL doesn't provide a test for this case that's separate from the test
 involving null-ness of individual fields.  Not much we can do about
 it though.  I'm not entirely sure that exposing the distinction would
 be helpful anyway ...

I think it would; I tend to write the following and have just realized
that it doesn't do what I thought it did:

  SELECT a.*
  FROM tbla a
LEFT JOIN tblb b ON a.id = b.id
  WHERE b IS NULL;

The intuition being that the row valued b would only be considered to
be NULL (i.e. IS NULL returned TRUE) when the LEFT JOIN failed.  The
standard way to write this is of course to write WHERE b.id IS NULL,
but I'm unsure why it's necessary to look inside the record b to get
out attribute id to see if it's NULL when it should just be possible
to look at b directly.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Kevin Grittner
Sam Mason s...@samason.me.uk wrote:
 On Fri, Jul 24, 2009 at 09:31:13AM -0400, Tom Lane wrote:
 
 Hope nobody minds if I go off on a somewhat pedagogic bent here!
 
Not as long as you don't mind replies in kind.   ;-)
 
 The fact that it happens to be a NULL *value* in one case
 
Well, according to Codd (and I tend to go with him on this) there is
no such thing.  NULL is a way to flag a place where a value could be
stored, but is not -- because is unknown or is not applicable in that
context.  (He seemed to feel it was a big weakness of SQL that it
didn't differentiate between these two conditions, but that's another
argument.)  NULL value is an oxymoron.
 
 SQL doesn't provide a test for this case that's separate from the
 test involving null-ness of individual fields.  Not much we can do
 about it though.  I'm not entirely sure that exposing the
 distinction would be helpful anyway ...
 
 I think it would
 
The distinction between not having a tuple and having a tuple for
which you don't know any applicable values seems thin.  I'm not sure
what that would really mean.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Merlin Moncure
On Fri, Jul 24, 2009 at 5:15 AM, David E. Wheelerda...@kineticode.com wrote:
 On Jul 23, 2009, at 9:34 PM, Brendan Jurd wrote:

 Well, a ROW is an ordered set of values, each one of which may be
 either NULL or NOT NULL.

 Right.

 It doesn't really make sense to talk about the ROW itself being NULL
 or NOT NULL, only its member values (but for extra confusion, contrast
 with the treatment of arrays, which can themselves be NULL).

 Well then maybe a record (row) should *never* be null.

I disagree, and I think our current way of treating things is
incorrect (although harmless).  I rowtype can be null:

select null::somerowtype;

I think the following should _not_ return true:
select (null, null)::somerowtype is null;

The reasoning being that while the rowtype members are null, the
record variable itself is not; these are two distinct cases and should
be checked for and treated differently.

Another line of reasoning for this is that if something gives 'true'
for the is null operator, it should behave as null does, giving null
for any operations on it and giving null for STRICT functions, to give
a couple of examples.

create table foo (a int, b int);
select (null, null)::foo is null;
 ?column?
--
 t

create or replace function doit(foo) returns void as $$ begin raise
notice '!'; end; $$ language plpgsql strict;

select doit(row(null, null)::foo);
NOTICE:  !  -- what???!?

I think this is wrong, and if the sql standard sez it is so, then the
standard is wrong :-).

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Merlin Moncure
On Fri, Jul 24, 2009 at 1:35 PM, Merlin Moncuremmonc...@gmail.com wrote:
 On Fri, Jul 24, 2009 at 5:15 AM, David E. Wheelerda...@kineticode.com wrote:
 On Jul 23, 2009, at 9:34 PM, Brendan Jurd wrote:

 Well, a ROW is an ordered set of values, each one of which may be
 either NULL or NOT NULL.

 Right.

 It doesn't really make sense to talk about the ROW itself being NULL
 or NOT NULL, only its member values (but for extra confusion, contrast
 with the treatment of arrays, which can themselves be NULL).

 Well then maybe a record (row) should *never* be null.

 I disagree, and I think our current way of treating things is
 incorrect (although harmless).  I rowtype can be null:

 select null::somerowtype;

 I think the following should _not_ return true:
 select (null, null)::somerowtype is null;

 The reasoning being that while the rowtype members are null, the
 record variable itself is not; these are two distinct cases and should
 be checked for and treated differently.

 Another line of reasoning for this is that if something gives 'true'
 for the is null operator, it should behave as null does, giving null
 for any operations on it and giving null for STRICT functions, to give
 a couple of examples.

 create table foo (a int, b int);
 select (null, null)::foo is null;
  ?column?
 --
  t

 create or replace function doit(foo) returns void as $$ begin raise
 notice '!'; end; $$ language plpgsql strict;

 select doit(row(null, null)::foo);
 NOTICE:  !  -- what???!?

 I think this is wrong, and if the sql standard sez it is so, then the
 standard is wrong :-).

Just to clarify...I think this is the right behavior to run the
function that way, but (null,null) is null == true is not correct.
There are not 'degrees' of null...something is either null or it
isn't.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 I think the following should _not_ return true:
 select (null, null)::somerowtype is null;

Take it up with the SQL standards committee.  The spec is 100% not
ambiguous about this.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 There's no single test, but you can see if it's an empty row with:
   x IS NULL AND x IS DISTINCT FROM NULL
 And you can see if it's a real NULL by:
   x IS NULL AND x IS NOT DISTINCT FROM NULL

Hmmm ... that suggests that we may not be handling IS DISTINCT FROM
correctly for rowtypes.  I haven't read that part of the spec, but
I would've guessed that the spec wants it to be consistent with
IS NULL.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread David E. Wheeler

On Jul 24, 2009, at 3:17, Brendan Jurd dire...@gmail.com wrote:


 ROW(NULL, NULL) IS NULL returns true, but that is not the
same as saying that it actually is NULL


I think my head just exploded. Loving the dark corners of SQL…

Best,

David
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread David E. Wheeler

On Jul 24, 2009, at 11:10, Tom Lane t...@sss.pgh.pa.us wrote:


Hmmm ... that suggests that we may not be handling IS DISTINCT FROM
correctly for rowtypes.  I haven't read that part of the spec, but
I would've guessed that the spec wants it to be consistent with
IS NULL.


Yes, that's precisely what I was trying to get at last night.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Jeff Davis
On Fri, 2009-07-24 at 13:35 -0400, Merlin Moncure wrote:
 I think the following should _not_ return true:
 select (null, null)::somerowtype is null;
 
 The reasoning being that

It's not at all clear to me that you can reason effectively about SQL
logic semantics. Upon which laws are you reasoning? Certainly not 2VL
boolean logic.

The standard is what it is. If it says that some NULLs are red and some
NULLs are blue, then we'd probably support it.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Jeff Davis
On Fri, 2009-07-24 at 14:10 -0400, Tom Lane wrote:
 Hmmm ... that suggests that we may not be handling IS DISTINCT FROM
 correctly for rowtypes.  I haven't read that part of the spec, but
 I would've guessed that the spec wants it to be consistent with
 IS NULL.
 

Our behavior appears to match the standard, which covers this case in
8.15.General Rules.1.c.iv.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Merlin Moncure
On Fri, Jul 24, 2009 at 2:05 PM, Tom Lanet...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 I think the following should _not_ return true:
 select (null, null)::somerowtype is null;

 Take it up with the SQL standards committee.  The spec is 100% not
 ambiguous about this.

I'm not proposing to change the current behavior...just playing
devil's advocate here.  Still, why:

*) is select null::foo is distinct from (null, null)::foo; true?
(null should not be distinct from null)

*) does (null, null) allow STRICT functions to execute?

*) is (null, null)::text not null?

The upshot of this is that 'is null' really means 'is approximately
null', or our handling of the three above cases is wrong.  Or put
another way, I'd like a clarification:

is (null, null) really null (in which case I'd say our handling of it
is out of standard), or just a special case for the IS NULL operator,
so that it returns null but in all other respect not null?  ISTM you
can't have it both ways.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Greg Stark
On Fri, Jul 24, 2009 at 7:49 PM, Merlin Moncuremmonc...@gmail.com wrote:
 Still, why:

 *) is select null::foo is distinct from (null, null)::foo; true?
 (null should not be distinct from null)

 *) does (null, null) allow STRICT functions to execute?

 *) is (null, null)::text not null?

These are all good questions. Are you interested in checking what the
spec says should happen in these cases?


-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Merlin Moncure
On Fri, Jul 24, 2009 at 4:29 PM, Greg Starkgsst...@mit.edu wrote:
 On Fri, Jul 24, 2009 at 7:49 PM, Merlin Moncuremmonc...@gmail.com wrote:
 Still, why:

 *) is select null::foo is distinct from (null, null)::foo; true?
 (null should not be distinct from null)

 *) does (null, null) allow STRICT functions to execute?

 *) is (null, null)::text not null?

 These are all good questions. Are you interested in checking what the
 spec says should happen in these cases?

I don't have a copy :-).  (Am I being obtuse, and not noticing
something obvious?)

I think though that one of three cases is true:

*) postgresql is not treating (null, null) as null except in the case
where checked with 'is null'.
*) postgresql is following spec, which basically contradicts itself.
*) the behavior is undefined

If the first case is true (i bet it is), then many things we do with
composite types are wrong on some level, seriously in some cases.  for
example:

select count(rowtype) from foo;
select distinct rowtype from foo;
select * from foo join bar using (rowtype);

can all give the wrong answer.

regardless, I think the sql standard definition of null for rowtypes
is lame -- the way null works for arrays is much better, in that the
elements inside the array an be null, independently can the array
itself.

I like the current behavior of composites (all composite fields being
null is a pretty rare case in practice), and would argue against
bringing the database closer to spec if it turns out that we are doing
it incorrectly.

merin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread David E. Wheeler

On Jul 24, 2009, at 9:40 AM, Jeff Davis wrote:


There's no single test, but you can see if it's an empty row with:
 x IS NULL AND x IS DISTINCT FROM NULL
And you can see if it's a real NULL by:
 x IS NULL AND x IS NOT DISTINCT FROM NULL

It's funny until you try it -- then it's just scary.


Amen to that.

So here's what I'm doing, essentially (some error handling removed for  
clarity):


FETCH have INTO rec_have;
FETCH want INTO rec_want;
WHILE NOT rec_have IS NULL OR NOT rec_want IS NULL LOOP
IF rec_have IS DISTINCT FROM rec_want THEN
RETURN false;
END IF;
rownum = rownum + 1;
FETCH have INTO rec_have;
FETCH want INTO rec_want;
END LOOP;
RETURN true;

So far this seems to work for the tests I've thrown at it, telling me  
when two cursors return results that are row-by-row equivalent,  
including when columns have the NULLs, though I've not yet tried rows  
that are nothing but nulls. That's probably not very useful, but it is  
possible.


Does this look like it's a reasonable implementation for what I'm  
testing? Have I missed anything in the swirl of the bizarre that this  
thread has triggered?


Thanks,

David




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread David E . Wheeler

On Jul 24, 2009, at 11:37 AM, Jeff Davis wrote:


Hmmm ... that suggests that we may not be handling IS DISTINCT FROM
correctly for rowtypes.  I haven't read that part of the spec, but
I would've guessed that the spec wants it to be consistent with
IS NULL.


Our behavior appears to match the standard, which covers this case in
8.15.General Rules.1.c.iv.


Good, because what I'm doing is comparing rows returned by two  
different cursors using IS DISTINCT FROM, and what I need the  
comparison to do is to ensure that two rows are not equivalent  
(distinct) unless all of their values are the same, column to column,  
including NULLs.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread Sam Mason
On Fri, Jul 24, 2009 at 12:30:39PM -0500, Kevin Grittner wrote:
 Sam Mason s...@samason.me.uk wrote:
  The fact that it happens to be a NULL *value* in one case
  
 Well, according to Codd (and I tend to go with him on this) there is
 no such thing.  NULL is a way to flag a place where a value could be
 stored, but is not -- because is unknown or is not applicable in that
 context.  (He seemed to feel it was a big weakness of SQL that it
 didn't differentiate between these two conditions, but that's another
 argument.)  NULL value is an oxymoron.

I think then maybe we're talking about different things; I was trying
to draw attention to the distinction between types and values---types
allow some invariants of the code to be automatically checked before it
is run, a value only has meaning at run time with the set of possible
values an expression is defined over being constrained by its type.

In this dichotomy a NULL is most definitely a value and with my current
experience I don't understand the distinction you're trying to draw.

  SQL doesn't provide a test for this case that's separate from the
  test involving null-ness of individual fields.  Not much we can do
  about it though.  I'm not entirely sure that exposing the
  distinction would be helpful anyway ...
  
  I think it would
  
 The distinction between not having a tuple and having a tuple for
 which you don't know any applicable values seems thin.  I'm not sure
 what that would really mean.

Other languages/type systems do define this precisely.  For example,
in object orientated languages there's a big difference between a
reference to an object being NULL and some member of an object being
NULL.  Databases obviously have their own semantics, but the distinction
is well defined.

Any implementation that tries to be faithful to a standard has its hands
somewhat tied and PG is no exception.  PG currently seems to be some
hybrid half way between, it internally knows there is a distinction
between the two but it doesn't like to expose this.  For example (and
this appears particularly awkward because of annoying limitations in the
syntax PG accepts):

  SELECT y
  FROM (SELECT 1) x(a)
LEFT JOIN (SELECT 1,2) y(a,b) ON FALSE;

I think it should be valid to express this as:
  SELECT (SELECT 1,2 WHERE FALSE);
but PG doesn't like sub-queries returning two columns--but this is
material for another discussion.

This returns a single row whose only attribute is NULL (i.e. it's
rendered as '') and not as a record whose attributes are all NULL (i.e.
rendered as '(,)'). The fact PG does the former says that your mental
model isn't congruent with PGs behavior.

If your model is correct then when the IS DISTINCT FROM operator works
on RECORDs the following should return FALSE for all of the following:

  SELECT NULL  IS DISTINCT FROM ROW(NULL);
  SELECT NULL  IS DISTINCT FROM ROW(NULL,NULL);
  SELECT NULL  IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));
  SELECT ROW(NULL) IS DISTINCT FROM ROW(NULL,ROW(NULL,NULL));

i.e. there is *no* difference between a NULL record and a record
consisting entirely of NULLs.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-24 Thread David E. Wheeler

On Jul 24, 2009, at 2:59 PM, David E. Wheeler wrote:


   FETCH have INTO rec_have;
   FETCH want INTO rec_want;
   WHILE NOT rec_have IS NULL OR NOT rec_want IS NULL LOOP
   IF rec_have IS DISTINCT FROM rec_want THEN
   RETURN false;
   END IF;
   rownum = rownum + 1;
   FETCH have INTO rec_have;
   FETCH want INTO rec_want;
   END LOOP;
   RETURN true;


Bah. It fails to do what I want when I pass cursors that return:

VALUES (NULL, NULL), (NULL, NULL)
VALUES (NULL, NULL)

So when it gets to that second row in the first cursor, it doesn't  
know it's a row with NULLs as opposed to an empty row. So this bit:


   WHILE NOT rec_have IS NULL OR NOT rec_want IS NULL LOOP

Obviously isn't detecting the difference. I tried

WHILE (NOT rec_have IS NULL AND rec_have IS DISTINCT FROM NULL)
   OR (NOT rec_want IS NULL AND rec_want IS DISTINCT FROM NULL)

and

WHILE (NOT rec_have IS NULL AND NOT rec_have IS NOT DISTINCT FROM  
NULL)
   OR (NOT rec_want IS NULL AND NOT rec_want IS NOT DISTINCT FROM  
NULL)


But they didn't work, either.

There's got to be a way to do this; better, there ought to be an easy  
way to tell the difference. :-(


Thanks,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

How can a record be neither NULL or NOT NULL?

try=# select ROW(1, NULL) IS NULL;
 ?column?
--
 f
(1 row)

try=# select ROW(1, NULL) IS NOT NULL;
 ?column?
--
 f
(1 row)

This makes it rather hard to tell, in PL/pgSQL, when I've fetched the  
last record from a cursor…


Best,

David
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 3:19 PM, David E. Wheeler wrote:


How can a record be neither NULL or NOT NULL?

   try=# select ROW(1, NULL) IS NULL;
?column?
   --
f
   (1 row)

   try=# select ROW(1, NULL) IS NOT NULL;
?column?
   --
f
   (1 row)

This makes it rather hard to tell, in PL/pgSQL, when I've fetched  
the last record from a cursor…


Also:

select ROW(1, NULL) IS DISTINCT FROM ROW(2, NULL);
 ?column?
--
 t

As expected, but the IS NULL above is not expected (by this bunny,  
anyway).


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread Jeff Davis
On Thu, 2009-07-23 at 15:19 -0700, David E. Wheeler wrote:
 How can a record be neither NULL or NOT NULL?

You could do:

not ROW(1, NULL) is null and not ROW(1, NULL) is not null

I don't know what the SQL standard says about this.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 4:08 PM, Tom Lane wrote:


This is per SQL standard.  IS NULL is true if *all* the record's
fields are null; IS NOT NULL is true if *none* of them are.

Yeah, it's a bit dubious, but that's what they said.


And yet they're DISTINCT FROM each other if either or both contain  
NULLs and some other values? It seems to me that, to be consistent, it  
should be:


select ROW(1, NULL) IS DISTINCT FROM ROW(2, NULL);
?column?
--
f

No?

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 How can a record be neither NULL or NOT NULL?

This is per SQL standard.  IS NULL is true if *all* the record's
fields are null; IS NOT NULL is true if *none* of them are.

Yeah, it's a bit dubious, but that's what they said.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 And yet they're DISTINCT FROM each other if either or both contain  
 NULLs and some other values?

Well, that would depend on what the values were and in what columns...

 It seems to me that, to be consistent, it  
 should be:

  select ROW(1, NULL) IS DISTINCT FROM ROW(2, NULL);
  ?column?
  --
  f

Dunno how you can possibly come to that conclusion.  Those row
values are certainly distinct (according to both PG and the spec).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 3:33 PM, Jeff Davis wrote:


not ROW(1, NULL) is null and not ROW(1, NULL) is not null

I don't know what the SQL standard says about this


Thanks Jeff, that's just what I needed.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 4:22 PM, Tom Lane wrote:


Dunno how you can possibly come to that conclusion.  Those row
values are certainly distinct (according to both PG and the spec).


Are they not both null?

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread Greg Stark
On Fri, Jul 24, 2009 at 1:32 AM, David E. Wheelerda...@kineticode.com wrote:
 On Jul 23, 2009, at 4:22 PM, Tom Lane wrote:

 Dunno how you can possibly come to that conclusion.  Those row
 values are certainly distinct (according to both PG and the spec).

 Are they not both null?

Isn't that just what you were complaining about not being the case:

On Thu, Jul 23, 2009 at 11:29 PM, David E. Wheelerda...@kineticode.com wrote:
   try=# select ROW(1, NULL) IS NULL;
?column?
   --
f
   (1 row)

-- 
greg
http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread Jeff Davis
On Thu, 2009-07-23 at 17:32 -0700, David E. Wheeler wrote:
 On Jul 23, 2009, at 4:22 PM, Tom Lane wrote:
 
  Dunno how you can possibly come to that conclusion.  Those row
  values are certainly distinct (according to both PG and the spec).
 
 Are they not both null?

[ Is that a play on words? ;) ]

Here's the logical breakdown of your argument:

x = ROW(1, NULL)

0. x IS NOT NULL = false
1. =  NOT x IS NOT NULL = true
2. =  NOT NOT x IS NULL = true
3. =  x IS NULL = true

However, in step 2, you transformed:
  x IS NOT NULL = NOT x IS NULL

But in SQL that is not a tautology!

I don't think it's wise to assume SQL is consistent. I think it is
possible to create a consistent 3VL system, but you have to give up some
other very important property. I can't remember the details at the
moment, but there's an interesting proof in Logic and Databases by
C.J. Date.

Does the SQL spec lay out the tautologies anywhere, so that we can get a
clear picture of what's going on with NULLs? I won't make the claim that
SQL is inconsistent without actually seeing the system as a whole, but,
at a minimum, many of the tautologies that people are accustomed to are
not true in SQL.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 6:21 PM, Greg Stark wrote:


Are they not both null?


Isn't that just what you were complaining about not being the case:


Yes, but given that the standard says that `ROW(1, NULL)` is NULL,  
then I would expect it to be NOT DISTINCT from `ROW(2, NULL)`.


Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 6:06 PM, Jeff Davis wrote:


However, in step 2, you transformed:
 x IS NOT NULL = NOT x IS NULL

But in SQL that is not a tautology!


No, that's not the problem I see -- that solved the problem in my  
particular code. The problem I see is that, given that the standard  
says (according to Tom) that if any value is NULL then the record is  
NULL, then I would expect this to return false:


SELECT ROW(1, NULL) IS DISTINCT FROM ROW(2, NULL);

But it returns true.


I don't think it's wise to assume SQL is consistent. I think it is
possible to create a consistent 3VL system, but you have to give up  
some

other very important property. I can't remember the details at the
moment, but there's an interesting proof in Logic and Databases by
C.J. Date.


If the standard says that, in the case of records, two NULLs are  
distinct, then fine. Completely bizarre, but fine. But I suspect that  
such is not the case.


Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread David E. Wheeler

On Jul 23, 2009, at 6:52 PM, David E. Wheeler wrote:

No, that's not the problem I see -- that solved the problem in my  
particular code. The problem I see is that, given that the standard  
says (according to Tom) that if any value is NULL then the record is  
NULL, then I would expect this to return false:


   SELECT ROW(1, NULL) IS DISTINCT FROM ROW(2, NULL);

But it returns true.


Sorry, I'm confused. Understandable I think. So:

ROW(1, NULL) is neither NULL nor NOT NULL. I've no idea what state it  
is, but I guess that's the standard. In this case, IS DISTINCT FROM  
gives me a sensible return value -- it assumes that the records are  
NOT NULL, I guess, for its purposes. This is still inconsistent, since  
the records are neither NULL nor NOT NULL, but perhaps sensible.


It's pretty insane, frankly. Gotta love SQL.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] When is a record NULL?

2009-07-23 Thread Brendan Jurd
2009/7/24 David E. Wheeler da...@kineticode.com:
 ROW(1, NULL) is neither NULL nor NOT NULL. I've no idea what state it is,
 but I guess that's the standard.

Well, a ROW is an ordered set of values, each one of which may be
either NULL or NOT NULL.

It doesn't really make sense to talk about the ROW itself being NULL
or NOT NULL, only its member values (but for extra confusion, contrast
with the treatment of arrays, which can themselves be NULL).

It does make sense, however, to talk about the ROW's member values
being entirely NULL or entirely NOT NULL, and that's what the IS NULL
and IS NOT NULL tests tell you about.

I guess the spec authors figured they might as well make IS [NOT] NULL
do something useful when applied to a row rather than throwing an
error.  I tend to agree.

I hope that provides some clarity.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers