Isaac Morland <isaac.morl...@gmail.com> writes: > What is confusing me is why the planner can't convert "[entire row] IS > NULL" into a test for existence of the matching row (assuming there is at > least one NOT NULL column).
The reasons why the planner does very little with row-level IS [NOT] NULL conditions are (1) so few people use them that it doesn't really seem worth expending cycles or development effort on such cases, and (2) the SQL spec is vague enough about the semantics of these predicates that we've never been entirely sure whether we implement them correctly. Thus it didn't seem worth expending a lot of effort developing deduction logic that might turn out to be completely wrong. I suspect (1) is not unrelated to (2) ... The semantic vaguenesses are also twofold: (a) It's not quite clear whether the spec intends to draw a distinction between a composite value that is in itself NULL and one that is a tuple of all NULL fields. There is certainly a physical difference, but it looks like the IS [NOT] NULL predicates are designed not to be able to tell the difference. (b) It's not at all clear whether these predicates are meant to be recursive for nested composite types. Depending on how you read it, it could be that a composite field that is NULL satisfies an IS NULL predicate on the parent row, but a composite field that is ROW(NULL, NULL, ...) does not. That is in fact how we implement it, but it sure seems weird given (a). So personally, I've got zero confidence in these predicates and don't especially wish to sink development effort into something that critically depends on having the right semantics for them. The shortage of field demand for doing better doesn't help. Circling back to your interest in using a "row IS NULL" predicate to conclude that a left join is actually an antijoin, these questions are really critical, because if the join column is itself composite, the deduction would hold *only* if our theory that "row IS NULL" is non-recursive is correct. If our theory is wrong, and the spec intends that ROW(NULL, ROW(NULL, NULL), NULL) IS NULL should be TRUE, then it wouldn't be correct to draw the inference that the join has to be an antijoin. And that is also connected to the fact that record_cmp considers ROW(NULL, NULL) to be equal to ROW(NULL, NULL), which is (or at least seems to be) wrong per spec, but tough: we have to have a total order for composite values, or they wouldn't be indexable or sortable. If your head's not hurting yet, you just need to think harder about these issues. It's all a mess, and IMO we're best off not adding any more dependencies on these semantics than we have to. regards, tom lane