On 05/04/2011 07:39 PM, Tom Lane wrote:
While looking at the grammar's operator-precedence declarations in connection with a recent pgsql-docs question, it struck me that this declaration is a foot-gun waiting to go off: %nonassoc IS NULL_P TRUE_P FALSE_P UNKNOWN /* sets precedence for IS NULL, etc */ The only terminal that we actually need to attach precedence to for IS NULL and related productions is "IS". The others are just listed there to save attaching explicit %prec declarations to those productions. This seems like a bad idea, because attaching a precedence to a terminal symbol that doesn't absolutely have to have one is just asking for trouble: it can cause bison to accept grammars that would better have provoked a shift/reduce error, and to silently resolve the ambiguity in a way that you maybe didn't expect. So I thought to myself that it'd be better to remove the unnecessary precedence markings, and tried, with the attached patch. And behold, I got a shift/reduce conflict: state 2788 1569 b_expr: b_expr qual_Op . b_expr 1571 | b_expr qual_Op . NULL_P shift, and go to state 1010 NULL_P [reduce using rule 1571 (b_expr)] So the god of unintended consequences has been here already. What this is telling us is that in examples such as CREATE TABLE foo (f1 int DEFAULT 10 %% NULL); it is not immediately clear to bison whether to shift upon seeing the NULL (which leads to a parse tree that says %% is an infix operator with arguments 10 and NULL), or to reduce (which leads to a parse tree that says that %% is a postfix operator with argument 10, and NULL is a column declaration constraint separate from the DEFAULT expression). If you try the experiment, you find out that the first interpretation is preferred by the current grammar: ERROR: operator does not exist: integer %% unknown
Yeah, IIRC the default action for a shift/reduce conflict is to shift, as it's usually the right thing to do.
Now, this is probably a good thing, because NULL as a column declaration constraint is not SQL standard (only NOT NULL is), so we're resolving the ambiguity in a way that's more likely to be SQL-compatible. But it could be surprising to somebody who expected the other behavior, especially since this seemingly-closely-related command is parsed the other way: CREATE TABLE foo (f1 int DEFAULT 10 %% NOT NULL); ERROR: operator does not exist: integer %% And the reason for that difference in behavior is that NOT has a declared precedence lower than POSTFIXOP, whereas NULL has a declared precedence that's higher. That comparison determines how bison resolves the shift/reduce conflict. The fact that this behavior falls out of a precedence declaration that's seemingly quite unrelated, and was surely not designed with this case in mind, is a perfect example of why I say that precedence declarations are hazardous. So I'd still like to get rid of the precedence markings for TRUE_P, FALSE_P, and UNKNOWN, and will do so unless somebody has a really good reason not to. (It looks like we could avoid marking ZONE, too.) But I would be happier if we could also not mark NULL, because that's surely used in a lot of other places, and could easily bite us a lot harder than this. Can anyone think of an alternative way to resolve this particular conflict without the blunt instrument of a precedence marking?
My bison-fu is a bit rusty, but years ago I could do this stuff in my sleep. I'll be surprised if there isn't a way.
If we do need a precedence setting for NULL_P, then I think it should probably be on its own and not sharing one with IS.
If you don't solve it soon I'll take a look after I clear a couple of higher priority items from my list.
cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers