1. sqlgrammar.jj. I think that creating a new method, isNonReservedKeyword() to determine whether a token is a non-reserved keyword or not, is a maintenance problem. Whenever we add a new non-reserved keyword we must add it to the list of tokens, to nonReservedKeyword(), and now to isNonReservedKeyword(). Having to add it in two places is difficult enough to discover or remember. If we need isNonReservedKeyword then we should find a way of combining nonReservedKeyword and isNonReservedKeyword so that only one of them keeps the list of non-reserved key words.
It is not necessary for the parser to recognize 3 cases of ORDER BY sort key type. A column name is just one kind of <expression>. If the parser treats it as an expression we should still get the right ordering. I think that it would better if the parser did so. The OrderByColumn class can special case a simple column reference expression, as an optimization. This considerably simplifies parsing sort keys.
The only sort key type that has to be handled specially is that of an integer constant. That specifies one of the select list columns as the sort key. This case can be recognized in the parser, as is done in the patch, or it can be recognized in OrderByColumn. In this alternative the parser always creates OrderByColumn nodes with the sort key given by an expression (a ValueNode). At bind time OrderByColumn can determine whether the sort key expression is an integer constant, and if so treat it as a column position.
The two alternatives differ in the way that they treat constant integer expressions like "ORDER BY 2-1". The patch orders the rows by the constant 1, which is not usefull. With the patch "ORDER BY 2-1 ASC" and "ORDER BY 2-1 DESC" produce the same ordering. If OrderByColumn treated an integer constant sort key expression as a result column position then "ORDER BY 2-1" would cause the rows to be ordered on the first result column, which I think is more usefull.
From the SQL spec, the <sort-key> is simply a value expression evaluated for each row. Hence "ORDER BY 2-1" means that all rows are peers and the ordering is implementation-dependent.
I think allowing the value-expressions to evaluate to column position is potentially confusing. For example, suppose you have "ORDER BY ?" - do we order by the supplied value (which would be the same for all rows making the actual ordering implementation-dependent) or do we order by the column position (with the potential need to re-select indexes used to assist in ordering).
I believe it is simple and SQL-compliant to have very simple rules here: 1) if all <sort-key>s are integer literals, then sort by the column position they imply. This is SQL-compliant as the comparison criteria for each row will be the same, making all rows peers and the actual order implementation dependent (we just define ours as using the appropriate column value).
2) if any <sort-key> is not an integer literal then we treat all of them as <value expression> and compare rows accordingly.
-- Jeremy
