Jeremy Boynes wrote:

Jack Klebanoff wrote:


1. sqlgrammar.jj. I think that creating a new method, isNonReservedKeyword() to determine whether a token is a non-reserved keyword or not, is a maintenance problem. Whenever we add a new non-reserved keyword we must add it to the list of tokens, to nonReservedKeyword(), and now to isNonReservedKeyword(). Having to add it in two places is difficult enough to discover or remember. If we need isNonReservedKeyword then we should find a way of combining nonReservedKeyword and isNonReservedKeyword so that only one of them keeps the list of non-reserved key words.

It is not necessary for the parser to recognize 3 cases of ORDER BY sort
key type. A column name is just one kind of <expression>. If the parser
treats it as an expression we should still get the right ordering. I
think that it would better if the parser did so. The OrderByColumn class
can special case a simple column reference expression, as an
optimization. This considerably simplifies parsing sort keys.

The only sort key type that has to be handled specially is that of an
integer constant. That specifies one of the select list columns as the
sort key. This case can be recognized in the parser, as is done in the
patch, or it can be recognized in OrderByColumn. In this alternative the
parser always creates OrderByColumn nodes with the sort key given by an
expression (a ValueNode). At bind time OrderByColumn can determine
whether the sort key expression is an integer constant, and if so treat
it as a column position.

The two alternatives differ in the way that they treat constant integer
expressions like "ORDER BY 2-1". The patch orders the rows by the
constant 1, which is not usefull. With the patch "ORDER BY 2-1 ASC" and
"ORDER BY 2-1 DESC" produce the same ordering. If OrderByColumn treated
an integer constant sort key expression as a result column position then
"ORDER BY 2-1" would cause the rows to be ordered on the first result
column, which I think is more usefull.


From the SQL spec, the <sort-key> is simply a value expression evaluated for each row. Hence "ORDER BY 2-1" means that all rows are peers and the ordering is implementation-dependent.


I think allowing the value-expressions to evaluate to column position is potentially confusing. For example, suppose you have "ORDER BY ?" - do we order by the supplied value (which would be the same for all rows making the actual ordering implementation-dependent) or do we order by the column position (with the potential need to re-select indexes used to assist in ordering).

I believe it is simple and SQL-compliant to have very simple rules here:
1) if all <sort-key>s are integer literals, then sort by the column
   position they imply. This is SQL-compliant as the comparison criteria
   for each row will be the same, making all rows peers and the actual
   order implementation dependent (we just define ours as using the
   appropriate column value).

2) if any <sort-key> is not an integer literal then we treat all of them
   as <value expression> and compare rows accordingly.

--
Jeremy

I think that most people would be surprised to find that "ORDER BY 1 + 1" behaves differently than "ORDER BY 2". I also think that it is simpler for us to treat all integer constants the same. It is hard to get the parser to distinguish between a literal constant and a complex expression that starts with an integer literal. That is why Tomohito Nakayama needed to use complex lookahead in his parser change.

I think that it is better to distinguish between integer constants and other expressions in the OrderByColumn class. ValueNodes have isConstantExpression(), getTypeID(), and getConstantValueAsObject() methods that OrderByColumn can use to see if the expression is an integer constant and, if so, what the value is. OrderByColumn can test whether the order by expression is an instanceOf ColumnReference or ResultColumn to special case a simple column reference. This simplifies the parser significantly and causes us to treat "ORDER BY 1 + 1" the same as "ORDER BY 2".

I don't feel strongly enough about this to block a patch that is well tested and otherwise works correctly.

Jack Klebanoff

Reply via email to