Re: About improvement of DERBY-134

Jeremy Boynes 27 Feb 2005 20:29:35 -0000

Jack Klebanoff wrote:


1. sqlgrammar.jj. I think that creating a new method,
isNonReservedKeyword() to determine whether a token is a non-reserved
keyword or not, is a maintenance problem. Whenever we add a new
non-reserved keyword we must add it to the list of tokens, to
nonReservedKeyword(), and now to isNonReservedKeyword(). Having to add
it in two places is difficult enough to discover or remember. If we need
isNonReservedKeyword then we should find a way of combining
nonReservedKeyword and isNonReservedKeyword so that only one of them
keeps the list of non-reserved key words.

It is not necessary for the parser to recognize 3 cases of ORDER BY sort
key type. A column name is just one kind of <expression>. If the parser
treats it as an expression we should still get the right ordering. I
think that it would better if the parser did so. The OrderByColumn class
can special case a simple column reference expression, as an
optimization. This considerably simplifies parsing sort keys.

The only sort key type that has to be handled specially is that of an
integer constant. That specifies one of the select list columns as the
sort key. This case can be recognized in the parser, as is done in the
patch, or it can be recognized in OrderByColumn. In this alternative the
parser always creates OrderByColumn nodes with the sort key given by an
expression (a ValueNode). At bind time OrderByColumn can determine
whether the sort key expression is an integer constant, and if so treat
it as a column position.

The two alternatives differ in the way that they treat constant integer
expressions like "ORDER BY 2-1". The patch orders the rows by the
constant 1, which is not usefull. With the patch "ORDER BY 2-1 ASC" and
"ORDER BY 2-1 DESC" produce the same ordering. If OrderByColumn treated
an integer constant sort key expression as a result column position then
"ORDER BY 2-1" would cause the rows to be ordered on the first result
column, which I think is more usefull.

From the SQL spec, the <sort-key> is simply a value expression evaluated for each row. Hence "ORDER BY 2-1" means that all rows are peers and the ordering is implementation-dependent.

I think allowing the value-expressions to evaluate to column position is potentially confusing. For example, suppose you have "ORDER BY ?" - do we order by the supplied value (which would be the same for all rows making the actual ordering implementation-dependent) or do we order by the column position (with the potential need to re-select indexes used to assist in ordering).

I believe it is simple and SQL-compliant to have very simple rules here:
1) if all <sort-key>s are integer literals, then sort by the column
   position they imply. This is SQL-compliant as the comparison criteria
   for each row will be the same, making all rows peers and the actual
   order implementation dependent (we just define ours as using the
   appropriate column value).

2) if any <sort-key> is not an integer literal then we treat all of them
   as <value expression> and compare rows accordingly.

--
Jeremy

Re: About improvement of DERBY-134

Reply via email to