[ 
https://issues.apache.org/jira/browse/DERBY-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573867#action_12573867
 ] 

A B commented on DERBY-2998:
----------------------------

Thanks for patch 19, Thomas.  I think your approach to handling the virtual 
column nodes is a good one (certainly better than the alternatives seen thus 
far).  In the interest of getting something into the codeline for others to 
play with--and to simplify future patches--per Dan's comment a few days back, 
I'll try to commit patch 19 sometime today.

Once that is committed, perhaps you can post one or more follow-up patches to 
address the following un-related comments, which I noticed while looking into 
the VCN issue.  Of these # 3 is probably the most important one...

1) Instead of using "row_number()" for the default name of a ROW_NUMBER() 
column, I think it'd be better to use a normal generated column name, as is 
done for all other un-named expressions in Derby (I believe?).  So something 
like:

  select i, j, i+j, row_number() over() from t1

should return four columns, "I", "J", "3", and "4", where the last two are 
generic generated names used for unnamed expressions.  With patch 19 such a 
query currently returns "I", "J", "3", "row_number()", which deviates from the 
existing pattern.

2) Regarding the following logic in PredicateList.pushExpressionsIntoSelect():

    ColumnReference ref = (ColumnReference)e.nextElement();
    if (!ref.pointsToColumnReference() &&
        !ref.pointsToWindowFunction())
    {
         state = false;
         break;
    }

I think it says "if the column reference points to something that is neither a) 
another column reference NOR b) a window function, then do *not* push it down." 
 So if the column reference points to a window function, we we could still 
potentially push it down.  Is that what we want?  For the record, I commented 
the call to "pointsToWindowFunction()" out completely and all of the queries 
still ran the same.  So is this check still necessary?

3) The following query returns one row, regardless of how many rows there are 
in T1:

  select distinct row_number() over(), 'HMM' from t1

It appears that we will evaluate the entire SELECT result set, then apply the 
DISTINCT, and finally, assign the row numbers.  So if there are five rows in 
T1, we'll get five rows having a single "HMM" column, then we'll apply the 
distinct to get a single "HMM" row, and that row will have row number "1".  But 
is that the correct behavior?  Or is the row_number() supposed to be included 
in the "DISTINCT" qualification, in which case the query would return 5 rows 
because each row has a different row number and thus each is distinct from the 
others?

Just to see what would happen I ran the above query on DB2 v8 and it returned 
one row for each row in T1.  I took a look at the syntax rules for <window 
function> in SQL 2003 spec section 6.10 but could not, in my inexperience with 
reading specs, determine one way or the other.  That said, though, I think 
functions in a ResultColumnList are typically evaluated once per row, *prior* 
to the DISTINCT being applied, so I would guess ROW_NUMBER() should do the same?

4) Might be good to add some test cases for GROUP BY and HAVING clauses in the 
presence of ROW_NUMBER().  I tried a few quick ones and they seem to be working 
fine, but it wouldn't hurt to add some tests for posterity.  These are the ones 
I tried, but use your imagination :)

  select r from (select i, row_number() over() as r, j from t1) x group by r;
  select * from (select i, row_number() over() as r, j from t1) x group by i, 
j, r;
  select * from (select i, row_number() over() as r, j from t1) x group by j, 
r, i;

  select * from
      (select i, row_number() over() as r, j from t1) x
    group by j, r, i
    having r > 2;

  select * from
      (select i, row_number() over() as r, j from t1) x
    group by j, r, i
    having r > 2 and i >=3
    order by i desc;

  select * from
      (select i, row_number() over() as r, j from t1) x
    group by j, r, i
    having r > 2 and i >=3
    order by r desc;

  select * from
      (select i, row_number() over() as r, j from t1) x
    group by j, r, i
    having r > 2 and i >=3
    order by i asc, r desc;

And yes I realize that grouping by a ROW_NUMBER() is kind of silly since row 
numbers are unique across rows--but it's good make sure the behavior is correct 
:)

5) SQL spec indicates that window functions like ROW_NUMBER() can be used in an 
ORDER BY clause, but that is not yet implemented (right?).  Do you think it 
would be worth it to add a statement saying as much to the documentation?

> Add support for ROW_NUMBER() window function
> --------------------------------------------
>
>                 Key: DERBY-2998
>                 URL: https://issues.apache.org/jira/browse/DERBY-2998
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Thomas Nielsen
>            Assignee: Thomas Nielsen
>            Priority: Minor
>         Attachments: d2998-10.diff, d2998-10.stat, d2998-11.diff, 
> d2998-12.diff, d2998-12.stat, d2998-13.diff, d2998-13.stat, d2998-14.diff, 
> d2998-14.stat, d2998-15.diff, d2998-15.stat, d2998-16.diff, d2998-16.stat, 
> d2998-17.diff, d2998-17.stat, d2998-18.diff, d2998-18.stat, d2998-19.diff, 
> d2998-19.stat, d2998-4.diff, d2998-4.stat, d2998-5.diff, d2998-5.stat, 
> d2998-6.diff, d2998-6.stat, d2998-7.diff, d2998-7.stat, d2998-8.diff, 
> d2998-8.stat, d2998-9-derby.log, d2998-9.diff, d2998-9.stat, 
> d2998-doc-1.diff, d2998-doc-1.stat, d2998-test.diff, d2998-test.stat, 
> d2998-test2.diff, d2998-test2.stat, d2998-test3.diff, d2998-test3.stat, 
> d2998-test4.diff, d2998-test4.stat, d2998-test6.diff, d2998-test7.diff, 
> d2998-test8.diff, d2998-test9.diff
>
>
> As part of implementing the overall OLAP Operations features of SQL 
> (DERBY-581), implement the ROW_NUMBER() window function.
> More information about this feature is available at 
> http://wiki.apache.org/db-derby/OLAPRowNumber

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to