Hi,

This message is asking for help debugging some specific changes that I've written for Derby, so if you're not interested in my own personal coding problems, feel free to stop reading now ;)

I'm trying to add language support to Derby so that it allows aggregate functions to be used in the WHERE iff they are a) in a subquery of a HAVING clause and b) they correlate to an outer query block.

As a first step toward part "a", I made what seem to me like reasonable changes to the SelectNode class so that Derby will allow aggregates in the WHERE of a HAVING subquery. At a high level, what my changes do (or at least, what I want them to do) is as follows.

Assume I have a query of the form:

<OUTER_SELECT> <GROUP_BY> having <outer_col_a> >= <SUBSELECT>

and <SUBSELECT> is something like

select x from t1 where min(<outer_col_b>) > 0

and <outer_col_b> is a column that is both in the <OUTER_SELECT> list and in the GROUP_BY list. That said, the changes I've made would rewrite the <SUBSELECT> to represent the following:

select x, min(<outer_col_b>) SQLCol1 from t1 where SQLCol1 > 0

(Note that this is what the rewritten query _represents_--this in itself isn't necessary valid SQL).

With this new query I bind SQLCol1 without any problems, but at execution time, the result is an NPE from within the generated code for the query. I poked through the derby.log file and rewrote the query in a bunch of different ways, and I now _think_ problem is that the DataValueDescriptor for SQLCol1 in the WHERE clause is null at row retrieval time, and that's causing the NPE.

Are there any language experts out there who can look at the attached patch and offer input as to why the DataValueDescriptor for the aggregate result column would end up null on a call to retrieve the results? I've made sure that the aggregate is bound at compile time (and I've tried binding it in lots of different ways) but it looks like I'm missing something somewhere.

I've attached a small patch with minimal changes to reproduce the problem. I've re-written this patch too many times to count, some times in ways that were very simple (like the one attached to this patch) and some times in ways that were significantly more complex--but in all cases, I keep coming back to this NPE problem.

Using the attached patch, the NPE can be reproduced with either of the following two scenarios:

----

1) A "realistic" scenario that actually has some meaning (and thus is more complicated);

ij> create table team (teamid int not null primary key, budget double);
ij> create table players (id int not null unique, teamid int references team, salary double);
ij> insert into team values (1, 100), (2, 200), (3, 300), (4, 400);
ij> insert into players values (1, 1, 50), (2, 1, 75), (3, 1, 60);
3 rows inserted/updated/deleted
ij> insert into players values (4, 2, 10), (5, 2, 25), (6, 2, 30);
3 rows inserted/updated/deleted
ij> insert into players values (7, 3, 25), (8, 3, 30), (9, 3, 20), (10, 3, 45);


-- Now, look for all players such that: 1) the player is on a team with an average salary that is less than half of the team's budget, AND 2) the player is the lowest-paid player on his/her team, AND 3) the player is making less than "20" (i.e. maybe these player should get a raise! ;). In such cases, retrieve the player's team and his/her salary.

ij> select t.teamid,p.salary from team t, players p where (p.teamid = t.teamid) group by t.teamid,p.salary having avg(p.salary) < (select (t2.budget / 2) from team t2 where t2.teamid = t.teamid and 20 > min(p.salary));

----

2) A simpler query that doesn't make any logical sense but still reproduces the problem:

ij> create table t1 (i int);
ij> create table t2 (j int);
ij> create table t3 (k int);
ij> insert into t1 values 2, 4, 6;
ij> insert into t2 values 1, 3, 7;
ij> insert into t3 values 1, 2, 3, 4, 5;
ij> select j,k from t2 t2_alias, t3 t3_alias group by j,k having j > (select min(2) from t1 where min(t3_alias.k) > 0);


It's the derby.log file for this second query that makes me think the DataValueDescriptor is null and thus the NPE.

Many thanks to any who has the time to point out what I'm missing,
Army
Index: java/engine/org/apache/derby/impl/sql/compile/SelectNode.java
===================================================================
--- java/engine/org/apache/derby/impl/sql/compile/SelectNode.java       
(revision 170947)
+++ java/engine/org/apache/derby/impl/sql/compile/SelectNode.java       
(working copy)
@@ -519,10 +519,11 @@
                        ** in the WHERE clause at all.
                        ** Note: a similar check is made in JoinNode.
                        */
-                       if ((whereAggregates.size() > 0) &&
-                                       !generatedForHavingClause)
+                       if (whereAggregates.size() > 0)
                        {
-                               throw 
StandardException.newException(SQLState.LANG_NO_AGGREGATES_IN_WHERE_CLAUSE);
+                               if (!isSubqueryInHavingClause)
+                                       throw 
StandardException.newException(SQLState.LANG_NO_AGGREGATES_IN_WHERE_CLAUSE);
+                               bindAggregatesInWhereClause();
                        }
 
                        /* If whereClause is a parameter, (where ?), then we 
should catch it and throw exception
@@ -1971,4 +1972,66 @@
 
                return returnNode;
        }
+
+    /* ****
+     * Take:
+     *
+     *  ... having j > (select x from t1 where min(outer.y) > 0)
+     *
+     * and make it into:
+     *
+     *  ... having j > (select x, min(outer.y) SQLCol1
+     *         from t1 where SQLCol1 > 0),
+     * 
+     * and make SQLCol1 into a column reference that points to
+     * an RC that represents the aggregate "min(outer.y)".
+     *
+     */
+    public void bindAggregatesInWhereClause()
+        throws StandardException
+    {
+
+        ResultColumnList aggRCs = (ResultColumnList) getNodeFactory().getNode(
+                                            C_NodeTypes.RESULT_COLUMN_LIST,
+                                            getContextManager());
+
+        // Replace the aggregates in the WHERE clause with
+        // column references.
+        ReplaceAggregatesWithCRVisitor visitor = 
+            new ReplaceAggregatesWithCRVisitor(aggRCs, ResultSetNode.class);
+        whereClause = (ValueNode)whereClause.accept(visitor);
+
+        // Bind the column references that we just generated.
+        int sz = aggRCs.size();
+        for (int i = 0; i < sz; i++) 
+        {
+            // Get next generated reference to bind.
+            AggregateNode agg = (AggregateNode)whereAggregates.elementAt(i);
+            ColumnReference cr = agg.getGeneratedRef();
+
+            // Point the reference to the result column
+            // that was generated above.
+            ResultColumn rc = aggRCs.getResultColumn(i+1);
+            rc.bindResultColumnToExpression();
+            rc.setResultSetNumber(this.getResultSetNumber());
+            cr.setSource(rc);
+
+            // Tried playing around with these, but with no luck.
+            /*
+            cr.setType(rc.getTypeServices());
+            cr.setNestingLevel(((FromTable)fromList.elementAt(0)).getLevel());
+            cr.setSourceLevel(((FromTable)fromList.elementAt(0)).getLevel());
+            */
+
+            resultColumns.addResultColumn(rc);
+            selectAggregates.add(agg);
+        }
+
+    }
+
+    private boolean isSubqueryInHavingClause = false;
+    public void markAsSubqueryInHavingClause() {
+        isSubqueryInHavingClause = true;
+    }
+
 }
Index: java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj
===================================================================
--- java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj (revision 
170947)
+++ java/engine/org/apache/derby/impl/sql/compile/sqlgrammar.jj (working copy)
@@ -228,6 +228,8 @@
        //exception.    
         Vector explicitlyNullableColumnsList = new Vector();
 
+       // Keep track of HAVING nesting.
+       private int havingClauseNestLevel = 0;
 
        final void setCompilerContext(CompilerContext cc) {
                this.compilerContext = cc;
@@ -7252,6 +7254,9 @@
                                                        groupByList,
                                                        getContextManager());
 
+               if (havingClauseNestLevel > 0)
+                       selectNode.markAsSubqueryInHavingClause();
+
                /* A SELECT with a GROUP BY or HAVING clause is the one 
instance where the
                 * parser will generate a tree that does not exactly match the 
query.
                 * The resulting tree is an outer SelectNode with the HAVING 
clause
@@ -8333,11 +8338,13 @@
 havingClause() throws StandardException :
 {
        ValueNode value;
+       havingClauseNestLevel++;
 }
 {
        /* valueExpression() was searchCondition() */
        <HAVING> value = valueExpression(false)
        {
+               havingClauseNestLevel--;
                return value;
        }
 }

Reply via email to