Rick Hillegas <[EMAIL PROTECTED]> writes:
> Hi Knut,
>
> Thanks for the clarification. Sounds like we agree that the bug is in
> the query rewriting rather than in the pushdown to the Store. Methinks
> the DISTINCT should not be pushed into the subselect if the inner and
> outer SELECT lists aren't identical.
Yes, I suppose the optimizer could check that the ResultColumnList of
the two queries are identical. Then we would have the following
criteria for pushing the DISTINCT:
1) The query has a single table/subquery in the FROM clause.
2) The query and the subquery have identical column lists.
3) No predicates in the top-level query (this could possibly be
relaxed).
4) The duplicate elimination can't be merged with an ORDERED BY.
> Here's another possible scenario where you might push a DISTINCT from
> an outer into an inner query. It all depends on when the optimizer
> decides to eliminate the DISTINCT altogether:
>
> SELECT DISTINCT maxChildAge, employeeID FROM
> (
> SELECT max( childAge ) maxChildAge, employeeID
> FROM children, employees
> WHERE children.parentID=employeeID
> GROUP BY employeeID
> )
>
> Here the optimizer might reasonably do something like this:
>
> 1) Push the DISTINCT into the subselect.
>
> 2) Then notice that the results of the original subselect are already
> unique because of the GROUP BY
>
> 3) So eliminate the DISTINCT altogether.
Or we could just fix the query rewriting to flatten the query
properly... ;)
SELECT max( childAge ) maxChildAge, employeeID
FROM children, employees
WHERE children.parentID=employeeID
GROUP BY employeeID
Thanks for your input, Rick! Your comments have been very helpful!
I have just filed another bug in the derbylang test. That bug is
really a consequence of this bug. See
http://issues.apache.org/jira/browse/DERBY-519 for details.
--
Knut Anders