Hi Satheesh and Army,
I will try to get to this later this week. I'm a bit busy for the next
few days.
Regards,
-Rick
Satheesh Bandaram wrote:
This is a great enhancement to Derby that provides a good performance
improvement. Thanks, Army, for the writeup... It does look impressive. I
will take this home for some "light" reading. :-) I would like to
invite other optimizer experts to review the doc and code patch when
ready. Jeff Lichtman and Rick, do you have some time to review the doc?
I have added an entry to DerbyDevActivities
<http://wiki.apache.org/db-derby/DerbyDevActivities> under improvements
for this. Hopefully this work will lead to further improvements to
the Derby optimizer in this area.
Satheesh
Army wrote:
I have attached a description of the changes I plan to submit for
DERBY-805 to that Jira issue. This is a rather complicated
enhancement so the description of the changes is pretty long. In
short, though, I outline a 6-step approach to pushing join predicates
down into Unions:
1 - Add the ability to take a predicate and scope it to a target
result set so that it can be pushed to that result set.
2 - Implement the "pushOptPredicate()" and "optimizeIt()" methods for
UnionNodes. The former method should take predicates that are passed
into the UnionNode from outer queries, scope them (per step 1) for
the left and right children of the UnionNode, and store them
locally. The latter method should then pass the scoped predicates
down to both children so that they can use the predicates in their
own optimize()/optimizeIt() calls.
3 - Take scoped predicates (created in step 1) that are pushed into
the children result sets of a UnionNode (per step 2) and allow the
children to push the scoped predicates even further down the
tree, until we eventually get them to a base table.
4 - Make sure predicates that are pushed down into subqueries of a
UnionNode are correctly "pulled" back up (if they are unscoped) or
discarded (if they are scoped) for every permutation seen during
optimization.
5 - Ensure that the best access path for a UnionNode that pushes
predicates is correctly saved during optimization and correctly
retrieved when it comes time to finalize the query's overall access
path.
6 - And finally, when optimization is complete, make sure all
relevant predicates are pushed down the tree one last time and left
there, in preparation for code generation.
See DERBY-805.html for all the gory details.
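To make steps 1, 2 and 4 a little more concrete, here is a rough toy
model in Java. It is only a sketch under heavy assumptions: the class
and method names (Pred, Child, Union, scope, push, discardScoped) are
hypothetical and are not Derby's optimizer classes, predicates are
reduced to "column = value", and a union's result columns are assumed
to map onto each child's columns purely by position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are Derby's real optimizer classes.
class UnionPushdownSketch {

    /** A simple "column = value" predicate on a named column (hypothetical). */
    static final class Pred {
        final String column;
        final int value;
        final boolean scoped; // true if this is a copy rewritten for a union child

        Pred(String column, int value, boolean scoped) {
            this.column = column;
            this.value = value;
            this.scoped = scoped;
        }

        @Override
        public String toString() {
            return column + " = " + value + (scoped ? " [scoped]" : "");
        }
    }

    /** One side of a union: its column names, in result-column order. */
    static final class Child {
        final List<String> columns;
        final List<Pred> pushed = new ArrayList<>();

        Child(List<String> columns) {
            this.columns = columns;
        }
    }

    /** A union whose result columns map positionally onto both children. */
    static final class Union {
        final List<String> columns;
        final Child left;
        final Child right;

        Union(List<String> columns, Child left, Child right) {
            this.columns = columns;
            this.left = left;
            this.right = right;
        }

        /** Step 1: rewrite a predicate on a union column so it targets a child's column. */
        Pred scope(Pred p, Child child) {
            int pos = columns.indexOf(p.column);
            if (pos < 0) {
                throw new IllegalArgumentException("unknown column: " + p.column);
            }
            return new Pred(child.columns.get(pos), p.value, true);
        }

        /** Step 2: scope the predicate once per child and hand each copy down. */
        void push(Pred p) {
            left.pushed.add(scope(p, left));
            right.pushed.add(scope(p, right));
        }

        /** Step 4 (simplified): drop scoped copies; the caller keeps the original. */
        void discardScoped() {
            left.pushed.removeIf(q -> q.scoped);
            right.pushed.removeIf(q -> q.scoped);
        }
    }

    public static void main(String[] args) {
        Child t1 = new Child(Arrays.asList("T1.A", "T1.B"));
        Child t2 = new Child(Arrays.asList("T2.X", "T2.Y"));
        Union u = new Union(Arrays.asList("V.C1", "V.C2"), t1, t2);

        u.push(new Pred("V.C1", 7, false));
        System.out.println("left:  " + t1.pushed);
        System.out.println("right: " + t2.pushed);

        u.discardScoped();
        System.out.println("after discard: " + t1.pushed + " " + t2.pushed);
    }
}
```

The sketch leaves out steps 3, 5 and 6 entirely (pushing further down
to base tables, saving the best access path, and the final push before
code generation), and it does not attempt to model costing or
permutations.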
I have made the changes described in this document locally and they
all seem to work, with a couple of exceptions as noted at the end of
the document. I plan to break the changes down into separate patches
where it's possible to do so, and will be posting those patches in
the coming days. In the meantime, if anyone has time to review this
document and provide feedback, direction, or suggestions, I would be
grateful. As I myself am still trying to learn all the subtleties of
Derby optimization, the more feedback I get, the better...
Thanks,
Army