While trying to complete the Phase 4 work for DERBY-805 I've come across some behavior in the optimizer's treatment of cost estimates that I don't fully understand. If anyone out there can offer answers/feedback for any of the following three questions, I'd appreciate it...

--------------------------------
1) Multiplication by outer rows.
--------------------------------

There are four classes--ProjectRestrictNode, UnionNode, JoinNode, and FromTable--that implement the "optimizeIt()" method and, in that method, call "getJoinStrategy().estimateCost(...)". One of the arguments to optimizeIt() is an outerCost value, which holds the estimated cost of the FromTables to the left of the current one in this join order. So if we have a FROM list that is [ FT1, FT2, FT3 ] and the current FromTable (the one we're optimizing) is FT2, then outerCost will hold the estimated cost for FT1. For the sake of this question, assume FT1, FT2, and FT3 are instances of one of the above-mentioned classes.

outerCost is then used in two ways by these FTs: a) it's passed down to the child nodes, where it's factored into the cost estimates of those children, and b) it's passed into a call to estimate the cost of the current FT itself. As an example, take UnionNode. The a) code is like this:

        leftResultSet = optimizeSource(
                                optimizer,
                                leftResultSet,
                                getLeftOptPredicateList(),
                                outerCost);  // <==

        rightResultSet = optimizeSource(
                                optimizer,
                                rightResultSet,
                                getRightOptPredicateList(),
                                outerCost);  // <==

The b) code is then:

        /*
        ** Get the cost of this result set in the context of the whole plan.
        */
        getCurrentAccessPath().
                getJoinStrategy().
                        estimateCost(
                                this,
                                predList,
                                (ConglomerateDescriptor) null,
                                outerCost,   // <==
                                optimizer,
                                costEstimate
                                );

In both cases, the only part of outerCost that's actually used is the estimated row count. In cases where we're costing a nested loop join (either for the current FT or for any of its children), we multiply the cost of the current FT (or its children) by the number of outer rows, since we will in effect be incurring the cost of the inner FT once for each row of the outer FT.

All of that said, my question is this: why do we need to pass outerCost down to the children, where it will be multiplied by their costs, AND at the same time pass it into the "getJoinStrategy().estimateCost()" method, where it will be multiplied (again) by the cost of the current FT? It seems to me like the correct thing to do here is to _not_ pass outerCost to the children and to instead just let the current FT do the multiplication if necessary.

As a trivialized example, assume our FT2 is a UnionNode over two subqueries, each of which just has a single table:

FT2 = (select * from T1) UNION (select * from T2)

And assume FT1 is an outer table with row count of 100. Then in the part a) code above, we'll pass "100" down to the left child, where it will eventually make it down to T1 and we'll multiply the cost of T1 by 100. Likewise we'll pass "100" down to the right child and multiply the cost of T2 by 100:

T1 estimated cost: 100 * T1_cost
T2 estimated cost: 100 * T2_cost

Then we'll set the cost of the UnionNode to be the sum of the estimated costs of its children:

FT2 estimated cost: (100 * T1_cost) + (100 * T2_cost)

That done, if the current join strategy for FT2 is NestedLoop, the call in part b) above will pass "100" to NestedLoopJoinStrategy.estimateCost(), where it will be multiplied by the cost of FT2:

FT2 estimated cost: 100 * ((100 * T1_cost) + (100 * T2_cost))

In other words, we've ended up multiplying the cost of FT2 by the outer row count *TWICE*.
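
To make the arithmetic concrete, here's a small standalone sketch (plain Java, not Derby code; the T1_cost and T2_cost numbers are made up) that walks through the same computation:

// Illustrative sketch only: shows how the outer row count gets applied
// twice when FT2 is a UNION and the join strategy is NestedLoop.
public class DoubleMultiplySketch {
    public static void main(String[] args) {
        double outerRows = 100;   // row count of FT1
        double t1Cost = 50;       // made-up base cost of scanning T1
        double t2Cost = 80;       // made-up base cost of scanning T2

        // Part a): outerCost is pushed down to the UNION's children, so
        // each child's cost already includes the outer row count.
        double unionCost = (outerRows * t1Cost) + (outerRows * t2Cost);

        // Part b): NestedLoopJoinStrategy.estimateCost() then multiplies
        // the UNION's cost by the outer row count again.
        double finalCost = outerRows * unionCost;

        System.out.println("multiplied once:  " + unionCost);  // 13,000
        System.out.println("multiplied twice: " + finalCost);  // 1,300,000
    }
}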

I guess this could be intentional in order to make sure we favor hash joins (which won't do the second multiplication--see below) over nested loop joins, but it'd be great if someone could confirm this one way or the other. As it is, it seems a bit odd to me.

----------------------------------
2) HashJoinStrategy.estimateCost()
----------------------------------

As mentioned at the end of question 1, if the join strategy that we're currently costing is a HashJoin and we make a call to HashJoinStrategy.estimateCost() (see above for an example), we get here:

/** @see JoinStrategy#estimateCost */
public void estimateCost(Optimizable innerTable,
                         OptimizablePredicateList predList,
                         ConglomerateDescriptor cd,
                         CostEstimate outerCost,
                         Optimizer optimizer,
                         CostEstimate costEstimate) {
        /*
        ** The cost of a hash join is the cost of building the hash table.
        ** There is no extra cost per outer row.
        */
}

I.e. we don't do anything. But at the same time, if we're costing a FromBaseTable and considering a HashJoin, we _do_ actually use the outer row count. See FromBaseTable.estimateCost():

        /*
        ** Let the join strategy decide whether the cost of the base
        ** scan is a single scan, or a scan per outer row.
        ** NOTE: In this case we only want to multiply against the
        ** total row count, not the singleScanRowCount.
        */
        if (currentJoinStrategy.multiplyBaseCostByOuterRows())
        {
                newCost *= outerCost.rowCount();
        }
        rowCount *= outerCost.rowCount();

For a HashJoin the call to "multiplyBaseCostByOuterRows()" will return false. Notice, though, that while we will not multiply the current cost (newCost) by the outer row count for a hash join, we _will_ multiply the current _rowCount_ by the outer row count.

So my question is this: why is it that, for a FromBaseTable, we multiply the current rowCount by the outer row count for a Hash join (as per FromBaseTable.estimateCost()), but for other (non-base-table) FromTables (such as UnionNode) we skip this multiplication (as per HashJoinStrategy.estimateCost())? Is this an intentional difference, and if so, does anyone know why?
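
Here is a small standalone sketch (again, plain Java with made-up numbers, not Derby code) of the asymmetry I'm describing:

// Illustrative sketch: contrasts how the outer row count is applied when
// the inner side of a hash join is a base table versus a UNION.
public class HashJoinCostSketch {
    public static void main(String[] args) {
        double outerRows = 100;
        double innerCost = 500;   // made-up cost of one scan of the inner FT
        double innerRows = 1000;  // made-up row count of the inner FT

        // FromBaseTable.estimateCost() with a hash join:
        // multiplyBaseCostByOuterRows() returns false, so the cost is left
        // alone, but the row count is still multiplied by the outer rows.
        double baseTableCost = innerCost;             // NOT multiplied
        double baseTableRows = innerRows * outerRows; // multiplied

        // HashJoinStrategy.estimateCost() for a non-base-table Optimizable
        // (e.g. a UnionNode) is a no-op, so neither value changes.
        double unionCost = innerCost;
        double unionRows = innerRows;

        System.out.println("base table: cost=" + baseTableCost + " rows=" + baseTableRows);
        System.out.println("union:      cost=" + unionCost + " rows=" + unionRows);
    }
}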

----------------------------------
3) Comparing plan costs.
----------------------------------

In OptimizerImpl.java there is the following code for comparing "current" cost with "best cost" for this round of optimization:

        /*
        ** Is the cost of this join order lower than the best one we've
        ** found so far?
        **
        ** NOTE: If the user has specified a join order, it will be the
        ** only join order the optimizer considers, so it is OK to use
        ** costing to decide that it is the "best" join order.
        */
        if ((! foundABestPlan) || currentCost.compare(bestCost) < 0)
        {
                rememberBestCost(currentCost, Optimizer.NORMAL_PLAN);

                // Since we just remembered all of the best plans,
                // no need to reload them when pulling Optimizables
                // from this join order.
                reloadBestPlan = false;
        }

The call to "currentCost.compare()" only compares the estimatedCost fields; it does not look at the other two fields of a CostEstimate--namely, rowCount and singleScanRowCount.

In running some tests against 10.1, though, I noticed that it is possible for currentCost and bestCost to have the same estimatedCost values but different row counts. Since the above check does not look at row counts, we can end up in situations where the following two queries can result in different query plans:

select * from V1, V2 where V1.j = V2.b;
select * from V2, V1 where V1.j = V2.b;

I.e. the queries are identical with the exception of the order of V1 and V2. In my particular example V1 and V2 are both unions of two tables, where V2 has 100,000 rows and V1 has less than ten. When I ran these queries against 10.1 with NO optimizer timeout--meaning that the optimizer had time to examine all possible query plans--the result was surprising: the first query chose join order [ V2, V1 ] and took 4.5 to 5 seconds to complete; the second query chose join order [ V1, V2 ] and took 30-33 seconds to complete. Wow.

I made a small change to the CostEstimateImpl.compare() method to consider rowCount as a secondary field of comparison (if the estimatedCosts are the same) and singleScanRowCount as a third field of comparison (if the rowCounts are the same). With that change the behavior was then what I expected: namely, the optimizer chose the same join order (the faster one) regardless of which query I ran.
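
For reference, here's a standalone sketch of the kind of tie-breaking comparison I mean (an illustration of the idea, not the actual patch to CostEstimateImpl.compare(); the class and field names here are simplified):

// Compare estimatedCost first; fall back to rowCount, then
// singleScanRowCount, only when the earlier fields are equal.
public class CostCompareSketch {
    static class Cost {
        double estimatedCost, rowCount, singleScanRowCount;
        Cost(double estimatedCost, double rowCount, double singleScanRowCount) {
            this.estimatedCost = estimatedCost;
            this.rowCount = rowCount;
            this.singleScanRowCount = singleScanRowCount;
        }
        double compare(Cost other) {
            if (estimatedCost != other.estimatedCost)
                return estimatedCost - other.estimatedCost;
            if (rowCount != other.rowCount)
                return rowCount - other.rowCount;
            return singleScanRowCount - other.singleScanRowCount;
        }
    }

    public static void main(String[] args) {
        Cost current = new Cost(1000.0, 10.0, 10.0);
        Cost best    = new Cost(1000.0, 100000.0, 100000.0);
        // Same estimatedCost, but "current" has the lower row count,
        // so it now compares as the better (lower) plan.
        System.out.println(current.compare(best) < 0);  // prints true
    }
}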

So my question is this: if two cost estimates have the same estimatedCost but different rowCounts (or even singleScanRowCounts), is it reasonable to suppose that the estimate with the lower rowCount is actually better? In the above case the query with the lower row count was the one that ran more quickly--is it reasonable to believe that this will generally be the case?

---

As always, thanks to anyone with the time and inclination to read this email and provide feedback...

Army
