[ http://issues.apache.org/jira/browse/DERBY-1905?page=all ]

A B updated DERBY-1905:
-----------------------

    Attachment: d1905.sql

Attaching file referenced in the description for this issue.

> Optimizer cost estimates for subqueries are way (way) too high.
> ---------------------------------------------------------------
>
>                 Key: DERBY-1905
>                 URL: http://issues.apache.org/jira/browse/DERBY-1905
>             Project: Derby
>          Issue Type: Bug
>          Components: Performance, SQL
>    Affects Versions: 10.2.1.5, 10.3.0.0, 10.1.3.2
>            Reporter: A B
>         Attachments: d1905.sql
>
>
> If I run the attached repro (pulled from DERBY-1866) with 
> derby.optimizer.noTimeout=true (meaning the optimizer should take all the 
> time it needs to try out all possible plans within the restrictions of the 
> optimizer overrides), the estimated cost shown in derby.log (if I log the 
> query plan) is over 600k and the estimated row count is over 520k.
> If, as the code in OptimizerImpl seems to expect, the unit for a cost 
> estimate is milliseconds, then the optimize here is guessing that the query 
> will take over 10 minutes to execute and will return a half-million rows.  
> But in truth the *combined* time for compilation AND execution is just a 
> couple of seconds, and only 15 rows are returned.
> That suggests to me a rather serious problem with the optimizer cost 
> estimates for subqueries.
> Among other things this can have a major impact on the optimizer's timeout 
> mechanism for very deeply-nested queries.  The optimizer will continue to try 
> out different combinations of indexes/joinOrders/joinStrategies until it 
> either exhausts them all or until the number of milliseconds spent optimizing 
> is greater than the "best cost" estimate so far. In the case of the repro for 
> this issue, the optimizer quickly exhausts all of the options and so it 
> finishes in a fair amount of time.
> But in larger queries where there are far more combinations to try (see esp. 
> queries attached to DERBY-1205, DERBY-1777), these severly inflated cost 
> estimates get very large very quickly (sometimes even reaching infinity--see 
> DERBY-1259, DERBY-1260) and so the optimizer just keeps on optimizing and 
> never times out.  The result is that for queries like those in DERBY-1777, 
> the optimizer can spend literally hours optimizing a query which then 
> executes in a matter of seconds.
> I'm still investigating this, but preliminary examination suggests that part 
> of the problem is in some way related to the treatment of "outer costs" by 
> the optimizer--and in particular, it looks like the optimizer is perhaps too 
> aggressive in multiplying cost estimates by row counts pulled from "outer 
> costs".  That's just my first guess at the problem, though; there could of 
> course be far more to it than that...

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to