[ https://issues.apache.org/jira/browse/DERBY-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

A B updated DERBY-2482:
-----------------------

          Description: 
Pulled from the write-up for DERBY-47:

<start>

The costing logic for probing as implemented in DERBY-47 is not perfect. It 
works well when the IN list is significantly smaller than the number of rows 
in the table; for example, we see good results for the test program attached 
to that issue when there are 100,000 rows in the target table. However, I 
discovered that if we run with only 10,000 rows, then once we hit 1,000 values 
in the IN list the costing of probe predicates causes the optimizer to 
conclude that probing would be too expensive, so it ends up doing a table 
scan. In truth the table scan is still far slower than index probing, but the 
relative size of the IN list with respect to the number of rows in the table 
throws the costing off. To confirm this I removed the probing cost logic (so 
that the IN list is effectively costed as a single "col = ?" predicate), and 
the optimizer then chose index probing for the 10,000-row scenario, which was 
much, much faster (as expected). 

<end>

So it would be great if the costing could be improved so that the optimizer 
recognizes cases like this and chooses to do multi-probing. DERBY-47 showed 
that multi-probing can significantly improve execution times, so the more 
often we can get the optimizer to (correctly) choose it, the better.

  was:
Pulled from the write-up for DERBY-47:


    Affects Version/s: 10.3.0.0

Oops, somehow submitted the issue without a complete description.  Including 
the description now...
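
To make the scenario concrete, here is a minimal, self-contained sketch of the 
kind of test described above (it is not the actual test program attached to 
DERBY-47). It builds a 10,000-row table with an index on the probed column, 
runs a 1,000-value IN-list query with runtime statistics enabled, and prints 
the query plan so one can see whether the optimizer chose index multi-probing 
or a table scan. The database, table, and column names are made up for 
illustration, and derby.jar is assumed to be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class InListProbeRepro {
    public static void main(String[] args) throws Exception {
        // Embedded Derby database (hypothetical name); created on first connect.
        Connection conn =
            DriverManager.getConnection("jdbc:derby:probeDb;create=true");
        Statement s = conn.createStatement();

        // Hypothetical schema: one indexed INT column to probe on.
        s.execute("CREATE TABLE T (ID INT NOT NULL, PAYLOAD VARCHAR(100))");
        s.execute("CREATE INDEX IX_T_ID ON T (ID)");

        // Populate 10,000 rows -- the table size at which the probing cost
        // estimate reportedly tips the optimizer toward a table scan.
        conn.setAutoCommit(false);
        PreparedStatement ins = conn.prepareStatement("INSERT INTO T VALUES (?, ?)");
        for (int i = 0; i < 10000; i++) {
            ins.setInt(1, i);
            ins.setString(2, "row " + i);
            ins.addBatch();
        }
        ins.executeBatch();
        conn.commit();

        // Build a query with a 1,000-value IN list.
        StringBuilder sql = new StringBuilder("SELECT * FROM T WHERE ID IN (");
        for (int i = 0; i < 1000; i++) {
            if (i > 0) sql.append(", ");
            sql.append(i * 10);
        }
        sql.append(")");

        // With runtime statistics enabled, the captured plan shows whether the
        // optimizer used the index (multi-probe) or scanned the table.
        s.execute("CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1)");
        ResultSet rs = s.executeQuery(sql.toString());
        while (rs.next()) { /* drain the result set */ }
        rs.close();

        ResultSet plan =
            s.executeQuery("VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS()");
        if (plan.next()) {
            System.out.println(plan.getString(1));
        }
        plan.close();
        conn.commit();
        conn.close();
    }
}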

> Improve optimizer costing for "probe predicates" so that index multi-probing 
> is more likely to occur.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2482
>                 URL: https://issues.apache.org/jira/browse/DERBY-2482
>             Project: Derby
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 10.3.0.0
>            Reporter: A B
>            Priority: Minor
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
