Aman Sinha created DRILL-7187:
---------------------------------

             Summary: Improve selectivity estimates for range predicates when 
using histogram
                 Key: DRILL-7187
                 URL: https://issues.apache.org/jira/browse/DRILL-7187
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Aman Sinha
            Assignee: Aman Sinha


Two selectivity estimation improvements are needed:

1.  For range predicates on the same column, we need to collect all such 
predicates into one group and do a single histogram lookup for the group. 
For instance: 
{noformat}
 WHERE a > 10 AND b < 20 AND c = 100 AND a <= 50 AND b < 50
{noformat}
 Currently, Drill treats each conjunct independently and multiplies the 
individual selectivities.  However, range predicates on the same column are 
not independent, so the product does not give an accurate estimate. Here, we 
want to group the predicates on 'a' together (giving 10 < a <= 50) and do a 
single lookup.  Similarly for 'b', where the two predicates reduce to b < 20.  
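
The grouping described above could be sketched roughly as follows. This is 
an illustrative Python sketch, not Drill's actual implementation: the names 
group_ranges, cumulative_fraction and range_selectivity are hypothetical, the 
histogram is assumed to be equi-depth with made-up bucket boundaries, and 
inclusive vs. exclusive bounds are ignored for simplicity.

```python
import math
from collections import defaultdict

RANGE_LOWER = (">", ">=")
RANGE_UPPER = ("<", "<=")

def group_ranges(predicates):
    """Collapse range predicates per column into one (low, high) interval.

    predicates: list of (column, op, value). Non-range operators such as
    '=' are skipped here and would be estimated separately.
    """
    bounds = defaultdict(lambda: [-math.inf, math.inf])
    for col, op, val in predicates:
        if op in RANGE_LOWER:
            bounds[col][0] = max(bounds[col][0], val)  # tightest lower bound
        elif op in RANGE_UPPER:
            bounds[col][1] = min(bounds[col][1], val)  # tightest upper bound
    return {col: (lo, hi) for col, (lo, hi) in bounds.items()}

def cumulative_fraction(boundaries, x):
    """Fraction of values below x in an equi-depth histogram.

    boundaries: sorted bucket boundaries; every bucket is assumed to hold
    the same number of rows, with values spread uniformly inside a bucket.
    """
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return 1.0
    num_buckets = len(boundaries) - 1
    for i in range(num_buckets):
        lo, hi = boundaries[i], boundaries[i + 1]
        if lo <= x < hi:
            return (i + (x - lo) / (hi - lo)) / num_buckets
    return 1.0

def range_selectivity(boundaries, low, high):
    """Single histogram lookup for the combined interval (low, high)."""
    return max(0.0, cumulative_fraction(boundaries, high)
                    - cumulative_fraction(boundaries, low))

# The predicates from the example WHERE clause.
predicates = [("a", ">", 10), ("b", "<", 20), ("c", "=", 100),
              ("a", "<=", 50), ("b", "<", 50)]
grouped = group_ranges(predicates)

boundaries_a = [0, 25, 50, 75, 100]  # made-up histogram for column 'a'
low, high = grouped["a"]
grouped_sel = range_selectivity(boundaries_a, low, high)

# What multiplying the two conjuncts on 'a' independently would give.
independent_sel = ((1.0 - cumulative_fraction(boundaries_a, 10))
                   * cumulative_fraction(boundaries_a, 50))
```

With these made-up boundaries the single lookup for 'a' gives about 0.40, 
whereas multiplying the two conjuncts independently gives about 0.45.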

2. The histogram does not track NULLs, but since NULL values never satisfy a 
range predicate, the selectivity calculation should use totalRowCount as the 
denominator rather than the non-null count. 
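
A small Python sketch of the intended arithmetic, under the assumption that 
the histogram lookup returns the matching fraction of non-null rows. The 
function and parameter names are hypothetical and do not come from Drill's 
code.

```python
def selectivity_with_nulls(non_null_fraction, non_null_count, total_row_count):
    """Scale a histogram-derived fraction so the denominator is all rows.

    non_null_fraction: fraction of NON-NULL rows that satisfy the predicate,
                       as estimated from the histogram.
    non_null_count:    number of non-null values in the column.
    total_row_count:   number of rows in the table, including NULLs.
    """
    if total_row_count <= 0:
        return 0.0
    estimated_matching_rows = non_null_fraction * non_null_count
    return estimated_matching_rows / total_row_count

# Example with made-up numbers: 1000 rows, 200 of them NULL in this column,
# and the histogram says 40% of the non-null values fall in the range.
sel = selectivity_with_nulls(0.4, 800, 1000)
```

In this made-up example the selectivity comes out to roughly 0.32 instead of 
the 0.40 that would result from dividing by the non-null count.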



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
