[ https://issues.apache.org/jira/browse/HIVE-14018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jesus Camacho Rodriguez updated HIVE-14018:
-------------------------------------------
    Status: Patch Available  (was: In Progress)

> Make IN clause row selectivity estimation customizable
> ------------------------------------------------------
>
>                 Key: HIVE-14018
>                 URL: https://issues.apache.org/jira/browse/HIVE-14018
>             Project: Hive
>          Issue Type: Improvement
>          Components: Statistics
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Minor
>         Attachments: HIVE-14018.patch
>
>
> After HIVE-13287 went in, we calculate IN clause estimates natively (instead
> of just dividing incoming number of rows by 2). However, as the distribution
> of values of the columns is considered uniform, we might end up heavily
> underestimating/overestimating the resulting number of rows.
> This issue is to add a factor that multiplies the IN clause estimation so we
> can alleviate this problem. The solution is not very elegant, but it is the
> best we can do until we have histograms to improve our estimate.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
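
For illustration only, the idea described in the issue can be sketched as follows. This is not the code from HIVE-14018.patch. The function name, its parameters, and the exact formula (number of IN values divided by the column's number of distinct values, under the uniform-distribution assumption) are assumptions made for this sketch, and `in_factor` is a hypothetical stand-in for the configurable multiplier the issue proposes.

```python
def estimate_in_rows(num_rows, ndv, num_in_values, in_factor=1.0):
    """Rough row estimate for `col IN (v1, ..., vk)`.

    Hypothetical sketch: assumes column values are uniformly distributed,
    so each distinct value matches num_rows / ndv rows. `in_factor` scales
    the resulting selectivity up or down to compensate for skew.
    """
    if num_rows <= 0 or num_in_values <= 0:
        return 0

    # Guard against missing or zero distinct-value statistics.
    ndv = max(1, ndv)

    # Uniform assumption: k values out of ndv distinct values.
    selectivity = min(1.0, num_in_values / ndv)

    # Apply the tunable factor, then clamp so the estimate never
    # exceeds the number of incoming rows.
    scaled = min(1.0, max(0.0, selectivity * in_factor))

    return int(round(num_rows * scaled))
```

With a factor of 1.0 the estimate is the plain uniform one. A factor above 1.0 raises the estimate when the IN values are known to be frequent, and a factor below 1.0 lowers it when they are rare.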