Rui Li created HIVE-10261:
-----------------------------

             Summary: Data size can be underestimated when computed with partial column stats
                 Key: HIVE-10261
                 URL: https://issues.apache.org/jira/browse/HIVE-10261
             Project: Hive
          Issue Type: Bug
            Reporter: Rui Li

With {{hive.stats.fetch.column.stats=true}}, we estimate data size from column stats when annotating operators with statistics. However, when column stats are partial, we're likely to underestimate the data size, which may hurt performance, e.g. by picking an inappropriate small table for a map join.
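
To illustrate how the underestimate can arise, here is a minimal sketch in Python. It is not Hive's actual code. It assumes that "partial" means some columns have no stats, and that a naive estimator multiplies the row count by the summed average column widths, so a column without stats contributes zero bytes. The function names, the sample table, and the {{SMALL_TABLE_THRESHOLD}} value are all hypothetical.

```python
from typing import Dict, List


def estimate_data_size(num_rows: int,
                       columns: List[str],
                       avg_col_len: Dict[str, float]) -> float:
    """Naive estimate (hypothetical): rows * sum of average column widths.

    Columns with no entry in avg_col_len silently contribute 0 bytes,
    which is where the underestimate comes from.
    """
    return num_rows * sum(avg_col_len.get(c, 0.0) for c in columns)


def has_partial_stats(columns: List[str],
                      avg_col_len: Dict[str, float]) -> bool:
    """True if at least one referenced column has no stats."""
    return any(c not in avg_col_len for c in columns)


# Hypothetical table: three columns, one million rows.
columns = ["id", "name", "payload"]
num_rows = 1_000_000

full_stats = {"id": 8.0, "name": 20.0, "payload": 500.0}
partial_stats = {"id": 8.0, "name": 20.0}  # no stats for "payload"

full_estimate = estimate_data_size(num_rows, columns, full_stats)
partial_estimate = estimate_data_size(num_rows, columns, partial_stats)

# Hypothetical size limit below which a table is treated as "small"
# enough for a map join.
SMALL_TABLE_THRESHOLD = 100_000_000

picked_with_full = full_estimate < SMALL_TABLE_THRESHOLD
picked_with_partial = partial_estimate < SMALL_TABLE_THRESHOLD
```

In this sketch the same table is estimated at 528,000,000 bytes with full stats but only 28,000,000 bytes with partial stats, so only the partial estimate falls under the hypothetical threshold. A check such as {{has_partial_stats}} is one way a planner could detect the situation and fall back to a more conservative estimate.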