Laljo John Pullokkaran created HIVE-6540:
--------------------------------------------

             Summary: Support Multi Column Stats
                 Key: HIVE-6540
                 URL: https://issues.apache.org/jira/browse/HIVE-6540
             Project: Hive
          Issue Type: Improvement
            Reporter: Laljo John Pullokkaran
            Assignee: Laljo John Pullokkaran


For joins involving compound predicates, multi-column statistics can be used to compute the NDV (number of distinct values) accurately.

The objective is to compute the NDV of more than one column taken together, e.g. the NDV of (x, y, z).

An inner join R1 IJ R2 on R1.x=R2.x and R1.y=R2.y and R1.z=R2.z can use max(NDV(R1.x, R1.y, R1.z), NDV(R2.x, R2.y, R2.z)) as the join NDV (and hence to derive the join selectivity).

http://www.oracle-base.com/articles/11g/statistics-collection-enhancements-11gr1.php#multi_column_statistics
http://blogs.msdn.com/b/ianjo/archive/2005/11/10/491548.aspx
http://developer.teradata.com/database/articles/removing-multi-column-statistics-a-process-for-identification-of-redundant-statist



--
This message was sent by Atlassian JIRA
(v6.2#6252)
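
The join NDV rule in the description can be illustrated with a small sketch. This is not Hive code: the helper names (ndv, join_selectivity) and the toy relations are hypothetical, NDV is counted exactly rather than estimated, and the selectivity is assumed to be the textbook 1 / max(NDV) for an equi-join. It also shows why per-column NDVs multiplied under an independence assumption can badly overestimate the combined NDV when columns are correlated.

```python
def ndv(rows, cols):
    """Exact number of distinct value combinations over the given columns."""
    return len({tuple(row[c] for c in cols) for row in rows})


def join_selectivity(r1, r2, cols):
    """Assumed textbook estimate: 1 / max(NDV(R1.cols), NDV(R2.cols))."""
    join_ndv = max(ndv(r1, cols), ndv(r2, cols))
    return 1.0 / join_ndv if join_ndv else 0.0


# Toy relations in which x, y and z are perfectly correlated.
r1 = [{"x": i % 2, "y": i % 2, "z": i % 2} for i in range(8)]
r2 = [{"x": i % 4, "y": i % 4, "z": i % 4} for i in range(8)]
cols = ("x", "y", "z")

# Multi-column NDV as proposed in the issue.
multi_ndv = max(ndv(r1, cols), ndv(r2, cols))

# Independence assumption: multiply the per-column join NDVs.
indep_ndv = 1
for c in cols:
    indep_ndv *= max(ndv(r1, (c,)), ndv(r2, (c,)))

# Estimated join output size = |R1| * |R2| * selectivity.
est_rows = len(r1) * len(r2) * join_selectivity(r1, r2, cols)

# Actual join size, computed by brute force for comparison.
actual_rows = sum(
    1 for a in r1 for b in r2 if all(a[c] == b[c] for c in cols)
)

print(multi_ndv, indep_ndv, est_rows, actual_rows)
```

On this toy data the multi-column NDV is 4 while the independence product is 64, so the multi-column estimate matches the actual join size of 16 rows and the independence-based estimate would predict a single row.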