[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758127#action_12758127 ]
Prasad Chakka commented on HIVE-417:
------------------------------------

I don't think it makes much sense unless there is some clustering or sorting property. If there is clustering and sorting, and the selectivity of a query is much higher than 10%, then storing this metadata along with the data makes sense instead of in a separate block. The 10% threshold may be larger for Hive, but the point still stands. In the OLAP case, data changes seldom and the size of this kind of metadata is much smaller than the data itself, so the overhead of storing it is negligible. Something similar is done in DB2 Multi-Dimensional Clustering, where whole blocks (disk blocks) are skipped if the key value doesn't fit the query.

> Implement Indexing in Hive
> --------------------------
>
>                 Key: HIVE-417
>                 URL: https://issues.apache.org/jira/browse/HIVE-417
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>    Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.4.0
>            Reporter: Prasad Chakka
>            Assignee: He Yongqiang
>         Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch
>
>
> Implement indexing on Hive so that lookup and range queries are efficient.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
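
The block-skipping idea in the comment (keep small key metadata alongside each block and skip whole blocks whose keys cannot match the query) can be illustrated with a minimal sketch. This is not Hive or DB2 code: the `Block` class, `build_blocks`, and `range_scan` are hypothetical names, and per-block min/max keys are an assumed form of the metadata, chosen only to show why the approach depends on the data being sorted or clustered.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical data block that stores min/max key metadata with its rows."""
    min_key: int
    max_key: int
    rows: List[Tuple[int, str]]  # (key, payload) pairs


def build_blocks(rows, block_size):
    """Sort rows by key (the sorting property) and cut them into fixed-size blocks."""
    ordered = sorted(rows, key=lambda r: r[0])
    blocks = []
    for i in range(0, len(ordered), block_size):
        chunk = ordered[i:i + block_size]
        # chunk is sorted, so its first and last keys are the block's min and max
        blocks.append(Block(chunk[0][0], chunk[-1][0], chunk))
    return blocks


def range_scan(blocks, lo, hi):
    """Return rows with lo <= key <= hi, plus the number of blocks actually read."""
    hits, blocks_read = [], 0
    for b in blocks:
        if b.max_key < lo or b.min_key > hi:
            continue  # key range cannot overlap the query: skip the whole block
        blocks_read += 1
        hits.extend(r for r in b.rows if lo <= r[0] <= hi)
    return hits, blocks_read
```

With sorted data, a narrow range query touches only the few blocks whose key ranges overlap it. If the rows were not sorted before being cut into blocks, most blocks would span nearly the full key range and almost none could be skipped, which is the reason the comment ties the usefulness of this metadata to clustering or sorting.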