[ https://issues.apache.org/jira/browse/HBASE-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935818#comment-14935818 ]
Lars Hofhansl edited comment on HBASE-14509 at 10/2/15 6:16 AM:
----------------------------------------------------------------

[~lhofhansl], FYI HBASE-14511 - StoreFile.Writer Meta plugin framework. I need only the Meta section, and only for the Writer. For your sparse indexes, you will need a full Reader/Writer plugin (both meta and data blocks). It is just one way of doing indexes, of course.

was (Author: vrodionov):
[~lhofhansl], FYI https://issues.apache.org/jira/browse/HBASE-14511 - StoreFile.Writer Meta plugin framework. I need only the Meta section, and only for the Writer. For your sparse indexes, you will need a full Reader/Writer plugin (both meta and data blocks). It is just one way of doing indexes, of course.

> Configurable sparse indexes?
> ----------------------------
>
>                 Key: HBASE-14509
>                 URL: https://issues.apache.org/jira/browse/HBASE-14509
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>
> This idea just popped up today and I wanted to record it for discussion:
> What if we kept sparse column indexes per region, per HFile, or per
> configurable range?
> I.e., for any given CQ we record the lowest and highest value for a particular
> range (HFile, region, or a custom range like the Phoenix guideposts).
> By tweaking the size of these ranges we can trade the size of the index
> against its selectivity.
> For example, if we kept it by HFile we could almost instantly decide whether we
> need to scan a particular HFile at all to find a particular value in a Cell.
> We can also collect min/max values for each n MB of data, for example when we
> scan the region the first time. Assuming the ranges are large enough, we can
> always keep the index in memory together with the region.
> Kind of a sparse local index. Might be much easier than the buddy region stuff
> we've been discussing.
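
The min/max idea in the description can be illustrated with a small in-memory sketch. This is not HBase code and not a proposed implementation: the class `SparseColumnIndex`, its methods, and the range ids such as "hfile-1" are hypothetical names chosen for illustration. It assumes values are compared as raw bytes in lexicographic order, and that every cell of an indexed range has been recorded.

```python
class SparseColumnIndex:
    """Per-range min/max values for each column qualifier (hypothetical sketch)."""

    def __init__(self):
        # range_id -> {qualifier -> (lowest value, highest value)}
        self._ranges = {}

    def record(self, range_id, qualifier, value):
        """Fold one cell value into the min/max kept for its range."""
        stats = self._ranges.setdefault(range_id, {})
        lo, hi = stats.get(qualifier, (value, value))
        stats[qualifier] = (min(lo, value), max(hi, value))

    def may_contain(self, range_id, qualifier, value):
        """Return False only when the range provably cannot hold the value."""
        stats = self._ranges.get(range_id)
        if stats is None:
            return True   # range was never indexed, so it cannot be skipped
        if qualifier not in stats:
            return False  # indexed range holds no cell for this qualifier
        lo, hi = stats[qualifier]
        return lo <= value <= hi

    def ranges_to_scan(self, qualifier, value):
        """List the indexed ranges that still have to be scanned."""
        return [r for r in self._ranges if self.may_contain(r, qualifier, value)]


# Build the index, e.g. while the data is scanned for the first time.
index = SparseColumnIndex()
for range_id, qualifier, value in [
    ("hfile-1", b"age", b"020"),
    ("hfile-1", b"age", b"035"),
    ("hfile-2", b"age", b"050"),
    ("hfile-2", b"age", b"070"),
    ("hfile-2", b"zip", b"94105"),
]:
    index.record(range_id, qualifier, value)

# Only ranges whose [min, max] interval covers the value remain candidates.
candidates = index.ranges_to_scan(b"age", b"030")
```

Like any min/max filter, this can only rule ranges out. A range whose interval covers the value may still not contain it, so smaller ranges give better selectivity at the cost of a larger index.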