[ 
https://issues.apache.org/jira/browse/HBASE-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609357#action_12609357
 ] 

jimk edited comment on HBASE-696 at 6/30/08 1:32 PM:
--------------------------------------------------------------

stack wrote:
> Remove bloomfilter options. Only one bloomfilter type makes sense in hbase 
> context.

True.

> Also, make bloomfilter self-sizing; you know size when flushing.

However, you can't easily know the size when doing a compaction.
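
To illustrate why the entry count matters for self-sizing, here is a rough
sketch using the standard bloom filter sizing formulas. It is not HBase code:
the class name, method names, and the example error rate are assumptions made
up for this illustration.

```java
// Hypothetical helper, not part of HBase: sizes a bloom filter from the
// expected number of entries n and a target false-positive rate p.
final class BloomSizing {
    private BloomSizing() {}

    // Standard formula: m = ceil(-n * ln(p) / (ln 2)^2) bits.
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    // Standard formula: k = round((m / n) * ln 2) hash functions, at least 1.
    static int optimalHashes(long m, long n) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }
}
```

At flush time n is the number of entries being written, so both values can be
computed before the filter is built. At compaction time n is only known after
the input files have been merged, which is the difficulty noted above.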

Question: Is the bloomfilter keyed on the row key alone; row and column; row, 
column and timestamp; or row and timestamp?

It seems as if basing the bloomfilter solely on the row key would be the most 
useful. If the timestamp is part of the key, a get or scan with 
LATEST_TIMESTAMP won't match anything in the bloomfilter. Similarly, a 
row/family:member key doesn't make sense if you are fetching by column 
wildcard (family:).

Using row/family: might be another option.
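
A minimal sketch of the key-composition problem, under stated assumptions: a
HashSet stands in for the bloomfilter (only the choice of key matters here,
not the false-positive behaviour), the class is hypothetical, and
LATEST_TIMESTAMP is assumed to be a sentinel value meaning "newest version".

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not HBase code.
final class BloomKeyDemo {
    // Assumed sentinel for "newest version"; a stand-in for the real constant.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // A HashSet stands in for the bloomfilter: exact membership, no false
    // positives, which is enough to show the effect of key composition.
    private final Set<String> filter = new HashSet<>();
    private final boolean includeTimestamp;

    BloomKeyDemo(boolean includeTimestamp) {
        this.includeTimestamp = includeTimestamp;
    }

    // Builds the filter key either from the row alone or from row + timestamp.
    private String key(String row, long timestamp) {
        return includeTimestamp ? row + "/" + timestamp : row;
    }

    void add(String row, long timestamp) {
        filter.add(key(row, timestamp));
    }

    boolean mightContain(String row, long timestamp) {
        return filter.contains(key(row, timestamp));
    }
}
```

With includeTimestamp set to true, a cell stored at a concrete timestamp is
not found when probed with LATEST_TIMESTAMP, so the filter would wrongly rule
out a file that holds the row. With a row-only key the same probe succeeds.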

> Putting in 0.2 for now because its API change (for the simpler). We can punt 
> later.

With respect to the API change, would it be sufficient to change 
HColumnDescriptor so that bloomFilter is a boolean? That would require a 
migration step.

BloomFilterDescriptor could then be moved to 
org.apache.hadoop.hbase.regionserver and become package private.


  
> Make bloomfilter true/false and self-sizing
> -------------------------------------------
>
>                 Key: HBASE-696
>                 URL: https://issues.apache.org/jira/browse/HBASE-696
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>             Fix For: 0.2.0
>
>
> Remove bloomfilter options.  Only one bloomfilter type makes sense in hbase 
> context.  Also, make bloomfilter self-sizing; you know size when flushing.
> Putting in 0.2 for now because its API change (for the simpler).  We can punt 
> later.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
