We have HADOOP-2604 open, which is about creating a brand-new MapFile implementation with HBase's needs specifically in mind. Some of the things we'd like to do there (different indexing schemes, etc.) seem like they would be hard to implement with the existing MapFile even if it were easier to subclass. I think that, in terms of easing ongoing HBase development, it would make sense to go in our own direction.

-Bryan

On Jan 16, 2008, at 11:14 AM, Jim Kellerman wrote:

HBase has several subclasses of MapFile already:
org.apache.hadoop.hbase.HStoreFile$
  HbaseMapFile
  BloomFilterMapFile
  HalfMapFileReader

If MapFile were more subclassable (had protected members instead of private ones, or accessor methods), we would probably add client-side caching and bloom filters (to determine whether a key exists in a map file; this is different from BloomFilterMapFile above, which is a mix-in of MapFile and BloomFilter).

Tom White said (in https://issues.apache.org/jira/browse/HADOOP-2604):
If MapFile.Reader were an interface (or an abstract class with a no-args
constructor) then BloomFilterMapFile.Reader, HalfMapFileReader and
caching Readers could be implemented as wrappers instead of in a static
hierarchy.

This would make it easier to mix and match readers (e.g. with or
without caching) without passing all possible parameters in the
constructor.
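
To make the wrapper idea concrete, here is a rough sketch of how it might look. Everything in it is hypothetical: KeyValueReader stands in for an interface version of MapFile.Reader (which is a concrete class today), InMemoryReader stands in for a reader backed by a real file, and LowerHalfReader assumes that a "half" reader simply restricts lookups to keys below some midkey. It is meant to illustrate the shape of the design, not the actual Hadoop or HBase API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical interface; MapFile.Reader is currently a concrete class.
interface KeyValueReader {
    // Returns the value for the key, or null if the key is absent.
    String get(String key);
}

// Stand-in for a file-backed reader: a sorted in-memory map that counts lookups.
class InMemoryReader implements KeyValueReader {
    private final TreeMap<String, String> data;
    int lookups = 0;

    InMemoryReader(Map<String, String> data) {
        this.data = new TreeMap<>(data);
    }

    public String get(String key) {
        lookups++;
        return data.get(key);
    }
}

// Wrapper that caches successful lookups made through any other reader.
class CachingReader implements KeyValueReader {
    private final KeyValueReader delegate;
    private final Map<String, String> cache = new HashMap<>();

    CachingReader(KeyValueReader delegate) {
        this.delegate = delegate;
    }

    public String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String value = delegate.get(key);
        if (value != null) {
            cache.put(key, value);
        }
        return value;
    }
}

// Wrapper that only exposes keys strictly below midKey (assumed "half" semantics).
class LowerHalfReader implements KeyValueReader {
    private final KeyValueReader delegate;
    private final String midKey;

    LowerHalfReader(KeyValueReader delegate, String midKey) {
        this.delegate = delegate;
        this.midKey = midKey;
    }

    public String get(String key) {
        return key.compareTo(midKey) < 0 ? delegate.get(key) : null;
    }
}
```

With this shape, a cached half reader would be composed as new CachingReader(new LowerHalfReader(base, "n")), and dropping the cache just means dropping the outer wrapper, with no extra constructor parameters on any one class.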

So we'd like to make MapFile (and probably SequenceFile) subclassable by providing accessors and/or making members protected instead of private.

If these classes should not be subclassed, they should be declared as final classes.
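
For illustration only, the two options might look roughly like the sketch below. SimpleReader, BloomFilterReader and SealedReader are made-up names, not Hadoop classes, and the two-hash BitSet filter is a toy stand-in for a real bloom filter. The point is that the subclass can only build its filter because the index member is protected rather than private.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical subclassable reader; not the real MapFile.Reader.
class SimpleReader {
    // Protected rather than private, so subclasses can inspect the index.
    protected final TreeMap<String, String> index = new TreeMap<>();
    int lookups = 0;

    SimpleReader(Map<String, String> entries) {
        index.putAll(entries);
    }

    String get(String key) {
        lookups++;
        return index.get(key);
    }
}

// Subclass that consults a toy bloom filter before doing the real lookup.
class BloomFilterReader extends SimpleReader {
    private static final int BITS = 1024;
    private final BitSet filter = new BitSet(BITS);

    BloomFilterReader(Map<String, String> entries) {
        super(entries);
        // Only possible because 'index' is visible to subclasses.
        for (String key : index.keySet()) {
            filter.set(bit1(key));
            filter.set(bit2(key));
        }
    }

    private static int bit1(String key) {
        return Math.floorMod(key.hashCode(), BITS);
    }

    private static int bit2(String key) {
        return Math.floorMod(key.hashCode() * 31 + 17, BITS);
    }

    // False means the key is definitely absent; true means it may be present.
    boolean mightContain(String key) {
        return filter.get(bit1(key)) && filter.get(bit2(key));
    }

    @Override
    String get(String key) {
        return mightContain(key) ? super.get(key) : null;
    }
}

// The alternative: state the intent explicitly so the compiler rejects subclasses.
final class SealedReader {
    private final TreeMap<String, String> index = new TreeMap<>();

    SealedReader(Map<String, String> entries) {
        index.putAll(entries);
    }

    String get(String key) {
        return index.get(key);
    }
}
```

Declaring a class that extends SealedReader would fail to compile, which at least makes the "do not subclass" decision explicit instead of leaving it implied by private members.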

Thoughts? Opinions? Comments?

---
Jim Kellerman, Senior Engineer; Powerset
