We have HADOOP-2604 open, which is about creating a brand-new MapFile
implementation with HBase's needs specifically in mind. Some of the
things we'd like to do there (different indexing schemes, etc.) seem
like they would be hard to implement with the existing MapFile even
if we had the ability to subclass it better. I think in terms of
easing ongoing HBase development, it would make sense to go in our
own direction.
-Bryan
On Jan 16, 2008, at 11:14 AM, Jim Kellerman wrote:
HBase has several subclasses of MapFile already:
  org.apache.hadoop.hbase.HStoreFile$HbaseMapFile
  org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile
  org.apache.hadoop.hbase.HStoreFile$HalfMapFileReader
If MapFile were more subclassable (had protected members instead of
private ones, or accessor methods), we would probably add client-side
caching and bloom filters (to determine whether a key exists in a map
file; this differs from BloomFilterMapFile above, which is a mix-in
of MapFile and BloomFilter).
Tom White said (in https://issues.apache.org/jira/browse/HADOOP-2604):
If MapFile.Reader were an interface (or an abstract class with a
no-args constructor) then BloomFilterMapFile.Reader,
HalfMapFileReader and caching Readers could be implemented as
wrappers instead of in a static hierarchy.
This would make it easier to mix and match readers (e.g. with or
without caching) without passing all possible parameters in the
constructor.
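
To make the wrapper idea concrete, here is a minimal sketch in plain
Java. It does not use the real MapFile API: the Reader interface,
BaseReader, CachingReader and BloomFilterReader are all hypothetical
names, the "file" is an in-memory map, and the bloom filter is a toy
with a single hash function. It only shows how wrappers can be stacked
in any combination without a constructor that knows about every option.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not Hadoop's MapFile API.
class ReaderWrappers {

    /** Minimal lookup contract that the base reader and every wrapper implement. */
    interface Reader {
        String get(String key); // returns null when the key is absent
    }

    /** Stand-in for an on-disk reader; counts lookups so the wrappers' effect is visible. */
    static final class BaseReader implements Reader {
        private final Map<String, String> data;
        int lookups = 0;

        BaseReader(Map<String, String> data) { this.data = data; }

        @Override public String get(String key) {
            lookups++;
            return data.get(key);
        }
    }

    /** Wrapper that remembers values already found, so repeated lookups skip the delegate. */
    static final class CachingReader implements Reader {
        private final Reader delegate;
        private final Map<String, String> cache = new HashMap<>();

        CachingReader(Reader delegate) { this.delegate = delegate; }

        @Override public String get(String key) {
            String value = cache.get(key);
            if (value == null) {
                value = delegate.get(key);
                if (value != null) cache.put(key, value);
            }
            return value;
        }
    }

    /** Wrapper that skips the delegate when a toy one-hash bloom filter rules the key out. */
    static final class BloomFilterReader implements Reader {
        private static final int BITS = 1024;
        private final Reader delegate;
        private final BitSet bits = new BitSet(BITS);

        BloomFilterReader(Reader delegate, Iterable<String> keys) {
            this.delegate = delegate;
            for (String key : keys) bits.set(slot(key));
        }

        private static int slot(String key) {
            return Math.floorMod(key.hashCode(), BITS);
        }

        @Override public String get(String key) {
            // A clear bit means the key was never added; a set bit may be a false positive.
            return bits.get(slot(key)) ? delegate.get(key) : null;
        }
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("row1", "a");
        data.put("row2", "b");
        BaseReader base = new BaseReader(data);

        // Mix and match: caching on top of a bloom filter on top of the base reader.
        Reader reader = new CachingReader(new BloomFilterReader(base, data.keySet()));
        System.out.println(reader.get("row1"));
        System.out.println(reader.get("row1")); // served from the cache
        System.out.println("base lookups: " + base.lookups);
    }
}
```

Dropping either wrapper is a one-line change at the call site, which is
the "mix and match" property described above.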
So we'd like to make MapFile (and probably SequenceFile)
subclassable by providing accessors and/or making members protected
instead of private.
If these classes should not be subclassed, they should be declared
as final classes.
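
As a rough illustration of the two options, the sketch below uses
made-up classes (SortedKeyFile, BottomHalfFile and ClosedFile are
hypothetical and are not MapFile or SequenceFile). The base class keeps
its field private but offers protected accessors, which is enough for a
subclass to restrict lookups to the lower half of the keys; the last
class is marked final to show how "do not subclass" would be stated.

```java
import java.util.Arrays;

// Hypothetical names throughout; this is not the real MapFile or SequenceFile.
class SubclassingSketch {

    /** Base class that keeps its state private but exposes protected accessors. */
    static class SortedKeyFile {
        private final String[] keys; // sorted ascending, like keys in a map file

        SortedKeyFile(String[] sortedKeys) { this.keys = sortedKeys.clone(); }

        protected int size() { return keys.length; }

        protected String keyAt(int index) { return keys[index]; }

        public boolean contains(String key) {
            return Arrays.binarySearch(keys, key) >= 0;
        }
    }

    /** Subclass that serves only keys below the midpoint, using just the accessors. */
    static class BottomHalfFile extends SortedKeyFile {
        BottomHalfFile(String[] sortedKeys) { super(sortedKeys); }

        @Override public boolean contains(String key) {
            if (size() == 0) return false;
            String midKey = keyAt(size() / 2);
            return key.compareTo(midKey) < 0 && super.contains(key);
        }
    }

    /** A class that is not meant to be extended says so with final. */
    static final class ClosedFile extends SortedKeyFile {
        ClosedFile(String[] sortedKeys) { super(sortedKeys); }
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c", "d"};
        System.out.println(new SortedKeyFile(keys).contains("c"));
        System.out.println(new BottomHalfFile(keys).contains("c"));
        System.out.println(new BottomHalfFile(keys).contains("a"));
    }
}
```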
Thoughts? Opinions? Comments?
---
Jim Kellerman, Senior Engineer; Powerset