[hbase] HStoreFiles needlessly store the column family name in every entry
--------------------------------------------------------------------------
Key: HADOOP-2521
URL: https://issues.apache.org/jira/browse/HADOOP-2521
Project: Hadoop
Issue Type: Improvement
Components: contrib/hbase
Reporter: Bryan Duxbury
Priority: Minor
Today, HStoreFiles store an entire serialized HStoreKey object for every cell
in the HStore. Since HStores map one-to-one to column families, this is
unnecessary: the column family can always be determined from the HStore the
file belongs to. (That information would presumably come from the file name or
a header section.) We could therefore drop the column family part of the
HStoreKeys we write into the HStoreFile, reducing the size of the stored data.
Removing this redundant data would save space, and it could also improve
speed, since there is less data to scan in memory and less to transfer over
the network.
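
To make the potential saving concrete, here is a rough sketch in Java. It does
not use the real HStoreKey serialization; the key layout (length-prefixed row,
length-prefixed "family:qualifier" column, 8-byte timestamp) and every name in
it (FamilyStrippedKeySketch, encodeFullKey, encodeStrippedKey, restoreColumn)
are assumptions made up for illustration. It only shows that the family prefix
is repeated in every entry and can be re-attached by the store at read time.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the actual HStoreKey format.
public class FamilyStrippedKeySketch {

    // Assumed "full" layout: row, "family:qualifier", timestamp.
    static byte[] encodeFullKey(String row, String family, String qualifier, long ts) {
        return encode(row, family + ":" + qualifier, ts);
    }

    // Same layout, but the column holds only the qualifier; the family is
    // implied by the store (file name or header) that the entry lives in.
    static byte[] encodeStrippedKey(String row, String qualifier, long ts) {
        return encode(row, qualifier, ts);
    }

    // Length-prefixed UTF-8 row and column, followed by an 8-byte timestamp.
    private static byte[] encode(String row, String column, long ts) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] c = column.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + r.length + 4 + c.length + 8);
        buf.putInt(r.length).put(r).putInt(c.length).put(c).putLong(ts);
        return buf.array();
    }

    // At read time the store re-attaches the family it already knows.
    static String restoreColumn(String storeFamily, String qualifier) {
        return storeFamily + ":" + qualifier;
    }

    public static void main(String[] args) {
        long ts = 1199145600000L;
        byte[] full = encodeFullKey("row-0001", "contents", "html", ts);
        byte[] stripped = encodeStrippedKey("row-0001", "html", ts);
        System.out.println("full key bytes:     " + full.length);
        System.out.println("stripped key bytes: " + stripped.length);
        System.out.println("saved per entry:    " + (full.length - stripped.length));
        System.out.println("restored column:    " + restoreColumn("contents", "html"));
    }
}
```

Under these assumptions the per-entry saving is the length of the family name
plus its separator, so the benefit would scale with the number of cells and
with how long family names are relative to the rest of the key.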