[ https://issues.apache.org/jira/browse/HBASE-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matteo Bertozzi updated HBASE-7253:
-----------------------------------
Release Note:
The CompactionTool works at the file-system level, so the table should be
disabled before running it.
The compaction process uses the same hbase-site.xml configuration properties
as the server, such as "hbase.hstore.compactionThreshold".
You can compact a whole table, a single region, or a single family;
the input of the CompactionTool is a file-system path.
You can run the compaction as a MapReduce job or as a local process.
With the -mapred option, each family can be compacted in parallel.
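One way to follow the note above is to bracket the run with a disable and an
enable of the table. The sketch below only prints the commands so they can be
reviewed first. The function name compaction_plan is hypothetical, piping
disable/enable into "bin/hbase shell" and re-enabling the table afterwards are
assumptions about the workflow, and the hdfs:///hbase root is taken from the
examples in this note.

```shell
# Hypothetical helper: print (do not run) the steps for an offline
# compaction of one table. The body runs in a subshell, so the variable
# it sets does not leak into the caller.
compaction_plan() (
  table="$1"
  # Assumption: the table is disabled through the HBase shell.
  printf '%s\n' "echo \"disable '$table'\" | bin/hbase shell"
  # Compact every region and family of the table as a MapReduce job.
  printf '%s\n' "bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred hdfs:///hbase/$table"
  # Assumption: the table is re-enabled once the compaction is done.
  printf '%s\n' "echo \"enable '$table'\" | bin/hbase shell"
)

compaction_plan TestTable
```

Printing instead of executing keeps the sketch safe to try on a machine
without a cluster.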
To compact "TestTable" family "cf1" of region "e450da04b1a10099b618bec031e0f951":
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool \
  hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951/cf1
To compact all the families of region "e450da04b1a10099b618bec031e0f951":
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool \
  hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951
To compact all regions and families of the table:
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred \
  hdfs:///hbase/TestTable
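The three invocations above differ only in how deep the input path goes
(table, region, or family). As a minimal sketch, a small shell helper can
build that path. The function name compaction_path and the HBASE_ROOT variable
are hypothetical; the hdfs:///hbase default is simply the root used in the
examples above.

```shell
# Hypothetical helper: build the CompactionTool input path.
# Usage: compaction_path TABLE [REGION [FAMILY]]
# HBASE_ROOT is an assumed variable; it defaults to the root used above.
# The body runs in a subshell, so its variables stay local.
compaction_path() (
  root="${HBASE_ROOT:-hdfs:///hbase}"
  path="$root/$1"
  # Append the region, then the family, only when they are given.
  if [ -n "${2:-}" ]; then path="$path/$2"; fi
  if [ -n "${3:-}" ]; then path="$path/$3"; fi
  printf '%s\n' "$path"
)
```

The result can then be passed as the last argument of the tool, for example
"$(compaction_path TestTable e450da04b1a10099b618bec031e0f951 cf1)".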
was:
Tool to run compactions external to hbase:
Usage: java " + this.getClass().getName() + [-compactOnce] [-mapred]
[-D<property=value>]* files...
Options:
mapred Use MapReduce to run compaction.
compactOnce Execute just one compaction step. (default: while needed)
Note: -D properties will be applied to the conf used.
For example:
To preserve input files, pass -D"+CONF_COMPLETE_COMPACTION+"=false"
To stop delete of compacted file, pass -D"+CONF_DELETE_COMPACTED+"=false"
To set tmp dir, pass -D"+CONF_TMP_DIR+"=ALTERNATE_DIR"
Examples:
To compact the full 'TestTable' using MapReduce:
$ bin/hbase " + this.getClass().getName() + " -mapred hdfs:///hbase/TestTable"
To compact column family 'x' of the table 'TestTable' region 'abc':
$ bin/hbase " + this.getClass().getName() + " hdfs:///hbase/TestTable/abc/x"
Hadoop Flags: (was: Reviewed)
> Compaction Tool
> ---------------
>
> Key: HBASE-7253
> URL: https://issues.apache.org/jira/browse/HBASE-7253
> Project: HBase
> Issue Type: New Feature
> Components: Compaction
> Affects Versions: 0.96.0
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-7253-v0.patch, HBASE-7253-v1.patch
>
>
> In HBASE-5616, as part of the compaction code refactor, a CompactionTool was
> added, but there are some issues:
> * The tool is under test/
> * mockito is required, so the "test" scope should be removed from the
> pom.xml, otherwise the tool doesn't start
> * The mock used by the tool mocks HRegion.getRegionInfo(), but some
> code (Store) uses HRegion.regionInfo directly (HStore.java#L2021,
> HStore.java#L1389, HStore.java#L1402), and you end up with an NPE in the tool.
> * The mocked Store uses a dummy family, so the compacted files don't get
> the family properties specified (compression, encoding, ...)
> * At the end of the compaction (CompactionTool.java#L155), on by default, the
> compaction file is removed (note that the compacted ones are already removed
> inside store.compact()), and you end up with an empty dir if you
> compact everything.
> I've fixed some stuff and added support to:
> * Run the compaction as a MR Job
> * Specify a Table (compact each region/family)
> * Specify a Region (compact each family)
> * Specify a Family (as before)