Replies inline.

> -----Original Message-----
> From: Monish r [mailto:monishs...@gmail.com]
> Sent: Sunday, September 23, 2012 7:29 PM
> To: user@hbase.apache.org
> Subject: Clarification regarding major compaction logic
> 
> Hi guys,
> 
> i would like to clarify the following regarding Major Compaction
> 
> 1) When TTL is set for a column family and major compaction is
> triggered by the user:
> 
> - Does it act on the region only when *time since last major compaction
> is greater than TTL*?
> 
[Ram] Major compaction can be triggered either by configuration or manually.
By default, a major compaction is triggered every 24 hours.
During any compaction (minor or major), if the compaction algorithm finds
HFiles for which the TTL has expired, it simply deletes those files.
Likewise, during any compaction (minor or major), every KV in each HFile is
scanned, and any KV whose TTL has expired is not written to the new
compacted file.
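
As a rough illustration of the two TTL checks described above, here is a
minimal Python sketch. It is a toy model, not HBase's actual code: the
names KV, is_expired and compact_with_ttl are made up for this example,
and an "HFile" is modeled as a plain list of KVs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KV:
    """Hypothetical stand-in for an HBase KeyValue."""
    row: str
    timestamp_ms: int
    value: str


def is_expired(timestamp_ms: int, now_ms: int, ttl_ms: int) -> bool:
    # A KV counts as expired once its age exceeds the TTL.
    return now_ms - timestamp_ms > ttl_ms


def compact_with_ttl(hfiles: List[List[KV]], now_ms: int, ttl_ms: int) -> List[KV]:
    """Merge the given files into one list, dropping TTL-expired data."""
    compacted = []
    for hfile in hfiles:
        # File-level check: if even the newest KV in the file is expired,
        # the whole file can be dropped without looking at the rest.
        if hfile and is_expired(max(kv.timestamp_ms for kv in hfile), now_ms, ttl_ms):
            continue
        # KV-level check: expired KVs are not written to the new file.
        for kv in hfile:
            if not is_expired(kv.timestamp_ms, now_ms, ttl_ms):
                compacted.append(kv)
    # Keep the output ordered by row, newest version first.
    return sorted(compacted, key=lambda kv: (kv.row, -kv.timestamp_ms))


if __name__ == "__main__":
    fully_expired = [KV("a", 100, "x"), KV("b", 200, "y")]
    mixed = [KV("a", 9500, "new"), KV("c", 500, "old")]
    print(compact_with_ttl([fully_expired, mixed], now_ms=10000, ttl_ms=1000))
```

In this model the first file is skipped as a whole, and only the
unexpired KV from the second file ends up in the compacted output.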

> 
> 2) Does major compaction go through the index of a region to find out
> that there is data to be acted upon and then start the rewriting, or
> does it rewrite without any pre-checks about the data inside the region?
[Ram] Major compaction differs from minor compaction in that the delete
markers are removed. Once a major compaction is triggered, the algorithm
checks whether there are any files that can be major compacted and simply
runs over those files.
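
To make the delete-marker difference concrete, here is a small Python
sketch. It is a simplified model under stated assumptions, not HBase's
implementation: Cell and compact_cells are hypothetical names, a delete
marker is assumed to mask every put in the same row with an equal or
older timestamp, and the minor path here keeps all cells unchanged,
which may not match what HBase does with masked cells.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    """Hypothetical cell: kind is either "put" or "delete"."""
    row: str
    timestamp: int
    kind: str


def compact_cells(cells: List[Cell], major: bool) -> List[Cell]:
    """Toy compaction: only a major compaction removes delete markers."""
    def order(c: Cell):
        return (c.row, -c.timestamp)

    if not major:
        # Minor compaction (toy model): delete markers stay in the output,
        # since they may still mask data in files outside this compaction.
        return sorted(cells, key=order)

    # Major compaction: find the newest delete marker for each row.
    newest_delete = {}
    for c in cells:
        if c.kind == "delete":
            newest_delete[c.row] = max(newest_delete.get(c.row, c.timestamp), c.timestamp)

    # Drop the markers themselves and every put they mask.
    kept = [
        c for c in cells
        if c.kind == "put" and c.timestamp > newest_delete.get(c.row, float("-inf"))
    ]
    return sorted(kept, key=order)


if __name__ == "__main__":
    cells = [
        Cell("r1", 1, "put"),
        Cell("r1", 2, "delete"),
        Cell("r1", 3, "put"),
        Cell("r2", 1, "put"),
    ]
    print(compact_cells(cells, major=False))
    print(compact_cells(cells, major=True))
```

With major=True the marker and the older put in row r1 disappear from
the output; with major=False the marker is carried over.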
> 
> 3) If major compaction for a region results in an empty region, does
> the empty region get deleted or is it left as such?
> 
[Ram] If I remember correctly, if a major or even a minor compaction
results in no data, an empty file is still written. So the region remains
intact and is never deleted.

Hope this helps.
> Regards,
> R.Monish
