Reply Inline

> -----Original Message-----
> From: Monish r [mailto:monishs...@gmail.com]
> Sent: Sunday, September 23, 2012 7:29 PM
> To: user@hbase.apache.org
> Subject: Clarification regarding major compaction logic
>
> Hi guys,
>
> I would like to clarify the following regarding major compaction:
>
> 1) When TTL is set for a column family and major compaction is
> triggered by the user:
>
> - Does it act on the region only when *time since last major
> compaction is > TTL*?

[Ram] Major compaction can be triggered based on configuration or manually. By default, major compaction is triggered every 24 hours. While doing a compaction (minor or major), if the compaction algorithm finds HFiles whose TTL has expired, major compaction will simply delete those files. Similarly, while doing a compaction (minor or major), the KVs in every HFile are scanned, and any KV found to be TTL-expired is not written to the new compacted file.
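
To make the TTL handling above concrete, here is a minimal Python sketch. It is a toy model only and not HBase's actual compaction code: the `KeyValue` class and the `is_expired` and `compact_with_ttl` helpers are hypothetical names invented for illustration, and a store file is modelled as a plain list of cells. It shows the two behaviours described above: a file whose cells have all expired is dropped outright, and expired KVs in the remaining files are simply not copied into the new compacted file.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KeyValue:
    """Hypothetical stand-in for an HBase KV: row key, value, write time."""
    row: str
    value: str
    timestamp_ms: int


def is_expired(kv: KeyValue, ttl_ms: int, now_ms: int) -> bool:
    # A cell is considered expired once its age exceeds the family's TTL.
    return now_ms - kv.timestamp_ms > ttl_ms


def compact_with_ttl(store_files: List[List[KeyValue]],
                     ttl_ms: int, now_ms: int) -> List[KeyValue]:
    """Merge store files into one, applying TTL (toy model, an assumption)."""
    live_files = []
    for cells in store_files:
        # Whole-file shortcut: if every cell in the file is expired,
        # the file is dropped without reading its cells into the output.
        if cells and all(is_expired(kv, ttl_ms, now_ms) for kv in cells):
            continue
        live_files.append(cells)

    # Per-cell check: scan every KV and skip the expired ones so they
    # never reach the new compacted file.
    merged = [kv for cells in live_files for kv in cells
              if not is_expired(kv, ttl_ms, now_ms)]
    # Keep output ordered by row, newest version first.
    merged.sort(key=lambda kv: (kv.row, -kv.timestamp_ms))
    return merged


if __name__ == "__main__":
    now = 10_000
    old_file = [KeyValue("a", "x", 1_000), KeyValue("b", "y", 2_000)]
    mixed_file = [KeyValue("a", "z", 9_500), KeyValue("c", "w", 5_000)]
    for kv in compact_with_ttl([old_file, mixed_file], ttl_ms=1_000, now_ms=now):
        print(kv)
```

Note that the time since the last major compaction does not appear anywhere in the check; in this model, only each cell's own timestamp is compared against the TTL.
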
>
> 2) Does major compaction go through the index of a region to find out
> that there is data to be acted upon and then start the rewriting, or
> does it rewrite without any pre-checks about the data inside the
> region?

[Ram] Major compaction differs from minor compaction in that the delete markers are removed. So once major compaction is triggered, the algorithm finds whether there are any files that can be major compacted and simply runs over those files.

>
> 3) If major compaction for a region results in an empty region, does
> the empty region get deleted or left as such?
>

[Ram] If I remember correctly, if a major compaction or even a minor compaction results in no data, an empty file is still flushed. So the region remains intact and is never deleted.

Hope this helps.

> Regards,
> R.Monish
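
As a rough illustration of the answers to questions 2 and 3, the following Python sketch models how delete markers are treated differently by minor and major compaction. It is a toy model, not HBase's implementation: `Cell` and `compact_with_deletes` are hypothetical names, a delete marker is assumed to mask every put on the same row with an equal or older timestamp, and masked puts are dropped in both modes as a simplification. The function always returns an output list, mirroring the recollection above that a compaction producing no data still writes an (empty) file and leaves the region in place.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Cell:
    """Hypothetical cell: a put, or a delete marker when is_delete is True."""
    row: str
    timestamp_ms: int
    is_delete: bool = False
    value: str = ""


def compact_with_deletes(files: List[List[Cell]], major: bool) -> List[Cell]:
    """Merge files; only a major compaction removes delete markers."""
    cells = [c for f in files for c in f]

    # Newest delete marker seen for each row.
    tombstones: Dict[str, int] = {}
    for c in cells:
        if c.is_delete:
            tombstones[c.row] = max(tombstones.get(c.row, c.timestamp_ms),
                                    c.timestamp_ms)

    out: List[Cell] = []
    for c in cells:
        if c.is_delete:
            # Minor compaction keeps the marker, since older files that
            # were not selected may still hold puts that it must mask.
            if not major:
                out.append(c)
            continue
        # Assumption of this sketch: masked puts are dropped in both modes.
        if c.row in tombstones and c.timestamp_ms <= tombstones[c.row]:
            continue
        out.append(c)

    out.sort(key=lambda c: (c.row, -c.timestamp_ms))
    # The result may be empty; it still stands for one (empty) output file.
    return out


if __name__ == "__main__":
    files = [[Cell("r1", 100, value="v")], [Cell("r1", 200, is_delete=True)]]
    print("minor:", compact_with_deletes(files, major=False))
    print("major:", compact_with_deletes(files, major=True))
```

No pre-check of the data is modelled here: once the files are selected, every cell in them is read and rewritten.
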