Hi Shrijeet
Regarding your question about the region growing bigger, the following
could be one reason.

You mentioned that your compactions are slow and that you were splitting
some very big store files. Every split would have created a set of
reference files in the daughter regions, and meanwhile, as more writes
came in, more store files were being flushed.

In the compaction algorithm, whenever reference files are found they are
candidates for compaction. But even though reference files exist, the
selection tends to pick the latest files to compact, so the reference
files keep losing the race to get compacted, i.e. their priority keeps
going down.
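A toy model (not actual HBase code; the file names and numbers are made up for illustration) shows how a selection policy that always prefers the newest store files can starve older reference files, which is the behaviour described in HBASE-5161:

```python
# Toy model of store-file selection: references lose the race when fresh
# flushes keep arriving and selection always prefers the newest files.

# Hypothetical store-file list: (name, sequence_id, is_reference)
store_files = [("ref-a", 1, True), ("ref-b", 2, True)]

MAX_FILES_PER_COMPACTION = 3  # stand-in for hbase.hstore.compaction.max

def newest_first_selection(files, batch):
    # Pick the newest files; the old reference files sink to the back.
    return sorted(files, key=lambda f: f[1], reverse=True)[:batch]

next_seq = 3
for _ in range(5):  # five compaction rounds
    # Before each round, three fresh flushes arrive (heavy-write scenario).
    for _ in range(3):
        store_files.append(("flush-%d" % next_seq, next_seq, False))
        next_seq += 1
    picked = newest_first_selection(store_files, MAX_FILES_PER_COMPACTION)
    for f in picked:                       # compact the selected files...
        store_files.remove(f)
    store_files.append(("merged-%d" % next_seq, next_seq, False))
    next_seq += 1                          # ...into one merged output

# After every round, the two reference files are still uncompacted.
refs_left = [f for f in store_files if f[2]]
print(refs_left)  # ref-a and ref-b were never picked
```

Under write pressure the references are never selected, so the daughter regions never finish cleaning up after the split.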

Please refer to HBASE-5161; it could be what you are hitting. In our case
the region in fact grew to 400GB, but that was a heavy-write scenario.

Regards
Ram


> -----Original Message-----
> From: Shrijeet Paliwal [mailto:shrij...@rocketfuel.com]
> Sent: Monday, May 14, 2012 4:43 AM
> To: user@hbase.apache.org
> Subject: [HBase 0.92.1] Too many stores files to compact, compaction
> moving slowly
> 
> Hi,
> 
> HBase version : 0.92.1
> Hadoop version: 0.20.2-cdh3u0
> 
> Relevant configurations:
> * hbase.regionserver.fileSplitTimeout : 300000
> * hbase.hstore.compactionThreshold : 3
> * hbase.hregion.max.filesize : 2147483648
> * hbase.hstore.compaction.max : 10
> * hbase.hregion.majorcompaction: 864000000000
> * HBASE_HEAPSIZE : 4000
> 
> Somehow[1] a user has got his table into a complicated state. The table
> has 299 regions, out of which roughly 28 have a huge number of store
> files, as many as 2300 (snapshot: http://pastie.org/pastes/3907336/text)!
> To add to the complication, individual store files are as big as 14GB.
> 
> Now I am in pursuit of balancing the data in this table. I tried doing
> manual splits, but the split requests were failing with the error "Took
> too long to split the files and create the references, aborting split".
> To get around this I increased hbase.regionserver.fileSplitTimeout.
> 
> From this point, splits happened. I went ahead and identified 10 regions
> that had too many store files and split them. After the splits, daughter
> regions were created with references to all the store files in the
> parent region, and compactions started happening. The minor compaction
> threshold is 10. Since there are 2000+ files (taking one instance as an
> example), it will do 200 sweeps of minor compaction.
> Each sweep runs slowly (a couple of hours), since the individual files
> (in each set of 10) are too big.
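The sweep estimate above is simple arithmetic; a quick sketch (using the approximate figures quoted in this thread, not measured values) makes the scale of the backlog concrete:

```python
# Back-of-the-envelope check of the sweep count described above.
# All three inputs are assumed figures taken from this thread.
store_file_count = 2000   # "2000+" files in one affected region
compaction_max = 10       # hbase.hstore.compaction.max
hours_per_sweep = 2       # observed "couple of hours" per sweep

# Each minor compaction consumes at most compaction_max files, so:
sweeps = store_file_count // compaction_max
total_hours = sweeps * hours_per_sweep

print(sweeps)       # 200 sweeps
print(total_hours)  # roughly 400 hours if the sweeps run serially
```

(Strictly, each sweep also emits one merged file back into the store, so the true count is slightly higher, but the order of magnitude is the same.)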
> 
> Now coming to questions:
> 
> A] Given that we can afford downtime of this table (and of the cluster
> if needed), can I do something *better* than manual splits and letting
> compactions complete? (I am picturing a tool which scans all the HDFS
> directories under the table and launches a distributed *compact and
> split if needed* job, or something along those lines.)
> 
> B] If not (A), can I temporarily tweak some configurations (other than
> the heap given to the region server) to get the table back to a healthy
> state?
> 
> C] How did we end up with individual files as big as 15GB when our max
> region size is configured to be 2GB?
> 
> 
> [1] My theory is that during the writes all requests consistently went
> to the same region server, and we managed to flush faster than we could
> compact. Happy to be proved otherwise.
