Related: please see HBASE-13408, "HBase In-Memory Memstore Compaction". FYI.
On Mon, Aug 24, 2015 at 10:32 AM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:

> The split policy also uses the flush size to estimate how to split
> tables.
>
> It's sometimes fine to raise this number a bit, say to 256 MB, but 512 MB
> is pretty high, and 800 MB is even more.
>
> Big memstores take more time to flush and can block writes if the flushes
> are not fast enough. If yours are fast enough, then you might be able to
> stay with 512 MB. I don't think 800 MB is a good idea.
>
> JM
>
> 2015-08-24 13:23 GMT-04:00 Vladimir Rodionov <vladrodio...@gmail.com>:
>
> > 1. How many regions per RS?
> > 2. What is your dfs.block.size?
> > 3. What is your hbase.regionserver.maxlogs?
> >
> > A flush can be requested when:
> >
> > 1. The region size exceeds hbase.hregion.memstore.flush.size.
> > 2. The region's memstore is too old (the periodic memstore flusher
> >    checks the age of the memstore; the default is 1 hour). Controlled by
> >    hbase.regionserver.optionalcacheflushinterval (in ms).
> > 3. There are too many unflushed changes in a region. Controlled by
> >    hbase.regionserver.flush.per.changes; the default is 30,000,000.
> > 4. The WAL is rolling prematurely. Controlled by
> >    hbase.regionserver.maxlogs and dfs.block.size.
> >
> > The optimal configuration satisfies:
> >
> >   hbase.regionserver.maxlogs * dfs.block.size * 0.95 >
> >   hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE
> >
> > I recommend you enable DEBUG logging and analyze the MemStoreFlusher,
> > PeriodicMemstoreFlusher, and HRegion flush-related log messages to get
> > an idea of why a flush was requested on a region, and what the region
> > size was at that time.
> >
> > I think in your case it is either premature WAL rolling or too many
> > changes in a memstore.
> >
> > -Vlad
> >
> > On Wed, May 27, 2015 at 1:53 PM, Gautam Borah <gautam.bo...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > The default value of hbase.hregion.memstore.flush.size is 128 MB.
> > > Could anyone kindly explain what the impact would be if we increased
> > > this to a higher value such as 512 MB, 800 MB, or more?
> > >
> > > We have a very write-heavy cluster. We also run periodic endpoint
> > > coprocessor based jobs, every 10 minutes, that operate on the data
> > > written in the last 10-15 minutes. We are trying to manage the
> > > memstore flush operations so that the hot data remains in the
> > > memstore for at least 30-40 minutes or longer, so that the job hits
> > > disk only every 3rd or 4th time it tries to operate on the hot data
> > > (it does scans).
> > >
> > > We have a region server heap size of 20 GB and have set:
> > >
> > > hbase.regionserver.global.memstore.lowerLimit = .45
> > > hbase.regionserver.global.memstore.upperLimit = .55
> > >
> > > We observed that with hbase.hregion.memstore.flush.size=128MB, only
> > > 10% of the heap is utilized by the memstores before they flush.
> > >
> > > With hbase.hregion.memstore.flush.size=512MB, we are able to increase
> > > the heap utilization by the memstores to 35%.
> > >
> > > It would be very helpful for us to understand the implications of a
> > > higher hbase.hregion.memstore.flush.size for a long-running cluster.
> > >
> > > Thanks,
> > > Gautam
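
The settings discussed in the thread live in hbase-site.xml. A sketch of the relevant fragment; the property names are the ones the thread cites, but the values below are illustrative defaults and Gautam's numbers, not recommendations:

```xml
<!-- hbase-site.xml: flush-related settings from the thread.
     Values are illustrative, not tuning advice. -->
<configuration>
  <!-- Flush a region's memstore once it reaches this size (bytes). -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value> <!-- 128 MB default -->
  </property>
  <!-- Periodic flusher: flush memstores older than this (ms); default 1 h. -->
  <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>3600000</value>
  </property>
  <!-- Flush after this many unflushed changes in a region. -->
  <property>
    <name>hbase.regionserver.flush.per.changes</name>
    <value>30000000</value>
  </property>
  <!-- WAL rolling: maxlogs * dfs.block.size bounds unflushed WAL data. -->
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>32</value>
  </property>
  <!-- Global memstore bounds as fractions of the region server heap
       (Gautam's values). -->
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.55</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.lowerLimit</name>
    <value>0.45</value>
  </property>
</configuration>
```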
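
Vlad's sizing rule for WAL rolling can be checked with a quick calculation. A minimal sketch (the function name and the maxlogs values are hypothetical examples, not from the thread; the heap size and upperLimit match Gautam's setup, and 128 MB is a common dfs.block.size):

```python
# Vlad's rule of thumb: total WAL capacity should exceed the global
# memstore upper limit, otherwise WALs roll prematurely and force flushes:
#   hbase.regionserver.maxlogs * dfs.block.size * 0.95 >
#   hbase.regionserver.global.memstore.upperLimit * HBASE_HEAPSIZE

def wal_capacity_ok(maxlogs, block_size_bytes, upper_limit, heap_bytes):
    """Return True if WAL capacity covers the global memstore upper limit."""
    wal_capacity = maxlogs * block_size_bytes * 0.95
    memstore_cap = upper_limit * heap_bytes
    return wal_capacity > memstore_cap

MB = 1024 ** 2
GB = 1024 ** 3

# With a 20 GB heap and upperLimit = 0.55 (Gautam's numbers), a default-ish
# maxlogs of 32 and 128 MB HDFS blocks give ~3.8 GB of WAL capacity versus
# an 11 GB memstore cap, so WALs would roll (and trigger flushes) early:
print(wal_capacity_ok(32, 128 * MB, 0.55, 20 * GB))  # False
print(wal_capacity_ok(96, 128 * MB, 0.55, 20 * GB))  # True
```

This is consistent with Vlad's suspicion that premature WAL rolling, rather than the flush size itself, may be capping memstore heap utilization at 10%.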