Andrew, what HBase version have you run your test on?
This issue probably does not exist anymore in the latest Apache releases,
but it still exists in the not-so-latest, yet still actively used, versions
of CDH, HDP, etc. We discovered it during a large data set load (100s of GB)
on our cluster (4 nodes).

-Vladimir

On Thu, Dec 4, 2014 at 10:23 AM, Andrew Purtell <[email protected]> wrote:

> Actually, I have set hbase.hstore.blockingStoreFiles to exactly 200 in
> testing :-), but I must not have generated sufficient load to encounter
> the issue you are seeing. Maybe it would be possible to adapt one of the
> ingest integration tests to trigger this problem? Set blockingStoreFiles
> to 200 or more. Tune down the region size to 128K or similar. If it's
> reproducible like that, please open a JIRA.
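Andrew's repro recipe above boils down to two properties. A minimal sketch
of the test configuration in Java, assuming everything else is left at its
defaults; the two property names come from the thread, while the class name
and exact values are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ReproConfig {
      public static Configuration create() {
        Configuration conf = HBaseConfiguration.create();
        // Raise the blocking threshold far above the default so writes
        // are never stalled while store files pile up.
        conf.setInt("hbase.hstore.blockingStoreFiles", 200);
        // Shrink the max region size (128K per Andrew's suggestion; the
        // original report used 1G) so splits fire constantly under load.
        conf.setLong("hbase.hregion.max.filesize", 128 * 1024L);
        return conf;
      }
    }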
> On Wed, Dec 3, 2014 at 9:07 AM, Vladimir Rodionov <[email protected]>
> wrote:
>
> > Kevin,
> >
> > Thank you for your response. This is not a question about how to
> > correctly configure an HBase cluster for write-heavy workloads. This is
> > an internal HBase issue - something is wrong in the default logic of the
> > compaction selection algorithm in 0.94-0.98. It seems that nobody has
> > ever tested importing data with a very high
> > hbase.hstore.blockingStoreFiles value (200 in our case).
> >
> > -Vladimir Rodionov
> >
> > On Wed, Dec 3, 2014 at 6:38 AM, Kevin O'Dell <[email protected]> wrote:
> >
> > > Vladimir,
> > >
> > > I know you said, "do not ask me why", but I am going to have to ask
> > > you why. The fact that you are doing this (this being blocking store
> > > files > 200) tells me there is something, or multiple somethings,
> > > wrong with your cluster setup. A couple of things come to mind:
> > >
> > > * During this heavy write period, could we use bulk loads? If so, this
> > > should solve almost all of your problems.
> > >
> > > * A 1GB region size is WAY too small; if you are pushing the volume of
> > > data you are talking about, I would recommend 10-20GB region sizes.
> > > This should also help keep your region count smaller, which will
> > > result in more optimal writes.
> > >
> > > * Your cluster may be undersized; if you are setting the blocking
> > > limit that high, you may be pushing too much data for your cluster
> > > overall.
> > >
> > > Would you be so kind as to pass me a few pieces of information?
> > >
> > > 1.) Cluster size
> > > 2.) Average region count per RS
> > > 3.) Heap size, memstore global settings, and block cache settings
> > > 4.) A RS log on pastebin and a time frame of "high writes"
> > >
> > > I can probably make some solid suggestions for you based on the above
> > > data.
> > >
> > > On Wed, Dec 3, 2014 at 1:04 AM, Vladimir Rodionov
> > > <[email protected]> wrote:
> > >
> > > > This is what we observed in our environment(s).
> > > >
> > > > The issue exists in CDH 4.5 and 5.1, HDP 2.1, and MapR 4.
> > > >
> > > > If someone sets the number of blocking store files way above the
> > > > default value, say to 200, to avoid write stalls during intensive
> > > > data loading (do not ask me why we do this), then one of the regions
> > > > grows indefinitely and takes up more than 99% of the overall table.
> > > >
> > > > It can't be split because it still has orphaned reference files.
> > > > Some of the reference files are able to avoid compactions for a long
> > > > time, obviously.
> > > >
> > > > The split policy is IncreasingToUpperBound and the max region size
> > > > is 1G. I do my tests on CDH 4.5 mostly, but all the other distros
> > > > seem to have the same issue.
> > > >
> > > > My attempt to forcefully add reference files to the compaction list
> > > > in Store.requestCompaction() when the region exceeds the recommended
> > > > maximum size did not work out well - some weird results in our test
> > > > cases (but the HBase tests are OK: small, medium and large).
> > > >
> > > > What is so special about these reference files? Any ideas what can
> > > > be done here to fix the issue?
> > > >
> > > > -Vladimir Rodionov
> > >
> > > --
> > > Kevin O'Dell
> > > Systems Engineer, Cloudera
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
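Kevin's first suggestion above, bulk loads, sidesteps the store-file pileup
by writing HFiles offline and handing them to the region servers. A minimal
sketch using the standard LoadIncrementalHFiles tool, assuming the HFiles
are already staged in HDFS; the /staging/hfiles path and the table name
are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class BulkLoad {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");
        // Moves the staged HFiles directly into the regions' store
        // directories, bypassing the memstore and the WAL entirely.
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
        loader.doBulkLoad(new Path("/staging/hfiles"), table);
        table.close();
      }
    }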

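On Vladimir's closing question: reference files are the half-file pointers
a split leaves behind in the daughter regions, and a daughter region cannot
be split again until compactions have rewritten every reference into a real
HFile. A rough sketch of the idea he describes trying, forcing references
into the candidate list once the region is oversized; this is an
illustration against the 0.94-era internals, not his actual patch, and only
StoreFile.isReference() is assumed from the real API:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.regionserver.StoreFile;

    public class CompactionSelectionSketch {
      // Hypothetical helper: add any reference files to the compaction
      // candidates once the region has outgrown the configured maximum.
      static List<StoreFile> forceReferences(List<StoreFile> candidates,
                                             List<StoreFile> storeFiles,
                                             long regionSize,
                                             long maxFileSize) {
        List<StoreFile> result = new ArrayList<StoreFile>(candidates);
        if (regionSize > maxFileSize) {
          for (StoreFile sf : storeFiles) {
            // A reference file points at half of a parent-region HFile;
            // compacting it away is what un-blocks the next split.
            if (sf.isReference() && !result.contains(sf)) {
              result.add(sf);
            }
          }
        }
        return result;
      }
    }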