Good points, Lars. I thought a bit about how to debug this / find more clues... Sorting the reference files first is a good idea (the current sort is by sequence ID, size, etc.); a quick sketch of what I mean follows below. Thanks.
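
To make the idea concrete, here is a minimal sketch of what "sort references first" could look like at selection time. The class and field names are made up for illustration - this is not the actual HBase StoreFile/CompactionPolicy code, just the ordering idea:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/**
 * Minimal sketch of "sort reference files first": order the compaction
 * candidates so that reference (post-split) files always come ahead of
 * the usual sequence-ID order. Illustrative names only - not the actual
 * HBase StoreFile/CompactionPolicy classes.
 */
public class ReferenceFirstOrder {

    static final class CandidateFile {
        final String name;
        final boolean isReference; // true for half-files left behind by a split
        final long sequenceId;

        CandidateFile(String name, boolean isReference, long sequenceId) {
            this.name = name;
            this.isReference = isReference;
            this.sequenceId = sequenceId;
        }
    }

    public static void main(String[] args) {
        List<CandidateFile> candidates = new ArrayList<>();
        candidates.add(new CandidateFile("hfile-3", false, 30));
        candidates.add(new CandidateFile("ref-1",   true,  20));
        candidates.add(new CandidateFile("hfile-2", false, 10));

        // References sort first (false < true, so negate the flag);
        // ties keep the existing sequence-ID order.
        candidates.sort(Comparator
                .comparing((CandidateFile f) -> !f.isReference)
                .thenComparingLong(f -> f.sequenceId));

        // Prints: ref-1, hfile-2, hfile-3 - any selection that takes a
        // prefix of the list now includes the reference file.
        candidates.forEach(f -> System.out.println(f.name));
    }
}

Any policy that then takes a prefix of the candidate list is guaranteed to pick the reference files up before the ordinary ones.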
Sent from my iPad

On 2014-12-6, at 9:07, Andrew Purtell <[email protected]> wrote:

>> Seems to me we should sort reference files first _always_, to compact
>> them away first and allow the region to be split further. Thoughts?
>> File a jira?
>
> Sounds reasonable as an enhancement issue
>
> On Fri, Dec 5, 2014 at 5:02 PM, lars hofhansl <[email protected]> wrote:
>
>> Digging in the (0.98) code a bit I find this:
>>
>> HRegionServer.postOpenDeployTasks(): requests a compaction either when
>> we're past the minimum number of files or when there is any reference
>> file. Good, that will trigger
>> RatioBasedCompactionPolicy.selectCompaction(), which turns any
>> compaction into a major one if there are reference files involved in
>> the set of files already selected. Also cool: after a split, all files
>> of a daughter will be reference files.
>>
>> But... I do not see any code that would make sure at least one
>> reference file is selected. So in theory the initial compaction started
>> by postOpenDeployTasks could have failed for some reason. Now more data
>> is written, and the following compaction selections won't pick up any
>> reference files, as there are many small, new files written.
>> So the reference files could in theory just linger until a selection
>> happens to come across one, all the while the daughters (a) are
>> unsplittable and (b) cannot migrate to another region server.
>> That is, unless I missed something... Maybe somebody could have a look
>> too? Seems to me we should sort reference files first _always_, to
>> compact them away first and allow the region to be split further.
>> Thoughts? File a jira?
>>
>> -- Lars
>>
>> From: lars hofhansl <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Friday, December 5, 2014 1:59 PM
>> Subject: Re: Region is out of bounds
>>
>> We've run into something like this as well (probably), and will be
>> looking at it over the next days/weeks. Under heavy load HBase seems
>> just not able to get the necessary compactions in, and until that
>> happens it cannot further split a region.
>>
>> I wonder whether HBASE-12411 would help here (it optionally allows
>> compactions to use private readers); I doubt it, though.
>> The details are probably tricky. I thought HBase would compact split
>> regions with higher priority (placing those first in the compaction
>> queue)... Need to actually check the code.
>>
>> -- Lars
>>
>> From: Qiang Tian <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Thursday, December 4, 2014 7:26 PM
>> Subject: Re: Region is out of bounds
>>
>> ---- My attempt to add reference files forcefully to the compaction
>> list in Store.requestCompaction() when the region exceeds the
>> recommended maximum size did not work out well - some weird results in
>> our test cases (but the HBase tests are OK: small, medium and large).
>>
>> Interesting... perhaps they were filtered out in
>> RatioBasedCompactionPolicy#selectCompaction?
>>
>> On Fri, Dec 5, 2014 at 5:20 AM, Andrew Purtell <[email protected]>
>> wrote:
>>
>>> Most versions of 0.98 since 0.98.1, but I haven't run a punishing
>>> high-scale bulk ingest for its own sake; high-ish rate ingest and a
>>> setting of blockingStoreFiles to 200 have been in service of getting
>>> data in for subsequent testing.
>>>
>>> On Thu, Dec 4, 2014 at 12:43 PM, Vladimir Rodionov
>>> <[email protected]> wrote:
>>>
>>>> Andrew,
>>>>
>>>> What HBase version have you run your test on?
>>>>
>>>> This issue probably does not exist anymore in the latest Apache
>>>> releases, but still exists in not-so-latest, but still actively used,
>>>> versions of CDH, HDP, etc. We discovered it during large data set
>>>> loading (100s of GB) in our cluster (4 nodes).
>>>>
>>>> -Vladimir
>>>>
>>>> On Thu, Dec 4, 2014 at 10:23 AM, Andrew Purtell <[email protected]>
>>>> wrote:
>>>>
>>>>> Actually I have set hbase.hstore.blockingStoreFiles to 200 in
>>>>> testing exactly :-), but must not have generated sufficient load to
>>>>> encounter the issue you are seeing. Maybe it would be possible to
>>>>> adapt one of the ingest integration tests to trigger this problem?
>>>>> Set blockingStoreFiles to 200 or more. Tune down the region size to
>>>>> 128K or similar. If it's reproducible like that please open a JIRA.
>>>>>
>>>>> On Wed, Dec 3, 2014 at 9:07 AM, Vladimir Rodionov
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Kevin,
>>>>>>
>>>>>> Thank you for your response. This is not a question about how to
>>>>>> correctly configure an HBase cluster for write-heavy workloads.
>>>>>> This is an internal HBase issue - something is wrong in the default
>>>>>> logic of the compaction selection algorithm in 0.94-0.98. It seems
>>>>>> that nobody has ever tested importing data with a very high
>>>>>> hbase.hstore.blockingStoreFiles value (200 in our case).
>>>>>>
>>>>>> -Vladimir Rodionov
>>>>>>
>>>>>> On Wed, Dec 3, 2014 at 6:38 AM, Kevin O'dell
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Vladimir,
>>>>>>>
>>>>>>> I know you said, "do not ask me why", but I am going to have to
>>>>>>> ask you why. The fact that you are doing this (this being
>>>>>>> blockingStoreFiles > 200) tells me there is something, or multiple
>>>>>>> somethings, wrong with your cluster setup. A couple of things come
>>>>>>> to mind:
>>>>>>>
>>>>>>> * During this heavy write period, could we use bulk loads? If so,
>>>>>>> this should solve almost all of your problems.
>>>>>>>
>>>>>>> * A 1GB region size is WAY too small; if you are pushing the
>>>>>>> volume of data you are talking about, I would recommend 10-20GB
>>>>>>> region sizes. This should help keep your region count smaller as
>>>>>>> well, which will result in more optimal writes.
>>>>>>>
>>>>>>> * Your cluster may be undersized; if you are setting the blocking
>>>>>>> that high, you may be pushing too much data for your cluster
>>>>>>> overall.
>>>>>>>
>>>>>>> Would you be so kind as to pass me a few pieces of information?
>>>>>>>
>>>>>>> 1.) Cluster size
>>>>>>> 2.) Average region count per RS
>>>>>>> 3.) Heap size, Memstore global settings, and block cache settings
>>>>>>> 4.) A RS log to pastebin and a time frame of "high writes"
>>>>>>>
>>>>>>> I can probably make some solid suggestions for you based on the
>>>>>>> above data.
>>>>>>>
>>>>>>> On Wed, Dec 3, 2014 at 1:04 AM, Vladimir Rodionov
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> This is what we observed in our environment(s).
>>>>>>>>
>>>>>>>> The issue exists in CDH4.5, 5.1, HDP2.1, MapR4.
>>>>>>>>
>>>>>>>> If someone sets the # of blocking stores way above the default
>>>>>>>> value, say 200, to avoid write stalls during intensive data
>>>>>>>> loading (do not ask me why we do this), then one of the regions
>>>>>>>> grows indefinitely and takes up more than 99% of the overall
>>>>>>>> table.
>>>>>>>>
>>>>>>>> It can't be split because it still has orphaned reference files.
>>>>>>>> Some of the reference files are able to avoid compactions for a
>>>>>>>> long time, obviously.
>>>>>>>>
>>>>>>>> The split policy is IncreasingToUpperBound, and the max region
>>>>>>>> size is 1G. I do my tests on CDH4.5 mostly, but all the other
>>>>>>>> distros seem to have the same issue.
>>>>>>>>
>>>>>>>> My attempt to add reference files forcefully to the compaction
>>>>>>>> list in Store.requestCompaction() when the region exceeds the
>>>>>>>> recommended maximum size did not work out well - some weird
>>>>>>>> results in our test cases (but the HBase tests are OK: small,
>>>>>>>> medium and large).
>>>>>>>>
>>>>>>>> What is so special about these reference files? Any ideas what
>>>>>>>> can be done here to fix the issue?
>>>>>>>>
>>>>>>>> -Vladimir Rodionov
>>>>>>>
>>>>>>> --
>>>>>>> Kevin O'Dell
>>>>>>> Systems Engineer, Cloudera
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
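
For reference, a rough sketch of the force-include approach Vladimir describes above - hypothetical names throughout, not the real Store.requestCompaction() internals:

import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of the workaround described in the thread: when
 * the region has outgrown its recommended maximum size, force any
 * reference files that the ratio-based pass skipped into the compaction
 * selection, so the region can become splittable again. None of these
 * names are the real HBase API.
 */
public class ForceReferenceSelection {

    static final class CandidateFile {
        final String name;
        final boolean isReference; // half-file left behind by a split

        CandidateFile(String name, boolean isReference) {
            this.name = name;
            this.isReference = isReference;
        }
    }

    // Augment whatever the normal policy selected with any missed
    // references, but only once the region is oversized, so ordinary
    // minor compactions keep their normal selection.
    static List<CandidateFile> augmentSelection(List<CandidateFile> allFiles,
                                                List<CandidateFile> ratioSelected,
                                                long regionSizeBytes,
                                                long maxRegionSizeBytes) {
        List<CandidateFile> selected = new ArrayList<>(ratioSelected);
        if (regionSizeBytes > maxRegionSizeBytes) {
            for (CandidateFile f : allFiles) {
                if (f.isReference && !selected.contains(f)) {
                    selected.add(f);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        CandidateFile ref = new CandidateFile("ref-daughter-a", true);
        CandidateFile newSmall = new CandidateFile("hfile-new", false);
        List<CandidateFile> all = List.of(ref, newSmall);

        // The ratio pass picked only the small new file; the region is 2G
        // against a 1G maximum, so the reference gets forced in.
        List<CandidateFile> selected = augmentSelection(
                all, new ArrayList<>(List.of(newSmall)),
                2_000_000_000L, 1_000_000_000L);

        selected.forEach(f -> System.out.println(f.name)); // hfile-new, ref-daughter-a
    }
}

The size guard is the part Vladimir reported trouble with, so treat this strictly as a starting point for discussion, not a validated fix.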
