Good points, Lars. I thought a bit about how to debug/find more clues... sorting the reference
files first is a good idea (currently files are sorted by sequenceId, size, etc.).
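Something along these lines, maybe (a rough, untested sketch; it assumes
StoreFile.isReference() and the existing SEQ_ID comparator are accessible):

import java.util.Comparator;

import org.apache.hadoop.hbase.regionserver.StoreFile;

// Untested sketch: order reference files before everything else so they
// get compacted away first; otherwise keep the usual sequence-id order.
public class ReferenceFirstComparator implements Comparator<StoreFile> {
  @Override
  public int compare(StoreFile a, StoreFile b) {
    if (a.isReference() != b.isReference()) {
      return a.isReference() ? -1 : 1; // references sort first
    }
    return StoreFile.Comparators.SEQ_ID.compare(a, b);
  }
}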
Thanks

Sent from my iPad

On 2014-12-6, at 9:07, Andrew Purtell <[email protected]> wrote:

>> Seems to me we should sort reference files first _always_, to compact them
>> away first and allow the region to be split further. Thoughts? File a jira?
> 
> Sounds reasonable as an enhancement issue
> 
> 
> On Fri, Dec 5, 2014 at 5:02 PM, lars hofhansl <[email protected]> wrote:
> 
>> Digging in the (0.98) code a bit I find this:
>> 
>> HRegionServer.postOpenDeployTasks(): requests a compaction either when
>> we're past the minimum number of files or when there is any reference file.
>> Good, that will trigger
>> RatioBasedCompactionPolicy.selectCompaction(): turns any compaction into a
>> major one if there are reference files involved in the set of files already
>> selected. Also cool, after a split all files of a daughter will be
>> reference files.
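
In sketch form, the behavior described above is roughly this (paraphrased
from the description, not the literal 0.98 source):

import java.util.List;

import org.apache.hadoop.hbase.regionserver.StoreFile;

public class SelectionSketch {
  // Paraphrased sketch: any reference file among the already-selected
  // files upgrades the request to a major compaction -- but nothing here
  // forces a reference file INTO the selection, which is the gap pointed
  // out below.
  static boolean shouldBeMajor(List<StoreFile> selected, boolean forceMajor) {
    boolean hasRefs = false;
    for (StoreFile f : selected) {
      hasRefs |= f.isReference();
    }
    return forceMajor || hasRefs;
  }
}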
>> 
>> But... I do not see any code where it would make sure at least one
>> reference file is selected. So in theory the initial compaction started by
>> postOpenDeployTasks() could have failed for some reason. Now more data is
>> written, and subsequent compaction selections won't pick up any reference
>> files, as there are many small, new files written.
>> So the reference files could in theory linger until a selection just
>> happens to come across one, all the while the daughters are (a)
>> unsplittable and (b) cannot migrate to another region server.
>> That is, unless I missed something... Maybe somebody could have a look too?
>> Seems to me we should sort reference files first _always_, to compact them
>> away first and allow the region to be split further. Thoughts? File a jira?
>> 
>> -- Lars
>> 
>> From: lars hofhansl <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Friday, December 5, 2014 1:59 PM
>> Subject: Re: Region is out of bounds
>> 
>> We've run into something like this as well (probably).
>> Will be looking at this over the next days/weeks. Under heavy load
>> HBase just seems unable to get the necessary compactions in, and until
>> that happens it cannot further split a region.
>> 
>> I wonder whether HBASE-12411 would help here (it optionally allows
>> compactions to use private readers); I doubt it, though.
>> The details are probably tricky. I thought HBase would compact split
>> regions with higher priority (placing those first in the compaction
>> queue)... Need to actually check the code.
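
The idea Lars describes, in sketch form (hypothetical classes, not the
actual HBase queue): requests carry an int priority, the smallest value is
served first, so a post-split request could jump the line:

import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch of a priority-ordered compaction queue; the real
// HBase classes differ.
public class CompactionQueueSketch {
  static class QueuedCompaction implements Comparable<QueuedCompaction> {
    final String regionName;
    final int priority; // smaller value = more urgent

    QueuedCompaction(String regionName, int priority) {
      this.regionName = regionName;
      this.priority = priority;
    }

    @Override
    public int compareTo(QueuedCompaction other) {
      return Integer.compare(this.priority, other.priority);
    }
  }

  static final PriorityBlockingQueue<QueuedCompaction> QUEUE =
      new PriorityBlockingQueue<QueuedCompaction>();

  public static void main(String[] args) {
    QUEUE.add(new QueuedCompaction("routineRegion", 100));
    QUEUE.add(new QueuedCompaction("freshlySplitDaughter", Integer.MIN_VALUE));
    // The daughter's request is served first.
    System.out.println(QUEUE.poll().regionName); // freshlySplitDaughter
  }
}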
>> 
>> -- Lars
>> From: Qiang Tian <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Thursday, December 4, 2014 7:26 PM
>> Subject: Re: Region is out of bounds
>> 
>> ---- My attempt to add reference files forcefully to the compaction list in
>> Store.requestCompaction() when the region exceeds the recommended maximum size did
>> not work out well - some weird results in our test cases (but HBase tests
>> are OK: small, medium and large).
>> 
>> interesting... perhaps it was filtered out in
>> RatioBasedCompactionPolicy#selectCompaction?
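
For context, I imagine the attempted change looked something like this (my
guess, untested; recommendedMaxSize is a made-up name):

import java.util.List;

import org.apache.hadoop.hbase.regionserver.StoreFile;

public class ForcedReferenceSelection {
  // Guess at the attempted change (untested): once the region outgrows the
  // recommended max size, force any reference files into the candidate
  // list so the next compaction clears them.
  static void addReferences(List<StoreFile> candidates, List<StoreFile> all,
      long regionSize, long recommendedMaxSize) {
    if (regionSize <= recommendedMaxSize) {
      return;
    }
    for (StoreFile f : all) {
      if (f.isReference() && !candidates.contains(f)) {
        candidates.add(f);
      }
    }
  }
}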
>> 
>> On Fri, Dec 5, 2014 at 5:20 AM, Andrew Purtell <[email protected]>
>> wrote:
>> 
>>> Most versions of 0.98 since 0.98.1, but I haven't run a punishing
>>> high-scale bulk ingest for its own sake; high-ish rate ingest and a setting
>>> of blockingStoreFiles to 200 have been in service of getting data in for
>>> subsequent testing.
>>> 
>>> 
>>> On Thu, Dec 4, 2014 at 12:43 PM, Vladimir Rodionov <[email protected]>
>>> wrote:
>>> 
>>>> Andrew,
>>>> 
>>>> What HBase version have you run your test on?
>>>> 
>>>> This issue probably does not exist anymore in the latest Apache releases,
>>>> but it still exists in not-so-recent, but still actively used, versions of
>>>> CDH, HDP, etc. We discovered it during large data set loading (100s of GB)
>>>> in our cluster (4 nodes).
>>>> 
>>>> -Vladimir
>>>> 
>>>> On Thu, Dec 4, 2014 at 10:23 AM, Andrew Purtell <[email protected]>
>>>> wrote:
>>>> 
>>>>> Actually, I have set hbase.hstore.blockingStoreFiles to 200 in testing
>>>>> exactly :-), but I must not have generated sufficient load to encounter
>>>>> the issue you are seeing. Maybe it would be possible to adapt one of the
>>>>> ingest integration tests to trigger this problem? Set blockingStoreFiles
>>>>> to 200 or more. Tune down the region size to 128K or similar. If it's
>>>>> reproducible like that, please open a JIRA.
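
For the record, the suggested repro settings would look roughly like this
(the values come from this thread; treat it as an untested sketch):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;

public class ReproSettings {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    // Raise the blocking store file count so writes never stall.
    conf.setInt("hbase.hstore.blockingStoreFiles", 200);
    // Tune the max region size way down (128K) so splits happen constantly.
    conf.setLong(HConstants.HREGION_MAX_FILESIZE, 128 * 1024L);
    return conf;
  }
}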
>>>>> 
>>>>> On Wed, Dec 3, 2014 at 9:07 AM, Vladimir Rodionov <[email protected]>
>>>>> wrote:
>>>>> 
>>>>>> Kevin,
>>>>>> 
>>>>>> Thank you for your response. This is not a question about how to
>>>>>> correctly configure an HBase cluster for write-heavy workloads. This is
>>>>>> an internal HBase issue - something is wrong in the default logic of the
>>>>>> compaction selection algorithm in 0.94-0.98. It seems that nobody has
>>>>>> ever tested importing data with a very high
>>>>>> hbase.hstore.blockingStoreFiles value (200 in our case).
>>>>>> 
>>>>>> -Vladimir Rodionov
>>>>>> 
>>>>>> On Wed, Dec 3, 2014 at 6:38 AM, Kevin O'Dell <[email protected]>
>>>>>> wrote:
>>>>>> 
>>>>>>> Vladimir,
>>>>>>> 
>>>>>>> I know you said, "do not ask me why", but I am going to have to ask you
>>>>>>> why. The fact you are doing this (this being blocking store files > 200)
>>>>>>> tells me there is something, or multiple somethings, wrong with your
>>>>>>> cluster setup. A couple of things come to mind:
>>>>>>> 
>>>>>>> * During this heavy write period, could we use bulk loads? If so, this
>>>>>>> should solve almost all of your problems.
>>>>>>> 
>>>>>>> * A 1GB region size is WAY too small; if you are pushing the volume of
>>>>>>> data you are talking about, I would recommend 10-20GB region sizes (see
>>>>>>> the snippet after this list). This should help keep your region count
>>>>>>> smaller as well, which will result in more optimal writes.
>>>>>>> 
>>>>>>> * Your cluster may be undersized; if you are setting the blocking limit
>>>>>>> that high, you may be pushing too much data for your cluster overall.
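
To make the region-size advice concrete, a hypothetical per-table snippet
(0.98-era API; "mytable" and the 20GB figure are just placeholders):

import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;

public class RegionSizeExample {
  public static HTableDescriptor describe() {
    // Hypothetical table; the point is the per-table max region size.
    HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
    desc.setMaxFileSize(20L * 1024 * 1024 * 1024); // 20 GB before a split
    return desc;
  }
}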
>>>>>>> 
>>>>>>> Would you be so kind as to pass me a few pieces of information?
>>>>>>> 
>>>>>>> 1.) Cluster size
>>>>>>> 2.) Average region count per RS
>>>>>>> 3.) Heap size, Memstore global settings, and block cache settings
>>>>>>> 4.) a RS log to pastebin and a time frame of "high writes"
>>>>>>> 
>>>>>>> I can probably make some solid suggestions for you based on the above
>>>>>>> data.
>>>>>>> 
>>>>>>> On Wed, Dec 3, 2014 at 1:04 AM, Vladimir Rodionov <[email protected]>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> This is what we observed in our environment(s)
>>>>>>>> 
>>>>>>>> The issue exists in CDH4.5, 5.1, HDP2.1, Mapr4
>>>>>>>> 
>>>>>>>> If someone sets the number of blocking stores way above the default
>>>>>>>> value, say to 200, to avoid write stalls during intensive data loading
>>>>>>>> (do not ask me why we do this), then one of the regions grows
>>>>>>>> indefinitely and takes up more than 99% of the overall table.
>>>>>>>> 
>>>>>>>> It can't be split because it still has orphaned reference files. Some
>>>>>>>> of the reference files are able to avoid compactions for a long time,
>>>>>>>> obviously.
>>>>>>>> 
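In sketch form, this is why references block splitting (Store.hasReferences()
is the real check; the surrounding loop is paraphrased, not the real code):

import java.util.Collection;

import org.apache.hadoop.hbase.regionserver.Store;

public class SplitGateSketch {
  // Approximate sketch of the split gate: a region is only splittable
  // once no store still holds reference files left over from the split.
  static boolean canSplit(Collection<Store> stores) {
    for (Store store : stores) {
      if (store.hasReferences()) {
        return false; // still carrying references -> not splittable
      }
    }
    return true;
  }
}
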
>>>>>>>> The split policy is IncreasingToUpperBound, and the max region size is
>>>>>>>> 1G. I do my tests on CDH4.5 mostly, but all the other distros seem to
>>>>>>>> have the same issue.
>>>>>>>> 
>>>>>>>> My attempt to add reference files forcefully to the compaction list in
>>>>>>>> Store.requestCompaction() when the region exceeds the recommended
>>>>>>>> maximum size did not work out well - some weird results in our test
>>>>>>>> cases (but the HBase tests are OK: small, medium and large).
>>>>>>>> 
>>>>>>>> What is so special about these reference files? Any ideas what can be
>>>>>>>> done here to fix the issue?
>>>>>>>> 
>>>>>>>> -Vladimir Rodionov
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Kevin O'Dell
>>>>>>> Systems Engineer, Cloudera
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Best regards,
>>>>> 
>>>>>   - Andy
>>>>> 
>>>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>> Hein
>>>>> (via Tom White)
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> 
>>>   - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>> 
>> 
> 
> 
> 
> -- 
> Best regards,
> 
>   - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
