Good, but... keep in mind that if you just increase the max filesize and memstore size without changing anything else, you'll end up in the same situation; with 16GB it will simply take a bit more time to get there.
Here's the math: 200 regions with 2 families means 400 memstores to fill. Assuming a completely random pattern across all the regions and families, you're going to fill all 400 memstores at the same rate. With 4GB of heap you hit the memstore lower barrier at 0.35*4=1.4GB, at which point each memstore holds around 3.5MB and the biggest one will flush. Currently we flush whole regions, not individual families, so that flush writes 2 files of about 3.5MB each. About 7MB later, another region will flush like that, and so on and so forth.

Now with 16GB the lower barrier is at 5.6GB, which is a lot more room, so you would be flushing files of about 14MB... but unfortunately it's going to flush before that. By default HBase keeps a maximum of 32 write-ahead logs (WALs) of about 64MB each, which is almost 2GB total. Since your pattern is random, each log will contain rows from almost every region, meaning that in order to get rid of the older logs and make room for newer ones it will have to force flush ALL your regions. And it's going to happen again 2GB later.

This is why I recommended that you try to insert sequentially into only a few regions at a time, as this will play more nicely with the WALs. Note that you could configure bigger WALs, or more of them, in order to match the lower barrier (you'd tweak hbase.regionserver.maxlogs and hbase.regionserver.hlog.blocksize), but it's still not as good as having fewer regions or writing to fewer of them at the same time.

J-D

On Mon, Feb 6, 2012 at 3:18 PM, Bryan Keller <[email protected]> wrote:
> Yes, insert pattern is random, and yes, the compactions are going through the
> roof. Thanks for pointing me in that direction. I am going to try increasing
> the region max filesize to 4gb (it was set to 512mb) and the memstore flush
> size to 512mb (it was 128mb). I'm also going to increase the heap to 16gb
> (right now it is 4gb).
>
>
> On Feb 6, 2012, at 1:33 PM, Jean-Daniel Cryans wrote:
>
>> Ok this helps, we're still missing details on your insert pattern, but I
>> bet it's pretty random considering what's happening to your cluster.
>>
>> I'm guessing you didn't set up metrics, else you would have told us
>> that the compaction queues are through the roof during the import, but
>> at this point I'm pretty sure that's the case.
>>
>> To solve this your choices are:
>>
>> - Do bulk uploads instead of brute forcing it, so that you skip
>> those issues entirely. See
>> http://hbase.apache.org/bulk-loads.html
>> - Get the number of regions down to something more manageable; you
>> didn't say how much memory you gave to HBase so I can't say how many
>> exactly you need, but it's usually never more than 20. Then set the
>> memstore flush size and max file size accordingly. The goal here is to
>> flush/compact as little as possible.
>> - Keep your current setup, but slow down the insert rate so that data
>> can be compacted over and over again without overrunning your region
>> servers.
>> - Use a more sequential pattern so that you hit only a few regions at
>> a time; this is like the second solution but trying to make it work
>> with your current setup. This might not be practical for you as it
>> really depends on how easily you can sort your data source.
>>
>> Let us know if you need more help,
>>
>> J-D
>>
>> On Mon, Feb 6, 2012 at 1:12 PM, Bryan Keller <[email protected]> wrote:
>>> This is happening during a heavy update. I have a "wide" table with around 4
>>> million rows that have already been inserted. I am adding billions of
>>> columns to the rows.
>>> Each row can have 20+k columns.
>>>
>>> I perform the updates in batches, i.e. I am using the HTable.put(List<Put>)
>>> API. The batch size is 1000 Puts. The columns being added are scattered,
>>> e.g. I may add 20 columns to 1000 different rows in one batch, then in the
>>> next batch add 20 columns to 1000 more rows (which may be the same rows as
>>> or different from the previous batch), and so forth.
>>>
>>> BTW, I tried upping the "xcievers" parameter to 8192, but now I'm getting a
>>> "Too many open files" error. I have the file limit set to 32k.
>>>
>>>
>>> On Feb 6, 2012, at 11:59 AM, Jean-Daniel Cryans wrote:
>>>
>>>> The number of regions is the first thing to check; after that it's about the
>>>> actual number of blocks opened. Is the issue happening during a heavy
>>>> insert? In that case I guess you could end up with hundreds of opened
>>>> files if the compactions are piling up. Setting a bigger memstore
>>>> flush size would definitely help... but then again, if your insert
>>>> pattern is random enough, all 200 regions will have filled memstores and
>>>> you'd end up with hundreds of super small files...
>>>>
>>>> Please tell us more about the context in which this issue happens.
>>>>
>>>> J-D
>>>>
>>>> On Mon, Feb 6, 2012 at 11:42 AM, Bryan Keller <[email protected]> wrote:
>>>>> I am trying to resolve an issue with my cluster when I am loading a bunch
>>>>> of data into HBase. I am hitting the "xciever" limit on the data nodes.
>>>>> Currently I have this set to 4096. The data node is logging "xceiverCount
>>>>> 4097 exceeds the limit of concurrent xcievers 4096". The regionservers
>>>>> eventually shut down. I have read the various threads on this issue.
>>>>>
>>>>> I have 4 datanodes/regionservers. Each regionserver has only around 200
>>>>> regions. The table has 2 column families. I have the region file size set
>>>>> to 500mb, and I'm using Snappy compression. This problem is occurring on
>>>>> HBase 0.90.4 and Hadoop 0.20.2 (both Cloudera cdh3u3).
>>>>>
>>>>> From what I have read, the number of regions on a node can cause the
>>>>> xceiver limit to be reached, but it doesn't seem like I have an excessive
>>>>> number of regions. I want the table to scale higher, so simply upping the
>>>>> xceiver limit could perhaps get my table functional for now, but it seems
>>>>> it would only be a temporary fix.
>>>>>
>>>>> Is the number of regions the only factor that can cause this problem, or are
>>>>> there other factors involved that I may be able to adjust?
>>>>>
>>>
>
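For reference, a minimal hbase-site.xml sketch of the region server settings discussed in the thread, following J-D's WAL arithmetic for a 16GB heap (lower barrier 0.35*16GB = 5.6GB, so roughly 5.6GB / 64MB, about 90 logs, if the WAL block size stays at 64MB). hbase.regionserver.maxlogs and hbase.regionserver.hlog.blocksize are named by J-D above; hbase.hregion.memstore.flush.size and hbase.hregion.max.filesize are assumed to be the usual flush size and max filesize properties he refers to. Values are only illustrative, not tested recommendations:

    <!-- Sketch only: values follow the arithmetic in the thread. -->
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>4294967296</value>
      <!-- 4GB region max filesize, the value Bryan plans to try -->
    </property>
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>536870912</value>
      <!-- 512MB memstore flush size, also from Bryan's plan -->
    </property>
    <property>
      <name>hbase.regionserver.maxlogs</name>
      <value>90</value>
      <!-- ~90 WALs * 64MB is about 5.6GB, matching the 0.35*16GB lower barrier -->
    </property>
    <property>
      <name>hbase.regionserver.hlog.blocksize</name>
      <value>67108864</value>
      <!-- 64MB, the per-WAL size assumed in the math above -->
    </property>

As J-D notes, this only delays the forced flushes; fewer regions, or writing to fewer of them at once, addresses the cause rather than the symptom.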

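And a minimal Java sketch, against the 0.90-era HTable.put(List<Put>) API Bryan mentions, of the "more sequential pattern" idea: submit each batch of 1000 Puts in row-key order so that a batch lands in a narrow slice of the key space, and hence only a few regions, rather than spraying all 200. The table name "mytable", family "f1", and the makeRow/makeQualifier/makeValue helpers are placeholders, and sorting a batch only pays off if the data source itself is produced in roughly key order so consecutive batches also stay close together:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SortedBatchLoader {

        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            // "mytable" and "f1" are placeholders for the real table and family.
            HTable table = new HTable(conf, "mytable");

            List<Put> batch = new ArrayList<Put>(1000);
            for (int i = 0; i < 1000; i++) {
                // makeRow/makeQualifier/makeValue stand in for however the
                // source data produces row keys, column names and cell values.
                Put put = new Put(makeRow(i));
                put.add(Bytes.toBytes("f1"), makeQualifier(i), makeValue(i));
                batch.add(put);
            }

            // Sort the batch by row key; combined with a roughly ordered data
            // source, this keeps each batch within a contiguous range of regions.
            Collections.sort(batch, new Comparator<Put>() {
                public int compare(Put a, Put b) {
                    return Bytes.compareTo(a.getRow(), b.getRow());
                }
            });

            table.put(batch);   // same HTable.put(List<Put>) call Bryan already uses
            table.close();
        }

        private static byte[] makeRow(int i) {
            return Bytes.toBytes(String.format("row-%08d", i));
        }

        private static byte[] makeQualifier(int i) {
            return Bytes.toBytes("col-" + i);
        }

        private static byte[] makeValue(int i) {
            return Bytes.toBytes("value-" + i);
        }
    }

For truly random sources that cannot be sorted this way, the bulk load path J-D links above sidesteps the memstore and WAL pressure entirely.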