I increased the max region file size to 4GB, so I should now have fewer than 200 regions per node, more like 25. With 2 column families that will be 50 memstores per node, so the 5.6GB lower barrier would then flush files of about 112MB. Still not close to the memstore limit, but shouldn't I be much better off than before?
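
For reference, here is the back-of-the-envelope arithmetic behind those numbers as a plain Java sketch (nothing HBase-specific; the 0.35 lower-barrier fraction and the 25-region estimate are just the figures from this thread):

// Rough flush-size estimate, assuming the numbers discussed here: a 16GB heap,
// the default 0.35 global memstore lower limit, ~25 regions per node, and 2 column families.
public class FlushSizeEstimate {
    public static void main(String[] args) {
        double heapGb = 16.0;          // region server heap (-Xmx)
        double lowerBarrier = 0.35;    // hbase.regionserver.global.memstore.lowerLimit default
        int regionsPerNode = 25;       // rough count after raising the max region file size to 4GB
        int columnFamilies = 2;

        int memstores = regionsPerNode * columnFamilies;         // 50 memstores per node
        double globalLimitGb = heapGb * lowerBarrier;            // 5.6GB lower barrier
        double perMemstoreMb = globalLimitGb * 1024 / memstores; // ~115MB (~112MB if 1GB is taken as 1000MB)

        System.out.printf("memstores=%d, lower barrier=%.1fGB, ~%.0fMB per flush%n",
                memstores, globalLimitGb, perMemstoreMb);
    }
}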
Inserting sequentially may or may not be an option for me. I am storing a live feed of data from an external source, so it could prove tricky.

On Feb 6, 2012, at 3:56 PM, Jean-Daniel Cryans wrote:

> Good but...
>
> Keep in mind that if you just increase the max filesize and memstore size without changing anything else, then you'll be in the same situation, except with 16GB it'll take just a bit more time to get there.
>
> Here's the math:
>
> 200 regions of 2 families means 400 memstores to fill. Assuming a completely random pattern across all the regions and families, you're going to fill all 400 memstores at the same rate. With 4GB you hit the memstore lower barrier at 0.35*4=1.4GB, at which point the memstores have around 3.5MB each and the biggest region will flush. Currently we flush whole regions, not just families, so it would flush 2 files of about 3.5MB each. About 7MB later, another region will flush like that, and so on and so forth.
>
> Now with 16GB you have 5.6GB, which is a lot more room, but you would still only flush files of about 14MB... and unfortunately it's going to flush before even that. By default HBase will keep a maximum of 32 write-ahead logs (WALs), each of about 64MB, which is almost 2GB total. Since your pattern is random, each log will contain rows from almost every region, meaning that in order to get rid of the older logs to make room for newer ones it will have to force flush ALL your regions. And it's gonna happen again 2GB later.
>
> This is why I recommended that you try to insert sequentially into only a few regions at a time, as this will play more nicely with the WALs.
>
> Note that you could configure bigger WALs, or more of them, in order to match the lower barrier (you'd tweak hbase.regionserver.maxlogs and hbase.regionserver.hlog.blocksize), but it's still not as good as having a few regions or using fewer of them at the same time.
>
> J-D
>
> On Mon, Feb 6, 2012 at 3:18 PM, Bryan Keller <[email protected]> wrote:
>> Yes, the insert pattern is random, and yes, the compactions are going through the roof. Thanks for pointing me in that direction. I am going to try increasing the region max filesize to 4GB (it was set to 512MB) and the memstore flush size to 512MB (it was 128MB). I'm also going to increase the heap to 16GB (right now it is 4GB).
>>
>> On Feb 6, 2012, at 1:33 PM, Jean-Daniel Cryans wrote:
>>
>>> OK, this helps. We're still missing a description of your insert pattern, but I bet it's pretty random considering what's happening to your cluster.
>>>
>>> I'm guessing you didn't set up metrics, or else you would have told us that the compaction queues are through the roof during the import, but at this point I'm pretty sure that's the case.
>>>
>>> To solve this your choices are:
>>>
>>> - Do bulk uploads instead of brute-forcing it, so that you skip those issues entirely. See http://hbase.apache.org/bulk-loads.html
>>> - Get the number of regions down to something more manageable; you didn't say how much memory you gave to HBase so I can't say exactly how many you need, but it's usually no more than 20. Then set the memstore flush size and max file size accordingly. The goal here is to flush/compact as little as possible.
>>> - Keep your current setup, but slow down the insert rate so that data can be compacted over and over again without overrunning your region servers.
>>> - Use a more sequential pattern so that you hit only a few regions at a time. This is like the second solution, but trying to make it work with your current setup. It might not be practical for you, as it really depends on how easily you can sort your data source.
>>>
>>> Let us know if you need more help,
>>>
>>> J-D
>>>
>>> On Mon, Feb 6, 2012 at 1:12 PM, Bryan Keller <[email protected]> wrote:
>>>> This is happening during a heavy update. I have a "wide" table with around 4 million rows that have already been inserted, and I am adding billions of columns to those rows. Each row can have 20k+ columns.
>>>>
>>>> I perform the updates in batches, i.e. I am using the HTable.put(List<Put>) API. The batch size is 1000 Puts. The columns being added are scattered, e.g. I may add 20 columns to 1000 different rows in one batch, then in the next batch add 20 columns to 1000 more rows (which may be the same rows as, or different from, the previous batch), and so forth.
>>>>
>>>> BTW, I tried upping the "xcievers" parameter to 8192, but now I'm getting a "Too many open files" error. I have the file limit set to 32k.
>>>>
>>>> On Feb 6, 2012, at 11:59 AM, Jean-Daniel Cryans wrote:
>>>>
>>>>> The number of regions is the first thing to check; after that it's about the actual number of blocks opened. Is the issue happening during a heavy insert? In that case I guess you could end up with hundreds of opened files if the compactions are piling up. Setting a bigger memstore flush size would definitely help... but then again, if your insert pattern is random enough, all 200 regions will have filled memstores, so you'd end up with hundreds of super small files...
>>>>>
>>>>> Please tell us more about the context in which this issue happens.
>>>>>
>>>>> J-D
>>>>>
>>>>> On Mon, Feb 6, 2012 at 11:42 AM, Bryan Keller <[email protected]> wrote:
>>>>>> I am trying to resolve an issue with my cluster when I am loading a bunch of data into HBase. I am reaching the "xciever" limit on the data nodes. Currently I have it set to 4096, and the data nodes are logging "xceiverCount 4097 exceeds the limit of concurrent xcievers 4096". The regionservers eventually shut down. I have read the various threads on this issue.
>>>>>>
>>>>>> I have 4 datanodes/regionservers. Each regionserver has only around 200 regions. The table has 2 column families. I have the region file size set to 500MB, and I'm using Snappy compression. This problem is occurring on HBase 0.90.4 and Hadoop 0.20.2 (both Cloudera cdh3u3).
>>>>>>
>>>>>> From what I have read, the number of regions on a node can cause the xceiver limit to be reached, but it doesn't seem like I have an excessive number of regions. I want the table to scale higher, so simply upping the xceiver limit could perhaps get my table functional for now, but it seems it would only be a temporary fix.
>>>>>>
>>>>>> Is the number of regions the only factor that can cause this problem, or are there other factors involved that I may be able to adjust?
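
Regarding the hbase.regionserver.maxlogs / hbase.regionserver.hlog.blocksize tweak mentioned above, here is a rough sketch of how one might check whether total WAL capacity covers the memstore lower barrier (this assumes the HBase client jars and an hbase-site.xml are on the classpath; the 16GB heap figure is just the value from this thread):

// Compares total WAL capacity against the global memstore lower barrier. With the
// defaults (32 logs of ~64MB) the WALs fill up at ~2GB, well before a 5.6GB barrier,
// which forces flushes of every region as old logs are rolled.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WalCapacityCheck {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        int maxLogs = conf.getInt("hbase.regionserver.maxlogs", 32);
        long logSize = conf.getLong("hbase.regionserver.hlog.blocksize", 64L * 1024 * 1024);
        float lowerLimit = conf.getFloat("hbase.regionserver.global.memstore.lowerLimit", 0.35f);

        double walCapacityGb = (double) maxLogs * logSize / (1024 * 1024 * 1024);
        double lowerBarrierGb = 16.0 * lowerLimit; // assuming the 16GB heap from this thread

        System.out.printf("WAL capacity ~%.1fGB vs memstore lower barrier ~%.1fGB%n",
                walCapacityGb, lowerBarrierGb);
    }
}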

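For context, the batch update pattern described in the thread boils down to roughly this (a simplified sketch against the 0.90-era client API; the table, family, and qualifier names are made up):

// One batch: 1000 Puts, each adding ~20 columns to a row, sent via HTable.put(List<Put>).
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchUpdateSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "wide_table");  // hypothetical table name
        byte[] family = Bytes.toBytes("cf1");           // one of the two column families
        try {
            List<Put> batch = new ArrayList<Put>(1000);
            for (int row = 0; row < 1000; row++) {
                // In practice the rows are scattered, which is what hits many regions at once.
                Put put = new Put(Bytes.toBytes("row-" + row));
                for (int col = 0; col < 20; col++) {
                    put.add(family, Bytes.toBytes("col-" + col), Bytes.toBytes("value"));
                }
                batch.add(put);
            }
            table.put(batch);       // the HTable.put(List<Put>) call from the thread
            table.flushCommits();
        } finally {
            table.close();
        }
    }
}

Feeding the rows in rough key order, so that consecutive batches land in only a few regions at a time, is essentially what the "more sequential pattern" suggestion above amounts to.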