For the 200GB push, I definitely recommend you look at bulk loading. It will be way more efficient.
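A rough back-of-the-envelope sketch of why direct puts at this volume are hard on the region servers, using the numbers quoted further down in this thread (200GB, 200 regions, 128MB flush size, Hstore.blocking.storefiles=10). It assumes the data is spread evenly across regions, a single column family, and that every flush happens at the full configured flush size; the function name is made up for illustration. Bulk loading writes HFiles directly and skips the memstore, so none of these flushes happen.

```python
import math

GB = 1024 ** 3
MB = 1024 ** 2


def flushes_per_region(total_bytes, regions, flush_size_bytes):
    """Estimate memstore flushes per region for an evenly spread load.

    Hypothetical helper: assumes one column family and that each flush
    happens exactly at the configured flush size.
    """
    per_region = total_bytes / regions
    return math.ceil(per_region / flush_size_bytes)


# Numbers taken from the thread: 200GB into a table with 200 regions,
# Flush.size = 128M, Hstore.blocking.storefiles = 10.
flushes = flushes_per_region(200 * GB, 200, 128 * MB)
blocking_store_files = 10

print(f"flushes per region: {flushes}")
print(f"blocking limit:     {blocking_store_files}")
```

Under these assumptions each region sees 8 flushes, and each flush produces a new store file, so compactions have to keep up to stay under the blocking limit of 10. If flushes are actually smaller than the configured size, the count is higher.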
For the block cache being 25%, it's still low. The total of the two (memstore and block cache) should be 80%. You need to increase them. I'm sure the logs will tell us even more.

JMS

2016-03-18 18:42 GMT-04:00 Frank Luo <j...@merkleinc.com>:

> Sorry, I sent you the config from the Master.
>
> Here is the conf from a region server. The block cache is set to 0.25.
>
> Usage pattern: we have all kinds of reads/writes for different clients at
> different times. But the one under stress performs several reads and then a
> direct put. We typically bring down region servers when trying to put
> 200GB worth of data into a table with 200 regions with this kind of pattern.
>
> I will find a log file and send it to you, Jean.
>
> *From:* Frank Luo
> *Sent:* Friday, March 18, 2016 5:11 PM
> *To:* 'Jean-Marc Spaggiari'; user
> *Subject:* RE: is it a good idea to disable tables not currently hot?
>
> Config attached.
>
> *From:* Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org
> <jean-m...@spaggiari.org>]
> *Sent:* Friday, March 18, 2016 3:33 PM
> *To:* user
> *Cc:* Frank Luo
> *Subject:* Re: is it a good idea to disable tables not currently hot?
>
> Indeed ;) Frank, if you can paste the entire config file somewhere it might
> help.
>
> JMS
>
> 2016-03-18 16:30 GMT-04:00 Ted Yu <yuzhih...@gmail.com>:
>
> bq. By default memstore is 40%. Here it's 24%
>
> bq. Memstore.lowerLimit=0.24
>
> J-M:
> Looks like you misread the config Frank listed.
>
> On Fri, Mar 18, 2016 at 12:36 PM, Jean-Marc Spaggiari <
> jean-m...@spaggiari.org> wrote:
>
> > By default memstore is 40%. Here it's 24%. There is a lot you might want
> > to look at on your cluster and use case :(
> >
> > 1) You might have long GC pauses causing issues. Think about off-heap
> > cache and reduce the heap to less than 20GB.
> > 2) Way too many regions. Think about your use cases and table design to
> > reduce that. Increase region size to 10GB.
> > 3) Increase your memstore to 40%. If your use case is mostly puts and you
> > have issues with that, increase it.
> > 4) Take a look at your flush size. It's useless to increase it to 256MB
> > if you are already flushing only a few KBs at a time.
> > 5) etc. :(
> >
> > JMS
> >
> > 2016-03-18 15:26 GMT-04:00 Frank Luo <j...@merkleinc.com>:
> >
> > > Anil/Jean,
> > >
> > > Thanks for the tips. Very helpful.
> > >
> > > To answer your question: I just checked, and the region server's heap
> > > is 32G, instead of 36G as I previously stated, but it is in the same
> > > range and I do see long pauses on GC.
> > >
> > > I think the reason it was set to a high value was that we used to have
> > > 2000 regions per server, before we increased region file size from the
> > > default to compressed 5G.
> > >
> > > So what should be the right heap size given a 5G file size and 400
> > > regions per server on an 80-node cluster?
> > >
> > > At this time, I think the memstore-related settings are all defaults
> > > from HDP.
> > >
> > > Flush.size = 128M
> > > Memstore.lowerLimit=0.24
> > > Memstore.upperLimit=0.25
> > > Hstore.blocking.storefiles=10
> > >
> > > -----Original Message-----
> > > From: anil gupta [mailto:anilgupt...@gmail.com]
> > > Sent: Friday, March 18, 2016 12:37 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: is it a good idea to disable tables not currently hot?
> > >
> > > @Frank, regarding write amplification:
> > > 1. What is your flush size? The default is 128 MB. You should increase
> > > your "hbase.hregion.memstore.flush.size" so that you don't run over the
> > > limit of store files.
> > > 2. Have a look at "hbase.regionserver.global.memstore.lowerLimit".
> > > 3. Your heap size is also too big. Maybe you also run into GC issues.
> > > Have you checked your GC logs?
> > > 4. IMO, blocking writes at 9 files might be too low for a big
> > > Region Server. So, you can also consider increasing that.
> > >
> > > On Fri, Mar 18, 2016 at 10:22 AM, Frank Luo <j...@merkleinc.com> wrote:
> > >
> > > > Ted,
> > > >
> > > > Thanks for sharing. I learned something today.
> > > >
> > > > But I guess it doesn't apply to my case. It is true that I only run a
> > > > client for a few hours in a day, but the data is not date based.
> > > >
> > > > -----Original Message-----
> > > > From: Ted Yu [mailto:yuzhih...@gmail.com]
> > > > Sent: Friday, March 18, 2016 12:10 PM
> > > > To: user@hbase.apache.org
> > > > Subject: Re: is it a good idea to disable tables not currently hot?
> > > >
> > > > Frank:
> > > > Can you take a look at the following to see if it may help with your
> > > > use case(s)?
> > > >
> > > > HBASE-15181 A simple implementation of date based tiered compaction
> > > >
> > > > Cheers
> > > >
> > > > On Fri, Mar 18, 2016 at 9:58 AM, Frank Luo <j...@merkleinc.com> wrote:
> > > >
> > > > > There are two reasons I am hesitating going that route.
> > > > >
> > > > > One is that most of the tables are fairly small. Going to 10GB will
> > > > > force tables to shrink to some nodes but not evenly distributed
> > > > > around the cluster, hence discouraging parallelism. But I think I
> > > > > can manage this issue if the second is resolved.
> > > > >
> > > > > The second issue, which I have battled with for two years now, is
> > > > > that I am doing online puts, which occasionally trigger compactions
> > > > > when a region is heavily inserted into, and whenever that happens,
> > > > > all subsequent reads/writes are on hold and I can see timeout errors
> > > > > on the client side. A typical compaction runs for 4 minutes now and
> > > > > I have to increase timeouts in a number of places to accommodate
> > > > > that. So if I increase the size to 10 GB, will compaction time
> > > > > double?
> > > > >
> > > > > -----Original Message-----
> > > > > From: Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org]
> > > > > Sent: Friday, March 18, 2016 11:34 AM
> > > > > To: user
> > > > > Subject: Re: is it a good idea to disable tables not currently hot?
> > > > >
> > > > > So you can safely increase your maximum region size to 10GB, which
> > > > > will divide the number of regions by 2. When you are on 1.1.2
> > > > > you can also do online merges to reduce the number of regions. That
> > > > > might help too.
> > > > >
> > > > > JMS
> > > > >
> > > > > 2016-03-18 12:32 GMT-04:00 Frank Luo <j...@merkleinc.com>:
> > > > >
> > > > > > 0.98 on hdp 2.2 currently.
> > > > > >
> > > > > > Soon will be on hdp2.3.4, which has HBase 1.1.2.
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org]
> > > > > > Sent: Friday, March 18, 2016 11:29 AM
> > > > > > To: user
> > > > > > Subject: Re: is it a good idea to disable tables not currently hot?
> > > > > >
> > > > > > Hi Frank,
> > > > > >
> > > > > > It might be doable.
> > > > > >
> > > > > > What HBase version are you running?
> > > > > >
> > > > > > JMS
> > > > > >
> > > > > > 2016-03-18 12:25 GMT-04:00 Frank Luo <j...@merkleinc.com>:
> > > > > >
> > > > > > > No one has experience disabling tables?
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Frank Luo [mailto:j...@merkleinc.com]
> > > > > > > Sent: Thursday, March 17, 2016 4:51 PM
> > > > > > > To: user@hbase.apache.org
> > > > > > > Subject: is it a good idea to disable tables not currently hot?
> > > > > > >
> > > > > > > We have a multi-tenant environment and each client occupies x
> > > > > > > number of HBase regions. We currently have about 500 regions per
> > > > > > > region server and I understand the guideline is less than 200.
> > > > > > > So we need to reduce the region count. Increasing region file
> > > > > > > size is no longer an option because we are already at 5G and I
> > > > > > > don’t want to go higher.
> > > > > > >
> > > > > > > Due to our unique use cases, all clients run for a few
> > > > > > > hours in a day, then are quiet for the rest of the time. So I am
> > > > > > > wondering whether it is a good idea to disable all quiet tables
> > > > > > > and only enable them when they are ready to run. Does anyone
> > > > > > > have experience with that?
> > > > > > >
> > > > > > > One thing I worry about is the Balancer. I am pretty sure the
> > > > > > > balancer will be confused when regions come and go. And I cannot
> > > > > > > afford not to have it running in case region servers crash
> > > > > > > and come back. So does anyone have good ideas on how to handle
> > > > > > > it?
> > > > > > >
> > > > > > > I am already doing compactions myself, so that is not an issue.
> > > > > > >
> > > > > > > Another related question: if a region is enabled but not actively
> > > > > > > read or written, how many resources does it take on the region
> > > > > > > server?
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Frank Luo
> > > > > > >
> > > > > > Merkle was named a leader in Customer Insights Services Providers
> > > > > > by Forrester Research
> > > > > > <http://www.merkleinc.com/who-we-are-customer-relationship-marketing-agency/awards-recognition/merkle-named-leader-forrester?utm_source=emailfooter&utm_medium=email&utm_campaign=2016MonthlyEmployeeFooter>
> > > > > >
> > > > > > Forrester Research report names 500friends, a Merkle Company, a
> > > > > > leader in customer Loyalty Solutions for Midsize Organizations
> > > > > > <http://www.merkleinc.com/who-we-are-customer-relationship-marketing-agency/awards-recognition/500friends-merkle-company-named?utm_source=emailfooter&utm_medium=email&utm_campaign=2016MonthlyEmployeeFooter>
> > > > > >
> > > > > > This email and any attachments transmitted with it are intended
> > > > > > for use by the intended recipient(s) only. If you have received
> > > > > > this email in error, please notify the sender immediately and then
> > > > > > delete it. If you are not the intended recipient, you must not
> > > > > > keep, use, disclose, copy or distribute this email without the
> > > > > > author’s prior permission. We take precautions to minimize the
> > > > > > risk of transmitting software viruses, but we advise you to
> > > > > > perform your own virus checks on any attachment to this message.
> > > > > > We cannot accept liability for any loss or damage caused by
> > > > > > software viruses. The information contained in this communication
> > > > > > may be confidential and may be subject to the attorney-client
> > > > > > privilege.
> > >
> > > --
> > > Thanks & Regards,
> > > Anil Gupta