@Flavio:
One thing you should consider: is the CPU oversaturated?

For instance, if a server has 24 cores and the task tracker is configured to 
run 24 MR tasks simultaneously, then there is no core left for HBase to do 
GC, and the region server may crash.

A few weeks ago my region server would sometimes crash while MR was running, so 
I decided to move the MR cluster to another machine, leaving only DataNode + 
RegionServer running.
Since the change, my region server has kept running without a crash.

My suggestion is to try decreasing the number of MR tasks to around 12 or 
fewer, and see whether the crashes become less frequent.
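
To make that arithmetic concrete, here is a minimal Python sketch of the core 
budget. It assumes an MRv1 TaskTracker, where the per-node limits are the 
mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum 
properties. The function names, the 12 reserved cores, and the 8/4 map/reduce 
split are illustrative assumptions, not tuned values.

```python
def mr_slot_budget(total_cores, reserved_cores):
    """Cores left for MR task slots after reserving some for other daemons
    (RegionServer, DataNode, JVM GC threads)."""
    if total_cores <= 0 or reserved_cores < 0:
        raise ValueError("total_cores must be positive, reserved_cores non-negative")
    return max(total_cores - reserved_cores, 0)


def split_map_reduce(slots, reduce_slots):
    """Split a slot budget into (map_slots, reduce_slots)."""
    reduce_slots = min(reduce_slots, slots)
    return slots - reduce_slots, reduce_slots


# Assumed example: a 24-core node with half the cores kept free for HBase and HDFS.
slots = mr_slot_budget(total_cores=24, reserved_cores=12)
map_slots, reduce_slots = split_map_reduce(slots, reduce_slots=4)
print("mapred.tasktracker.map.tasks.maximum =", map_slots)
print("mapred.tasktracker.reduce.tasks.maximum =", reduce_slots)
```

With 24 slots on 24 cores the reserve is zero, which is the oversaturated case 
described above.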

Best regards,
Henry Hung

-----Original Message-----
From: Dhaval Shah [mailto:prince_mithi...@yahoo.co.in]
Sent: Tuesday, May 27, 2014 8:03 AM
To: user@hbase.apache.org
Subject: Re: HBase cluster design

A few things pop out to me on cursory glance:
- You are using CMSIncrementalMode, which after a long chain of events has a 
tendency to result in the famous Juliet pause of death. Can you try ParNew GC 
instead and see if that helps?
- You should try to reduce the CMSInitiatingOccupancyFraction to avoid a full GC
- Your hbase-env.sh is not setting Xmx at all. Do you know how much RAM you 
are giving to your region servers? It may be too small or too large given your 
use case and machine size
- Your client scanner caching is 1 which may be too small depending on your row 
sizes. You can also override that setting in your scan for the MR job
- You only have 2 ZooKeeper instances, which is not at all recommended. 
ZooKeeper needs a quorum to operate and generally works best with an odd number 
of servers. This probably isn't related to your crashes, but it would 
help stability if you had 1 or 3 ZooKeeper servers
- I am not 100% sure if the version of HBase you are using has MSLAB enabled. 
If not, you should enable it.
- You can try increasing/decreasing the amount of RAM you provide to block 
caches and memstores to suit your use case. I see that you are using the 
defaults here
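
On the ZooKeeper point in the list above, the quorum arithmetic can be sketched 
in a few lines of Python. ZooKeeper needs a strict majority of the ensemble to 
be up, so a second server adds a failure point without adding any fault 
tolerance. The function names are hypothetical; only the majority rule comes 
from ZooKeeper itself.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of an ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a quorum is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 4, 5):
    print(f"{n} servers: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Both 1 and 2 servers tolerate zero failures, while 3 servers tolerate one, 
which is why an odd ensemble size is the usual choice.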

On top of these, when you kick off your MR job to scan HBase, you should call 
setCacheBlocks(false) on the Scan so the job does not churn the block cache.



Regards,
Dhaval


________________________________
 From: Flavio Pompermaier <pomperma...@okkam.it>
To: user@hbase.apache.org; Dhaval Shah <prince_mithi...@yahoo.co.in>
Sent: Friday, 23 May 2014 3:16 AM
Subject: Re: HBase cluster design



The hardware specs are: 4 nodes, each with 48 GB RAM, 24 cores and a 1 TB disk.
I've attached my HBase config files.

Thanks,
Flavio





On Fri, May 23, 2014 at 3:33 AM, Dhaval Shah <prince_mithi...@yahoo.co.in> 
wrote:

>Can you share your hbase-env.sh and hbase-site.xml? And hardware specs of your 
>cluster?
>
>
>
>
>Regards,
>Dhaval
>
>
>________________________________
> From: Flavio Pompermaier <pomperma...@okkam.it>
>To: user@hbase.apache.org
>Sent: Saturday, 17 May 2014 2:49 AM
>Subject: Re: HBase cluster design
>
>
>
>Could you please tell me in detail which parameters you'd like to see, so I
>can look for them and learn the important ones? I'm using Cloudera: CDH4 in
>one cluster and CDH5 in the other.
>
>Best,
>Flavio
>
>On May 17, 2014 2:48 AM, "prince_mithi...@yahoo.co.in" <
>prince_mithi...@yahoo.co.in> wrote:
>
>> Can you describe your setup in more detail? Specifically the amount of
>> heap hbase region servers have and your GC settings. Is your server
>> swapping when your MR jobs are running? Also do your regions go down or your
>> region servers?
>>
>> We run many MR jobs simultaneously on our hbase tables (size is in TBs)
>> along with serving real time requests at the same time. So I can vouch for
>> the fact that a well tuned hbase cluster definitely supports this use case
>> (well-tuned is the key word here)
>>
>>
>>

