Re: HBASE -- RS expire?

2012-07-06 Thread registration
Is your ZK managed by HBase or are you managing it yourself? BTW - All ZK nodes should be reachable by all nodes in the cluster. The YouAreDeadException would be in RS logs if at all. On Thursday, July 5, 2012 at 9:38 PM, Jay Wilson wrote

RE: HBASE -- RS expire?

2012-07-06 Thread Pablo Musa
Is your ZK managed by HBase or are you managing it yourself? Is it a good or bad option? What are the pros and cons of each one? Thanks, Pablo

HBASE -- RS expire?

2012-07-05 Thread Jay Wilson
Finally my HMaster has stabilized and been running for 7 hours. I believe my networking issues are behind me now. Thank you everyone for the help. The new issue is that my RSes continue to die after about 20 minutes. Again the cluster is idle. No jobs are running and I get this on all of my RSes at

Re: HBASE -- RS expire?

2012-07-05 Thread Amandeep Khurana
On Thursday, July 5, 2012 at 8:25 PM, Jay Wilson wrote: Finally my HMaster has stabilized and been running for 7 hours. I believe my networking issues are behind me now. Thank you everyone for the help. Awesome. Looks like the same issue is biting you with the RS too. The RS isn't

Re: HBASE -- RS expire?

2012-07-05 Thread Jay Wilson
I don't see that in the RS logs. Would I see that in the ZK logs? At this point there is no network. Just a switch. I reduced the number of nodes to 40 and had all of them placed on the same switch with a single vlan. I even had the network techs use a completely different switch just to be

Re: HBASE -- RS expire?

2012-07-05 Thread Amandeep Khurana
The timeout can be configured using the session timeout configuration. The default for that is 180s, but that means that if the RS doesn't heartbeat to ZK for 180s, it's considered dead. Unless the machines are really loaded or GCs are pausing the RS processes, I don't see any other reason
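
As a rough illustration of the session timeout described in this message, the sketch below reads the timeout from an hbase-site.xml document and checks whether a given pause would outlast it. It assumes the setting is the `zookeeper.session.timeout` property, in milliseconds; the sample XML, the 60000 value, and the helper names `session_timeout_ms` and `would_expire` are hypothetical and not taken from the thread. The ZooKeeper server may also cap the timeout it actually grants, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# 180s default mentioned in the message, expressed in milliseconds.
DEFAULT_SESSION_TIMEOUT_MS = 180_000

# Hypothetical hbase-site.xml content, for illustration only.
SAMPLE_HBASE_SITE = """<configuration>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
</configuration>"""


def session_timeout_ms(hbase_site_xml: str) -> int:
    """Return the configured session timeout, or the default if unset."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == "zookeeper.session.timeout":
            return int(prop.findtext("value"))
    return DEFAULT_SESSION_TIMEOUT_MS


def would_expire(pause_ms: int, timeout_ms: int) -> bool:
    """True if an RS that stays silent for pause_ms would be considered dead."""
    return pause_ms >= timeout_ms


if __name__ == "__main__":
    timeout = session_timeout_ms(SAMPLE_HBASE_SITE)
    # A 5s GC pause is harmless here; a pause longer than the timeout is not.
    print(timeout, would_expire(5_000, timeout), would_expire(200_000, timeout))
```

A check like this only models the rule stated in the message (no heartbeat for the full timeout means the RS is considered dead); it does not say why heartbeats stop.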

Re: HBASE -- RS expire?

2012-07-05 Thread Jay Wilson
Funny you mention that. I asked the techs to set it up that way. I went to pull my ZK logs and found that 1 RS is still running. What is interesting is that RS is connected to ZK on devrackA-05. The 2 RSes that died were connected to ZK on devrackA-03. devrackA-03 has ZK and HMaster on it.

Re: HBASE -- RS expire?

2012-07-05 Thread Amandeep Khurana
Is your ZK managed by HBase or are you managing it yourself? BTW - All ZK nodes should be reachable by all nodes in the cluster. The YouAreDeadException would be in RS logs if at all. On Thursday, July 5, 2012 at 9:38 PM, Jay Wilson wrote: Funny you mention that. I asked the techs to set it