[ 
https://issues.apache.org/jira/browse/HBASE-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617124#action_12617124
 ] 

stack commented on HBASE-776:
-----------------------------

Andrew, can you run with DEBUG level logging -- we might get a clue why it goes 
'deaf' -- and can you up your ulimit for file descriptors from 1024 (because 
out-of-FDs can manifest in weird ways)?  See FAQ for how.
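
To see what limit a process actually inherits before raising it, a minimal sketch along these lines may help. It assumes a Unix-like host with Python available; the helper names `fd_limits` and `is_low` are made up for illustration and are not part of HBase or its FAQ.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_low(soft, threshold=1024):
    """True when the soft limit is finite and at or below the threshold.

    The 1024 default mirrors the value mentioned in the comment above;
    it is an illustrative cut-off, not an HBase recommendation.
    """
    return soft != resource.RLIM_INFINITY and soft <= threshold


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={soft} hard={hard}")
    if is_low(soft):
        print("soft limit is at or below 1024; consider raising it")
```

Run it as the same user, and from the same kind of shell, that starts the regionserver, since limits are inherited per process.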

A socket timeout when scanning meta should provoke relocation of meta I'd say.  
Need to dig more.  From your logs, it happens over and over and the master just 
sits there dumb.
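
As a rough illustration of the idea (not the master's actual code), repeated meta-scan timeouts could be counted and used as the trigger for relocation. The class name `MetaScanMonitor`, the `max_timeouts` threshold, and the choice to require consecutive failures are all assumptions made for this sketch.

```python
class MetaScanMonitor:
    """Hypothetical tracker of meta-scan outcomes; not HBase code."""

    def __init__(self, max_timeouts=3):
        # Assumed threshold: how many timeouts in a row are tolerated.
        self.max_timeouts = max_timeouts
        self.consecutive_timeouts = 0

    def record_scan(self, timed_out):
        """Record one scan attempt.

        Returns True once the number of consecutive timeouts reaches
        the threshold, i.e. when META should be considered for relocation.
        """
        if timed_out:
            self.consecutive_timeouts += 1
        else:
            # Any successful scan resets the streak.
            self.consecutive_timeouts = 0
        return self.consecutive_timeouts >= self.max_timeouts
```

With `max_timeouts=1` this reduces to relocating on the first socket timeout, which is closest to what the comment suggests; a higher threshold trades recovery time for tolerance of transient network hiccups.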

Regarding the pounding of META, you'd think the META info would be cached.  Let 
me run a load here and log the META region accesses.  How many clients are you 
running?

> Master not reassigning .META. from failed/failing regionserver
> --------------------------------------------------------------
>
>                 Key: HBASE-776
>                 URL: https://issues.apache.org/jira/browse/HBASE-776
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.2.0
>         Environment: CentOS x86_64, JDK 1.6, Hadoop 0.17.1, HBase 0.2.0, 
> r679585, Fri Jul 25 16:47:26 UTC 2008
>            Reporter: Andrew Purtell
>         Attachments: hbase-hadoop-master-sjdc-atr-dc-1.log, 
> hbase-hadoop-regionserver-sjdc-atr-dc-13.log, master_gui.png
>
>
> In our environment sometimes the regionserver carrying META is also assigned 
> to the 'content' table, into which objects retrieved from Internet crawling 
> are stored. For an unclear reason the regionserver occasionally goes "deaf" 
> (separate issue) and when this happens META is no longer available. The 
> master then never reassigns META, so the whole cluster is down from this 
> point and does not recover. Logs attached.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
