[ https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880134#comment-13880134 ]

Lars Hofhansl commented on HBASE-9740:
--------------------------------------

The same would happen when we try to open a region with a missing compression codec.
Patch looks good to me. I like that the number of attempts is scaled to the number of regionservers.
For 0.94, I would feel better if there was a simple test.
This is not an issue in 0.96 or later, right?

> A corrupt HFile could cause endless attempts to assign the region without a chance of success
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9740
>                 URL: https://issues.apache.org/jira/browse/HBASE-9740
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 0.94.16
>            Reporter: Aditya Kishore
>            Assignee: Aditya Kishore
>         Attachments: patch-9740_0.94.txt
>
>
> As described in HBASE-9737, a corrupt HFile in a region could lead to an
> assignment storm in the cluster since the Master will keep trying to assign
> the region to each region server one after another and obviously none will
> succeed.
> The region server, upon detecting such a scenario should mark the region as
> "RS_ZK_REGION_FAILED_ERROR" (or something to the effect) in the Zookeeper
> which should indicate the Master to stop assigning the region until the error
> has been resolved (via an HBase shell command, probably "assign"?)

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
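
For illustration only, here is a minimal Java sketch of the idea discussed in this thread: capping the number of failed open attempts per region, with the cap scaled to the number of live region servers. It is not taken from patch-9740_0.94.txt and does not use HBase's actual API. The class AssignmentAttemptTracker, its methods, and the attemptsPerServer parameter are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not HBase's AssignmentManager: counts failed
 * region opens and tells the caller when to stop retrying.
 */
public class AssignmentAttemptTracker {
    // Failed open attempts recorded so far, keyed by region name.
    private final Map<String, Integer> failedOpens = new HashMap<>();
    // Assumed tuning knob: how many tries each live server is allowed.
    private final int attemptsPerServer;

    public AssignmentAttemptTracker(int attemptsPerServer) {
        if (attemptsPerServer < 1) {
            throw new IllegalArgumentException("attemptsPerServer must be >= 1");
        }
        this.attemptsPerServer = attemptsPerServer;
    }

    /** The cap grows with the cluster, so every live server can get a turn. */
    public int maxAttempts(int liveRegionServers) {
        return Math.max(1, liveRegionServers) * attemptsPerServer;
    }

    /**
     * Records one failed open of the region. Returns true if the caller
     * may try another assignment, false once the cap has been reached.
     */
    public boolean recordFailureAndShouldRetry(String regionName, int liveRegionServers) {
        int failures = failedOpens.merge(regionName, 1, Integer::sum);
        return failures < maxAttempts(liveRegionServers);
    }

    /** Clears the counter, e.g. after an operator repairs the file and reassigns. */
    public void reset(String regionName) {
        failedOpens.remove(regionName);
    }
}
```

With three live region servers and attemptsPerServer set to 1, the third recorded failure returns false, which is the point where a caller would stop reassigning the region and wait for manual intervention.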