[ https://issues.apache.org/jira/browse/HBASE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791665#comment-13791665 ]
ramkrishna.s.vasudevan commented on HBASE-9740:
-----------------------------------------------

>>moving the region offline and raising some sort of alarm which users can
>>monitor would be preferable.
I agree. That is preferable. I can attach the patch I mentioned tomorrow.

> A corrupt HFile could cause endless attempts to assign the region without a
> chance of success
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9740
>                 URL: https://issues.apache.org/jira/browse/HBASE-9740
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Aditya Kishore
>            Assignee: Aditya Kishore
>
> As described in HBASE-9737, a corrupt HFile in a region could lead to an
> assignment storm in the cluster, since the Master will keep trying to assign
> the region to each region server one after another, and none will succeed.
> The region server, upon detecting such a scenario, should mark the region as
> "RS_ZK_REGION_FAILED_ERROR" (or something to that effect) in ZooKeeper,
> which should signal the Master to stop assigning the region until the error
> has been resolved (via an HBase shell command, probably "assign"?)

--
This message was sent by Atlassian JIRA
(v6.1#6144)
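
The behaviour proposed in the description can be illustrated with a small simulation. This is only a sketch under stated assumptions, not HBase code: the `Region`, `try_open`, `assign`, and `manual_assign` names are hypothetical, ZooKeeper is modelled as a plain field on the region, and `FAILED_ERROR` stands in for the suggested "RS_ZK_REGION_FAILED_ERROR" state, which the issue itself offers only as a possible name.

```python
from dataclasses import dataclass
from enum import Enum, auto


class RegionState(Enum):
    OFFLINE = auto()
    OPEN = auto()
    # Hypothetical state, modelled on the proposed "RS_ZK_REGION_FAILED_ERROR".
    FAILED_ERROR = auto()


@dataclass
class Region:
    name: str
    hfile_corrupt: bool = False
    state: RegionState = RegionState.OFFLINE  # stands in for the ZooKeeper node


def try_open(region, mark_failed):
    """Region server side: try to open the region, return True on success."""
    if region.hfile_corrupt:
        if mark_failed:
            # Proposed behaviour: record the unrecoverable failure.
            region.state = RegionState.FAILED_ERROR
        return False
    region.state = RegionState.OPEN
    return True


def assign(region, servers, mark_failed, max_attempts=100):
    """Master side: return the list of servers the region was offered to.

    max_attempts only bounds the simulation; without the failed marker
    the loop would otherwise never end for a corrupt region.
    """
    tried = []
    while len(tried) < max_attempts:
        if region.state is RegionState.FAILED_ERROR:
            break  # stop assigning until an operator intervenes
        server = servers[len(tried) % len(servers)]
        tried.append(server)
        if try_open(region, mark_failed):
            break
    return tried


def manual_assign(region, servers):
    """Operator side: clear the marker (e.g. after a repair) and retry."""
    region.state = RegionState.OFFLINE
    return assign(region, servers, mark_failed=True)
```

With `mark_failed=False` a corrupt region is offered to every server in turn until the simulation's attempt limit is reached, which corresponds to the assignment storm. With `mark_failed=True` the first failed open sets the marker, later `assign` calls make no attempts, and only `manual_assign` retries.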