[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469993#comment-13469993 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-0.92-security #143 (See [https://builds.apache.org/job/HBase-0.92-security/143/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387210)
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385204)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

tedyu :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.92.3, 0.94.2, 0.96.0
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
>
>
> Looking at some of the recent issues in region assignment,
> RegionAlreadyInTransitionException (RAITE) is one cause after which the
> region assignment may or may not happen (in the sense that we need to wait
> for the TM to assign).
> In HBASE-6317 we hit one problem due to RegionAlreadyInTransitionException
> on master restart.
> Consider the following case: for some reason, such as a master restart or an
> external assign call, we try to assign a region that is already being opened
> on an RS.
> The new call to assign has already changed the state of the znode, so the
> assign currently in progress on the RS is affected and fails. The second
> assignment also fails, with an RAITE. In the end, neither assignment goes
> through. The idea is to find out whether such an RAITE can be retried or not.
> Here again we have the following cases:
> -> The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
> -> The RS may be in the openRegion step.
> -> The RS may be trying to transition OPENING to OPENED.
> -> The RS is yet to add the region to its online regions.
> On any failure in openRegion() and updateMeta() we move the znode to
> FAILED_OPEN, so in these cases getting an RAITE should be OK. In the other
> cases, however, the assignment is stopped.
> The idea is to add the current state of the region assignment to the RIT
> map on the RS side; using that info we can determine whether the assignment
> can be retried or not on getting an RAITE.
> Considering the current work going on in AM, please share whether this is
> needed at least in the 0.92/0.94 versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
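The retry idea in the issue description quoted above (record the current stage of an open in the RS-side RIT map, and let the caller decide from that stage whether a RAITE is retryable) can be sketched roughly as below. This is a minimal illustration, not the committed patch: `RaiteRetrySketch`, `OpenStage`, `StagedTransitionException`, `startOpen`, `recordStage` and `shouldRetry` are all hypothetical names, and the retry policy is only one reading of the description (failures in openRegion() and updateMeta() surface as FAILED_OPEN, so no retry is needed there; the other stages are treated as retry candidates).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative only: every name here is hypothetical, not an HBase class or method. */
public class RaiteRetrySketch {

    /** Stages of a region open on the RS, following the cases listed in the issue. */
    public enum OpenStage {
        OFFLINE_TO_OPENING,       // znode not yet moved from OFFLINE to OPENING
        OPENING_REGION,           // inside openRegion()
        UPDATING_META,            // inside updateMeta()
        OPENING_TO_OPENED,        // moving the znode from OPENING to OPENED
        ADDING_TO_ONLINE_REGIONS  // region not yet added to the online regions
    }

    /** Stand-in for a RAITE that also reports how far the in-flight open has got. */
    public static class StagedTransitionException extends Exception {
        public final OpenStage stage;

        public StagedTransitionException(String region, OpenStage stage) {
            super("Region " + region + " is already in transition at stage " + stage);
            this.stage = stage;
        }
    }

    /** RS side: regions in transition, keyed by region name, with their current stage. */
    private final ConcurrentMap<String, OpenStage> regionsInTransition =
        new ConcurrentHashMap<String, OpenStage>();

    /** RS side: start an open, or reject it while reporting the stage of the first open. */
    public void startOpen(String region) throws StagedTransitionException {
        OpenStage existing = regionsInTransition.putIfAbsent(region, OpenStage.OFFLINE_TO_OPENING);
        if (existing != null) {
            throw new StagedTransitionException(region, existing);
        }
    }

    /** RS side: record the progress of an in-flight open. */
    public void recordStage(String region, OpenStage stage) {
        regionsInTransition.put(region, stage);
    }

    /**
     * Master side, assumed policy: a failure in openRegion() or updateMeta() moves the
     * znode to FAILED_OPEN, so the master can simply wait; other stages may need a retry.
     */
    public static boolean shouldRetry(StagedTransitionException e) {
        switch (e.stage) {
            case OPENING_REGION:
            case UPDATING_META:
                return false;
            default:
                return true;
        }
    }

    public static void main(String[] args) {
        RaiteRetrySketch rs = new RaiteRetrySketch();
        try {
            rs.startOpen("region-a");
            rs.recordStage("region-a", OpenStage.OPENING_REGION);
            rs.startOpen("region-a"); // second assign arrives while the first is in flight
        } catch (StagedTransitionException e) {
            System.out.println(e.getMessage() + " -> retry: " + shouldRetry(e));
        }
    }
}
```

Carrying the stage inside the exception keeps the decision on the caller's side, which matches the issue title's request that the exception "give more info"; which stages are actually safe to retry would still have to be settled against the real AssignmentManager code.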
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469892#comment-13469892 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-0.94-security-on-Hadoop-23 #8 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/8/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387209)
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385209)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.92.3, 0.94.2, 0.96.0
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460753#comment-13460753 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-0.94-security #53 (See [https://builds.apache.org/job/HBase-0.94-security/53/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387209)
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385209)

     Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.92.3, 0.94.2, 0.96.0
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458973#comment-13458973 ]

stack commented on HBASE-6438:
------------------------------

I'd say, leave the patch in trunk. Hopefully we'll do a better fix but until then, keep this improvement.

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458288#comment-13458288 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #180 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/180/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387164)

     Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457968#comment-13457968 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-0.92 #580 (See [https://builds.apache.org/job/HBase-0.92/580/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387210)

     Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457893#comment-13457893 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-0.94 #472 (See [https://builds.apache.org/job/HBase-0.94/472/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387209)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457873#comment-13457873 ]

Ted Yu commented on HBASE-6438:
-------------------------------

Integrated addendum to 0.92 and 0.94 as well.
Thanks for the finding, Chunhui.

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94,
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch,
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch,
> HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457860#comment-13457860 ]

Ted Yu commented on HBASE-6438:
-------------------------------

@Ram:
I will wait for Stack's comment.

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-trunk_2.patch,
> HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch,
> HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
    [ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457857#comment-13457857 ]

Hudson commented on HBASE-6438:
-------------------------------

Integrated in HBase-TRUNK #3346 (See [https://builds.apache.org/job/HBase-TRUNK/3346/])
    HBASE-6438 Addendum checks regionAlreadyInTransitionException when generating region plan (Chunhui) (Revision 1387164)

     Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: 6438-0.92.txt, 6438.addendum, 6438-trunk_2.patch,
> HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch,
> HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457854#comment-13457854 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Ted
I was about to give the patch :). Thanks anyway.
Do we need to revert the patch from Trunk? See my comment at 17/Sep/12 09:57 and also Stack's comment at 25/Aug/12 00:49.

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> --
>
>          Key: HBASE-6438
>          URL: https://issues.apache.org/jira/browse/HBASE-6438
>      Project: HBase
>   Issue Type: Bug
>     Reporter: ramkrishna.s.vasudevan
>     Assignee: rajeshbabu
>      Fix For: 0.96.0, 0.92.3, 0.94.3
>
>  Attachments: 6438-0.92.txt, 6438.addendum, 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
>
>
> Looking at some of the recent issues in region assignment, RegionAlreadyInTransitionException (RAITE) is one case after which the region assignment may or may not happen (in the sense that we need to wait for the TM to assign).
> In HBASE-6317 we hit one problem due to RegionAlreadyInTransitionException on master restart.
> Consider the following case: for some reason, such as a master restart or an external assign call, we try to assign a region that is already being opened on an RS.
> The next call to assign has already changed the state of the znode, so the assignment currently in progress on the RS is affected and fails. The second assignment that was started also fails, with an RAITE. In the end neither assignment goes through. The idea is to determine whether such an RAITE can be retried or not.
> Here again we have the following cases:
> -> The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
> -> The RS may be in the openRegion step.
> -> The RS may be trying to transition OPENING to OPENED.
> -> The RS is yet to add the region to its online regions.
> For any failure in openRegion() and updateMeta() we move the znode to FAILED_OPEN, so in these cases getting an RAITE should be OK. But in the other cases the assignment is stopped.
> The idea is to add the current state of the region assignment to the RIT map on the RS side; using that info we can determine whether the assignment can be retried on getting an RAITE.
> Considering the current work going on in AM, please do share whether this is needed at least in the 0.92/0.94 versions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
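Editorial sketch, not taken from any of the attached patches: the proposal in the issue description above (keep the current step of each in-progress open in the region server's RIT map, so that a caller receiving an RAITE can see how far the open has got) might look roughly like the following. Every name here (RitStateSketch, OpenStep, recordStep, failureMovesZnodeToFailedOpen) is hypothetical and chosen only for illustration. The one rule it encodes comes from the description: failures in openRegion() and updateMeta() move the znode to FAILED_OPEN.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: none of these names come from the HBase code base. */
public class RitStateSketch {

  /** Hypothetical steps of opening a region on a region server, following the issue text. */
  public enum OpenStep {
    ZNODE_OFFLINE_TO_OPENING,
    OPEN_REGION,
    UPDATE_META,
    ZNODE_OPENING_TO_OPENED,
    ADD_TO_ONLINE_REGIONS
  }

  /** Region-server-side RIT map: encoded region name -> step the open is currently in. */
  private final Map<String, OpenStep> regionsInTransition =
      new ConcurrentHashMap<String, OpenStep>();

  /** Records (or overwrites) the step that the open of this region has reached. */
  public void recordStep(String encodedRegionName, OpenStep step) {
    regionsInTransition.put(encodedRegionName, step);
  }

  /** Drops the entry once the open has completed or been abandoned. */
  public void openFinished(String encodedRegionName) {
    regionsInTransition.remove(encodedRegionName);
  }

  /** Returns the recorded step, or null when the region is not in transition here. */
  public OpenStep currentStep(String encodedRegionName) {
    return regionsInTransition.get(encodedRegionName);
  }

  /**
   * Per the issue text, a failure in openRegion() or updateMeta() moves the znode
   * to FAILED_OPEN; for the other steps the description says the assignment stops.
   */
  public static boolean failureMovesZnodeToFailedOpen(OpenStep step) {
    return step == OpenStep.OPEN_REGION || step == OpenStep.UPDATE_META;
  }

  public static void main(String[] args) {
    RitStateSketch rs = new RitStateSketch();
    rs.recordStep("region-a", OpenStep.OPEN_REGION);
    for (OpenStep step : OpenStep.values()) {
      System.out.println(step + " -> FAILED_OPEN on failure: "
          + failureMovesZnodeToFailedOpen(step));
    }
    System.out.println("region-a is at: " + rs.currentStep("region-a"));
  }
}
```

How the master should react to each step (retry on the same server, wait, or force a new plan) is left open here, since the thread does not settle it at this point.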
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457824#comment-13457824 ]

Ted Yu commented on HBASE-6438:
---

Addendum integrated to trunk.
Will apply similar addendum to 0.92 and 0.94 when I get into office.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457641#comment-13457641 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Chunhui
Very good catch. Thanks for your review.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457632#comment-13457632 ]

chunhui shen commented on HBASE-6438:
-

As per the patch, we will reassign the region to the same region server on a RegionAlreadyInTransitionException.
But in fact,
{code}
private void assign(final HRegionInfo region, final RegionState state,
    final boolean setOfflineInZK, final boolean forceNewPlan, boolean hijack) {
  ..
  RegionPlan plan = getRegionPlan(state, forceNewPlan);
  if (plan == null) {
    LOG.debug("Unable to determine a plan to assign " + state);
    this.timeoutMonitor.setAllRegionServersOffline(true);
    return; // Should get reassigned later when RIT times out.
  }
  try {
    LOG.info("Assigning region " + state.getRegion().getRegionNameAsString() +
    ..
}
{code}
we get the region plan again (plan = getRegionPlan(state, forceNewPlan);) before sending the open-region RPC in the retry.
So if forceNewPlan=true, the region may be assigned to another region server even if regionAlreadyInTransitionException=true.

So I think we should modify the code as follows:
RegionPlan plan = getRegionPlan(state, !regionAlreadyInTransitionException && forceNewPlan);
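Editorial sketch: the effect of the change suggested in the comment above can be seen in isolation by looking only at the boolean that is handed to getRegionPlan on each attempt. The class and method names below (RegionPlanRetrySketch, effectiveForceNewPlan) are made up for illustration; only the expression itself comes from the comment.

```java
/** Illustrative only: isolates the boolean expression proposed in the comment. */
public class RegionPlanRetrySketch {

  /**
   * Value passed as the forceNewPlan argument of getRegionPlan.
   * Once an RAITE has been seen, the existing plan is kept so the retry
   * targets the same region server, whatever the caller asked for.
   */
  public static boolean effectiveForceNewPlan(boolean forceNewPlan,
                                              boolean regionAlreadyInTransitionException) {
    return !regionAlreadyInTransitionException && forceNewPlan;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    for (boolean forceNewPlan : values) {
      for (boolean raite : values) {
        System.out.println("forceNewPlan=" + forceNewPlan
            + ", RAITE seen=" + raite
            + " -> pass " + effectiveForceNewPlan(forceNewPlan, raite));
      }
    }
  }
}
```

The only combination whose outcome changes is forceNewPlan=true with an RAITE already seen: without the extra condition a new plan (and possibly a different region server) would be chosen, with it the old plan is reused.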
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456755#comment-13456755 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Ted
bq.That said, this patch could go into 0.92 and 0.94 and we can fix it better in trunk.
As per Stack this patch need not go in to Trunk. Can we revert from Trunk Ted?
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456564#comment-13456564 ]

Hudson commented on HBASE-6438:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #176 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/176/])
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385210)

     Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456539#comment-13456539 ]

Hudson commented on HBASE-6438:
---

Integrated in HBase-0.94 #471 (See [https://builds.apache.org/job/HBase-0.94/471/])
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385209)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456535#comment-13456535 ]

Hudson commented on HBASE-6438:
---

Integrated in HBase-TRUNK #3338 (See [https://builds.apache.org/job/HBase-TRUNK/3338/])
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385210)

     Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456530#comment-13456530 ]

Hudson commented on HBASE-6438:
---

Integrated in HBase-0.92 #578 (See [https://builds.apache.org/job/HBase-0.92/578/])
    HBASE-6438 RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies (Rajesh) (Revision 1385204)

     Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456524#comment-13456524 ]

Ted Yu commented on HBASE-6438:
---

Integrated to 0.92, 0.94 and trunk.
Thanks for the patch, Rajesh.
Thanks for the review, Stack, Lars and Ram.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456515#comment-13456515 ]

Ted Yu commented on HBASE-6438:
---

All tests passed for 0.92 patch.
Will integrate tonight if there is no objection.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456494#comment-13456494 ]

Ted Yu commented on HBASE-6438:
---

I ran the test suite:
{code}
[INFO] HBase - Server FAILURE [45:18.213s]
[INFO] HBase - Hadoop Two Compatibility .. SKIPPED
[INFO] HBase - Integration Tests . SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 45:27.603s
{code}
and got one test failure:
{code}
queueFailover(org.apache.hadoop.hbase.replication.TestReplication): test timed out after 30 milliseconds
{code}
I don't think the above is related to Rajesh's patch.
+1 from me.

> RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
> ---
>
> Key: HBASE-6438
> URL: https://issues.apache.org/jira/browse/HBASE-6438
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: rajeshbabu
> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
>
> Seeing some of the recent issues in region assignment, RegionAlreadyInTransitionException is one reason after which the region assignment may or may not happen (in the sense that we need to wait for the TM to assign).
> In HBASE-6317 we got one problem due to RegionAlreadyInTransitionException on master restart.
> Consider the following case: due to some reason like master restart or an external assign call, we try to assign a region that is already getting opened in a RS.
> Now the next call to assign has already changed the state of the znode, so the current assign that is going on in the RS is affected and it fails. The second assignment that started also fails, getting an RAITE. Finally, neither assignment carries on. The idea is to find whether any such RAITE can be retried or not.
> Here again we have the following cases:
> -> The znode is yet to be transitioned from OFFLINE to OPENING in the RS.
> -> RS may be in the step of openRegion.
> -> RS may be trying to transition OPENING to OPENED.
> -> RS is yet to add the region to the online regions on the RS side.
> On any failure in openRegion() and updateMeta() we are moving the znode to FAILED_OPEN. So in these cases getting an RAITE should be ok. But in other cases the assignment is stopped.
> The idea is to just add the current state of the region assignment to the RIT map on the RS side; using that info we can determine whether the assignment can be retried or not on getting an RAITE.
> Considering the current work going on in AM, pls do share if this is needed at least in the 0.92/0.94 versions?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
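The proposal in the issue description (record the current stage of the open in the region server's RIT map, and let the master use it to decide whether an RAITE can be retried) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class, enum, and method names are all hypothetical and are not taken from the HBase source or from any of the attached patches, and the retry rule encodes just one possible reading of the description (a failure in openRegion() or updateMeta() ends in FAILED_OPEN, so the running open is left alone; every other stage is treated as retryable).

```java
// Hypothetical sketch: every name here is invented for illustration and is
// not taken from the HBase source or from any attached patch.
final class RitStateSketch {

    /** Stages of a region open that a region server could record (assumed set). */
    enum OpenStage {
        WAITING_FOR_OPENING, // znode not yet moved from OFFLINE to OPENING
        OPENING_REGION,      // inside openRegion()
        UPDATING_META,       // inside updateMeta()
        MARKING_OPENED,      // moving the znode from OPENING to OPENED
        ADDING_TO_ONLINE     // about to be added to the online regions
    }

    /** Region name mapped to the stage of the open running on this server. */
    private final java.util.Map<String, OpenStage> regionsInTransition =
            new java.util.concurrent.ConcurrentHashMap<>();

    /** Called by the (hypothetical) open handler each time it moves to a new stage. */
    void recordStage(String region, OpenStage stage) {
        regionsInTransition.put(region, stage);
    }

    /** Called once the open has completed or failed. */
    void finished(String region) {
        regionsInTransition.remove(region);
    }

    /** Stage to report alongside the exception; null if no open is running. */
    OpenStage stageOf(String region) {
        return regionsInTransition.get(region);
    }

    /**
     * Master-side decision under one reading of the description: a failure in
     * openRegion() or updateMeta() moves the znode to FAILED_OPEN, so the
     * running open can be left alone; any other stage is treated as retryable.
     */
    static boolean shouldRetry(OpenStage reported) {
        if (reported == null) {
            return true; // assumption: nothing is in flight, so a fresh assign is safe
        }
        switch (reported) {
            case OPENING_REGION:
            case UPDATING_META:
                return false;
            default:
                return true;
        }
    }
}
```

With this shape, the region server would attach `stageOf(region)` to the exception it throws, and the master would call `shouldRetry(...)` on the reported stage instead of treating every RAITE the same way.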
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456440#comment-13456440 ]

Ted Yu commented on HBASE-6438:
---

I tried compilation against hadoop 2.0:
{code}
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] HBase . SUCCESS [0.264s]
[INFO] HBase - Common SUCCESS [6.768s]
[INFO] HBase - Hadoop Compatibility .. SUCCESS [0.257s]
[INFO] HBase - Hadoop Two Compatibility .. SUCCESS [1.290s]
[INFO] HBase - Server SUCCESS [38.820s]
[INFO] HBase - Hadoop One Compatibility .. SUCCESS [0.431s]
[INFO] HBase - Integration Tests . SUCCESS [0.959s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 53.259s
[INFO] Finished at: Sat Sep 15 09:35:48 PDT 2012
{code}
Running test suite locally.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456374#comment-13456374 ]

rajeshbabu commented on HBASE-6438:
---

@Ted, I am not getting any compilation failures with hadoop 2.0 profile.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456368#comment-13456368 ]

Ted Yu commented on HBASE-6438:
---

From https://builds.apache.org/job/PreCommit-HBASE-Build/2881/console:
{code}
HBASE-6438 patch is being downloaded at Sat Sep 15 10:31:35 UTC 2012 from
http://issues.apache.org/jira/secure/attachment/12545274/HBASE-6438-trunk_2.patch
...
== Checking against hadoop 2.0 build == == == ==
Finished build.
{code}
Maybe the patch doesn't compile against hadoop 2.0?

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456366#comment-13456366 ]

Hadoop QA commented on HBASE-6438:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545273/HBASE-6438_94_4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2880//console

This message is automatically generated.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456363#comment-13456363 ]

rajeshbabu commented on HBASE-6438:
---

Submitted trunk patch to Hadoop QA just to run tests.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455969#comment-13455969 ]

Ted Yu commented on HBASE-6438:
---

@Rajesh:
Latest patch looks good.
nit: else keyword is not needed below:
{code}
+return -1;
+ } else {
{code}
Please produce patch for trunk and let Hadoop QA run the tests.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch
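The nit in the review comment above (an `else` that follows a branch ending in `return`) can be illustrated with a small stand-alone sketch. The method names and the `failed`/`value` parameters are hypothetical; the actual patch code around `return -1;` is not shown in this thread, so this is not a reconstruction of it.

```java
// Hypothetical illustration of the review nit; not the code from the patch.
final class ElseNitSketch {

    // Before: the else is redundant because the if-branch always returns.
    static int withElse(boolean failed, int value) {
        if (failed) {
            return -1;
        } else {
            return value;
        }
    }

    // After: same behavior, one less level of nesting.
    static int withoutElse(boolean failed, int value) {
        if (failed) {
            return -1;
        }
        return value;
    }
}
```

Both methods return the same result for every input; dropping the `else` only changes how the code reads.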
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455937#comment-13455937 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

[~jenraj]
Thanks for the patch. This patch addresses only the current JIRA. Maryann's latest patch addresses HBASE-6299.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455921#comment-13455921 ]

Hadoop QA commented on HBASE-6438:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545165/HBASE-6438_94_3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2869//console

This message is automatically generated.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94_3.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454956#comment-13454956 ]

Lars Hofhansl commented on HBASE-6438:
---

I'm fine either way. 0.94.2RC0 is not spun yet. If we can get this in quickly I can pull it into that RC.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454929#comment-13454929 ]

Ted Yu commented on HBASE-6438:
---

I think separating the fix would make discussion easier.
Thanks

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454906#comment-13454906 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Lars/@Ted
Maryann has come up with a patch for HBASE-6299 where there is no retry on SocketTimeOut. Maybe, if Maryann is fine with it, we can merge both; otherwise we can handle HBASE-6438 separately.

> Fix For: 0.96.0, 0.92.3, 0.94.3
> Attachments: HBASE-6438_2.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453734#comment-13453734 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Lars
Thanks for your review. Your comment makes sense. We can update the patch based on this.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453698#comment-13453698 ]

Lars Hofhansl commented on HBASE-6438:
--

So the SocketTimeoutException is from HBASE-6299... Still, the question remains: why single out this exception? If the region was assigned in the meantime, we're good, regardless of the exception.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453227#comment-13453227 ]

Lars Hofhansl commented on HBASE-6438:
--

Patch looks good.
Why the special check for SocketTimeoutException? Could we just check isRegionInTransition(region), log an INFO and then return (because the region has in fact transitioned)?
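Lars's suggestion above could look something like the following sketch. It is written under assumptions: `AssignRetrySketch` and `shouldRetryAfterFailure` are hypothetical names, the set of region names stands in for whatever the master uses to answer isRegionInTransition(region), and `java.util.logging` replaces the project's own logging so the example is self-contained. The check deliberately ignores the exception type.

```java
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical helper; not part of HBase.
final class AssignRetrySketch {
  private static final Logger LOG = Logger.getLogger(AssignRetrySketch.class.getName());

  // Returns false (do not retry) when the region is already in transition,
  // whatever exception the assign call failed with; true otherwise.
  static boolean shouldRetryAfterFailure(String regionName, Exception cause,
      Set<String> regionsInTransition) {
    if (regionsInTransition.contains(regionName)) {
      LOG.info("Assign of " + regionName + " failed with " + cause
          + " but the region is already in transition; not retrying");
      return false;
    }
    return true;
  }
}
```

With this shape there is no special case for SocketTimeoutException: any failure is handled the same way once the region is known to be in transition.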
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451876#comment-13451876 ]

rajeshbabu commented on HBASE-6438:
---

@Ted
When I run the test suite locally, the test cases below always fail (even without this patch) because of environment problems.
I ran the failed tests individually in our Jenkins multiple times, and they always pass.
{code}
Failed tests:
  testPermMask(org.apache.hadoop.hbase.util.TestFSUtils): expected: but was:

Tests in error:
  testCacheOnWriteInSchema[1](org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema): Target HLog directory already exists: /mnt/F/hbase94Com/target/test-data/8a5bb561-edfc-4fab-9358-7ab726cb44fc/TestCacheOnWriteInSchema/logs
  testCacheOnWriteInSchema[2](org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema): Target HLog directory already exists: /mnt/F/hbase94Com/target/test-data/8a5bb561-edfc-4fab-9358-7ab726cb44fc/TestCacheOnWriteInSchema/logs
  testWholesomeSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransaction): Failed delete of /mnt/F/hbase94Com/target/test-data/9d7234b4-1f6a-42a7-bbb1-641eb464b7e6/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/table/4bbe087ebab2243b8b9633bb3d870f4c
  testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction): Failed delete of /mnt/F/hbase94Com/target/test-data/4afca7c8-ee29-47fb-b660-f2ee661bced7/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/table/ad08ee3070175df954844582816d5927
  testOffPeakCompactionRatio(org.apache.hadoop.hbase.regionserver.TestCompactSelection): Target HLog directory already exists: /mnt/F/hbase94Com/target/test-data/dd6ca8f4-4321-42d8-825b-fc6a42ab84c0/TestCompactSelection/logs

Tests run: 1590, Failures: 1, Errors: 5, Skipped: 12

Running org.apache.hadoop.hbase.regionserver.TestSplitTransaction
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.383 sec

Results :

Tests run: 7, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.regionserver.TestCompactSelection
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.264 sec

Results :

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.564 sec

Results :

Tests run: 3, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.util.TestFSUtils
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.43 sec

Results :

Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
{code}
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451608#comment-13451608 ]

Ted Yu commented on HBASE-6438:
---

@Rajesh:
Do all tests pass?

@Stack, @Lars:
Can you take a look?

Thanks
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451560#comment-13451560 ]

Hadoop QA commented on HBASE-6438:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12544396/HBASE-6438_2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2833//console

This message is automatically generated.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450844#comment-13450844 ]

Ted Yu commented on HBASE-6438:
---

{code}
+if (LOG.isDebugEnabled()) {
+  LOG.debug(t.getMessage());
{code}
Please add some text preceding t.getMessage() so that we know the context of the debug log.
{code}
+// double assignments. In case of RITException reassigning to same RS.
{code}
Please expand RITException to RegionAlreadyInTransitionException.
Same here:
{code}
+ * - true if we need to retry assignment because of RITException.
{code}
and here:
{code}
+LOG.debug("Unexpected state : " + state + " but retrying to assign because RITException.");
{code}
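Ted's first request above (put some context in front of t.getMessage()) might be addressed along these lines. This is a sketch only: `DebugContextSketch`, `describe`, and `logFailure` are hypothetical names, the message wording is invented for illustration, and `java.util.logging` stands in for the logger used in the patch, with `isLoggable(Level.FINE)` playing the role of isDebugEnabled().

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; not taken from the patch under review.
final class DebugContextSketch {
  private static final Logger LOG = Logger.getLogger(DebugContextSketch.class.getName());

  // Puts the region name and a short description in front of the raw exception message.
  static String describe(String regionName, Throwable t) {
    return "Failed assignment of region " + regionName + ": " + t.getMessage();
  }

  static void logFailure(String regionName, Throwable t) {
    // Guard the string concatenation the same way the quoted diff guards LOG.debug().
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(describe(regionName, t));
    }
  }
}
```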
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450550#comment-13450550 ]

rajeshbabu commented on HBASE-6438:
---

Test suite passes locally with this patch.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450544#comment-13450544 ]

rajeshbabu commented on HBASE-6438:
---

Patch for 0.94 addressing Stack's and Ted's comments. This patch also includes the fix for HBASE-6299.
Please provide your comments/suggestions. If this patch is OK, I will prepare a patch for 0.92 as well.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450543#comment-13450543 ]

Hadoop QA commented on HBASE-6438:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12544203/HBASE-6438_94.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2813//console

This message is automatically generated.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448843#comment-13448843 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

@Stack
Sorry for missing this review comment all these days. We would like to get in HBASE-6299 as well as this patch. As you mentioned, can we give a patch for 0.94 and 0.92 combining both?
We hit HBASE-6299 recently in one of our tests. Both fixes should be useful.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438556#comment-13438556 ]

Hadoop QA commented on HBASE-6438:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540041/HBASE-6438_trunk.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).

-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.TestDrainingServer

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2637//console

This message is automatically generated.
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432331#comment-13432331 ]

Zhihong Ted Yu commented on HBASE-6438:
---

Patch makes sense.
{code}
+// If region opened on destination of present plan and reassign to new RS may cause
{code}
nit: ' and reassign to new RS may cause' -> ', reassigning to new RS may cause'
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432008#comment-13432008 ]

ramkrishna.s.vasudevan commented on HBASE-6438:
---

This patch removes the dependency on the TM when a RegionAlreadyInTransitionException occurs. Please share your suggestions.