[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570983#comment-13570983 ]

Hudson commented on HBASE-7034:
---
Integrated in HBase-0.94-security-on-Hadoop-23 #11 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/11/])
HBASE-7034 Bad version, failed OPENING to OPENED but master thinks it is open anyways (Anoop) (Revision 1434797)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/zookeeper/TestRecoverableZooKeeper.java

Bad version, failed OPENING to OPENED but master thinks it is open anyways
--------------------------------------------------------------------------

Key: HBASE-7034
URL: https://issues.apache.org/jira/browse/HBASE-7034
Project: HBase
Issue Type: Bug
Components: Region Assignment
Affects Versions: 0.94.2
Reporter: stack
Assignee: Anoop Sam John
Fix For: 0.96.0, 0.94.5
Attachments: HBASE-7034_94.patch, HBASE-7034_94_V2.patch, HBASE-7034_Test_Trunk.patch, TestRecoverableZooKeeper.java

I have this in RS log:
{code}
2012-10-22 02:21:50,698 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed transitioning node b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f. from OPENING to OPENED -- closing region
org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
{code}

Master says this (it is bulk assigning):
{code}
2012-10-22 02:21:40,673 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Set watcher on existing znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
...
then this
2012-10-22 02:23:47,089 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Set watcher on existing znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
2012-10-22 02:24:34,176 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Retrieved 112 byte(s) of data from znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f and set watcher; region=b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f., origin=sv4r17s44,10304,1350872216778, state=RS_ZK_REGION_OPENED
etc.
{code}

Disagreement as to what is going on here.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
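The report does not spell out how the region server and the master came to disagree. One sequence that would be consistent with the two logs, assumed here purely for illustration, is a version-checked write that is applied on the server while its reply is lost, followed by a retry that still carries the old expected version. The sketch below models that with a toy in-memory znode; `ToyZnode` and `RetryDemo` are hypothetical names and none of this is the ZooKeeper client or HBase code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy in-memory stand-in for a single znode. Hypothetical; NOT the ZooKeeper client API. */
final class ToyZnode {
    private byte[] data;
    private int version = 0;

    ToyZnode(byte[] initial) {
        this.data = initial.clone();
    }

    /** Conditional write: applied only when expectedVersion matches; the version is then bumped. */
    int setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != version) {
            throw new IllegalStateException(
                "BadVersion: expected " + expectedVersion + ", actual " + version);
        }
        data = newData.clone();
        return ++version;
    }

    byte[] getData() {
        return data.clone();
    }

    int getVersion() {
        return version;
    }
}

final class RetryDemo {
    /**
     * Simulates a write whose first attempt is applied but whose reply is lost,
     * followed by a blind retry with the same expected version.
     * Returns true when the retry is rejected although the new data is already stored.
     */
    static boolean retryRejectedButDataWritten(ToyZnode node, byte[] newData) {
        int expected = node.getVersion();
        node.setData(newData, expected);      // applied; pretend the reply never arrived
        try {
            node.setData(newData, expected);  // retry carries the now-stale version
            return false;
        } catch (IllegalStateException badVersion) {
            // The writer sees a failure, yet a reader of the node sees the new state.
            return Arrays.equals(node.getData(), newData);
        }
    }

    public static void main(String[] args) {
        ToyZnode node = new ToyZnode("OPENING".getBytes(StandardCharsets.UTF_8));
        byte[] opened = "OPENED".getBytes(StandardCharsets.UTF_8);
        System.out.println(retryRejectedButDataWritten(node, opened));
    }
}
```

In this model the writer would give up on the transition while any watcher reading the node would see the new state, which is the shape of the disagreement reported here.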
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561725#comment-13561725 ]

Hudson commented on HBASE-7034:
---
Integrated in HBase-0.94-security #96 (See [https://builds.apache.org/job/HBase-0.94-security/96/])
HBASE-7034 Bad version, failed OPENING to OPENED but master thinks it is open anyways (Anoop) (Revision 1434797)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/zookeeper/TestRecoverableZooKeeper.java
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557366#comment-13557366 ]

ramkrishna.s.vasudevan commented on HBASE-7034:
---
HBASE-6858 has the fix for trunk.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556334#comment-13556334 ]

Ted Yu commented on HBASE-7034:
---
+1 on patch v2.
Patch v2 aligns with trunk code.
It would be nice to have a trunk patch which contains the new test.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556354#comment-13556354 ]

Anoop Sam John commented on HBASE-7034:
---
Oh.. Now when I looked at the trunk code to apply the fix, it is already fixed there. :)
How did we miss 94 then!!
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556356#comment-13556356 ]

Anoop Sam John commented on HBASE-7034:
---
bq. It would be nice to have a trunk patch which contains the new test.
Let me make that then, Ted. I hadn't seen your comment; I was trying to apply the fix in trunk :)
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556396#comment-13556396 ]

Ted Yu commented on HBASE-7034:
---
Ran the new test and it passed in both 0.94 and trunk.
Integrated to 0.94 and trunk.
Thanks for the patch, Anoop.
Thanks for the review, Ram.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556470#comment-13556470 ]

Hudson commented on HBASE-7034:
---
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #355 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/355/])
HBASE-7034 Bad version, failed OPENING to OPENED but master thinks it is open anyways (Anoop) (Revision 1434800)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestRecoverableZooKeeper.java
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556522#comment-13556522 ]

Hudson commented on HBASE-7034:
---
Integrated in HBase-TRUNK #3762 (See [https://builds.apache.org/job/HBase-TRUNK/3762/])
HBASE-7034 Bad version, failed OPENING to OPENED but master thinks it is open anyways (Anoop) (Revision 1434800)

Result = FAILURE
tedyu :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/zookeeper/TestRecoverableZooKeeper.java
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554042#comment-13554042 ]

Ted Yu commented on HBASE-7034:
---
Patch looks good - TestRecoverableZooKeeper passes with patch and fails without patch.
Can you add TestRecoverableZooKeeper to the patch as well?
Please add license header and test category to TestRecoverableZooKeeper.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554144#comment-13554144 ]

ramkrishna.s.vasudevan commented on HBASE-7034:
---
@Anoop
I feel that we should check whether the data part was set correctly. The id part is going to be the same anyway, I feel.
bq. we need to compare the id read from the zookeeper and the id for this process (this.id)
Just referring to the above comment. I may be wrong. I know you will have studied this sufficiently.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554663#comment-13554663 ]

Anoop Sam John commented on HBASE-7034:
---
bq. I feel that we should check for the data part whether it was set correctly. The id part is any way going to be the same i feel.
Yes Ram, I am thinking so too. This was not the final patch anyway; I was thinking that I can do this in the final patch. This one was for feedback from you guys. :)
As I read the code, the intent was to check this process's id against the id in the ZooKeeper data. The check tells us whether, while this process was trying to change the data in ZooKeeper, another process had already changed it. In such a case this process gives up the operation. That is the whole reason why we add the id to the data. :)
The old check was wrong anyway: it compared the id against the data.

Yes Ted, I can add the test case in the final patch.
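The id check discussed in this comment can be sketched as follows. This is a minimal illustration only, assuming a made-up layout of a 4-byte id length, then the id bytes, then the payload; `IdFramedData`, `frame`, `writtenBy` and `idEqualsPayload` are hypothetical names, and the code is not taken from `RecoverableZooKeeper` or from any of the attached patches. `writtenBy` shows the intended comparison (embedded id against this process's id), while `idEqualsPayload` shows the kind of comparison described as wrong (id against data inside the same buffer).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical framing: [4-byte id length][id bytes][payload]. The layout is an assumption. */
final class IdFramedData {
    static final int ID_LENGTH_SIZE = Integer.BYTES;

    /** Prefixes the payload with the writer's id so a later reader can tell who wrote it. */
    static byte[] frame(byte[] id, byte[] payload) {
        return ByteBuffer.allocate(ID_LENGTH_SIZE + id.length + payload.length)
                .putInt(id.length)
                .put(id)
                .put(payload)
                .array();
    }

    /** Reads the embedded id length, or -1 if the buffer is too short or the length is invalid. */
    private static int embeddedIdLength(byte[] revData) {
        if (revData.length < ID_LENGTH_SIZE) {
            return -1;
        }
        int idLength = ByteBuffer.wrap(revData).getInt();
        return (idLength < 0 || idLength > revData.length - ID_LENGTH_SIZE) ? -1 : idLength;
    }

    /** Intended check: is the id embedded in the stored bytes the id of this process? */
    static boolean writtenBy(byte[] revData, byte[] myId) {
        int idLength = embeddedIdLength(revData);
        if (idLength < 0) {
            return false;
        }
        return Arrays.equals(revData, ID_LENGTH_SIZE, ID_LENGTH_SIZE + idLength,
                             myId, 0, myId.length);
    }

    /** Flawed check: compares the embedded id with the payload of the same buffer. */
    static boolean idEqualsPayload(byte[] revData) {
        int idLength = embeddedIdLength(revData);
        if (idLength < 0) {
            return false;
        }
        int dataOffset = ID_LENGTH_SIZE + idLength;
        return Arrays.equals(revData, ID_LENGTH_SIZE, dataOffset,
                             revData, dataOffset, revData.length);
    }
}
```

Under these assumptions the flawed check only returns true when the payload happens to equal the id, so it says nothing about which process wrote the node. Ram's suggestion of also comparing the data part would amount to an additional equality test on the payload range.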
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553523#comment-13553523 ]

stack commented on HBASE-7034:
---
[~anoopsamjohn] I don't know (I would swear we were in here recently looking at this exact area of the code..)
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553536#comment-13553536 ]

Anoop Sam John commented on HBASE-7034:
---
Stack - I will try with some sort of unit tests. I believe this caused the issue
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13552348#comment-13552348 ] stack commented on HBASE-7034: -- [~anoopsamjohn] You mean this bit Anoop? {code} if(Bytes.compareTo(revData, ID_LENGTH_SIZE, id.length, revData, dataOffset, dataLength) == 0) { {code}
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13552444#comment-13552444 ] Anoop Sam John commented on HBASE-7034: --- Yep the same. What does this compare?
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13551942#comment-13551942 ] Anoop Sam John commented on HBASE-7034: --- Will submit patch.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13550900#comment-13550900 ] Anoop Sam John commented on HBASE-7034: --- Did this code come in by mistake?
{code}
RecoverableZooKeeper#setData(String path, byte[] data, int version){
  byte[] revData = zk.getData(path, false, stat);
  int idLength = Bytes.toInt(revData, ID_LENGTH_SIZE);
  int dataLength = revData.length-ID_LENGTH_SIZE-idLength;
  int dataOffset = ID_LENGTH_SIZE+idLength;
  if(Bytes.compareTo(revData, ID_LENGTH_SIZE, id.length, revData, dataOffset, dataLength) == 0) {
    // the bad version is caused by previous successful setData
    return stat;
  }
}
{code}
When we write data to ZooKeeper, we also write an identifier for the writing process. To check whether a BADVERSION exception from ZooKeeper is due to a previous setData from the same process, we need to compare the id read from ZooKeeper with the id of this process (this.id). Or am I missing something? The offset and length math above, and the compare itself, look problematic to me. If so, I guess that is the cause of this bug.
From the log it is clear that there was no problem with the node and its version at first. [As part of the state transition from OPENING to OPENED, the present data is read first, and the check shows that the data and its version are fine.] Immediately afterwards a connection loss happened, which triggers a retry of the setData. Maybe the previous operation had already changed the data in ZooKeeper, and the master got the data-changed event (?). I think correcting the above code may solve the problem.
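To make the intended check concrete, here is a minimal sketch of comparing the writer id stored in the znode data against this process's own id. It is not the HBase patch: the class name WriterIdCheck, the helpers wrap and writtenBy, and the byte layout (4-byte id length, then id, then payload) are assumptions made for illustration only, and it uses plain java.nio rather than HBase's Bytes utilities.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not taken from RecoverableZooKeeper.
final class WriterIdCheck {
    // Assumed layout: [4-byte big-endian id length][id bytes][payload bytes].
    static final int ID_LENGTH_SIZE = 4;

    // Prefix the payload with the writer's id, following the assumed layout.
    static byte[] wrap(byte[] id, byte[] payload) {
        return ByteBuffer.allocate(ID_LENGTH_SIZE + id.length + payload.length)
                .putInt(id.length)
                .put(id)
                .put(payload)
                .array();
    }

    // True only if the id stored in revData equals myId.
    // The comparison is against myId, not against another slice of revData.
    static boolean writtenBy(byte[] revData, byte[] myId) {
        if (revData == null || revData.length < ID_LENGTH_SIZE) {
            return false;
        }
        int idLength = ByteBuffer.wrap(revData).getInt();
        if (idLength < 0 || idLength > revData.length - ID_LENGTH_SIZE) {
            return false; // malformed or foreign data
        }
        byte[] storedId = Arrays.copyOfRange(
                revData, ID_LENGTH_SIZE, ID_LENGTH_SIZE + idLength);
        return Arrays.equals(storedId, myId);
    }

    public static void main(String[] args) {
        byte[] me = "rs-1".getBytes(StandardCharsets.UTF_8);
        byte[] other = "rs-2".getBytes(StandardCharsets.UTF_8);
        byte[] znode = wrap(me, "OPENED".getBytes(StandardCharsets.UTF_8));
        System.out.println(writtenBy(znode, me));    // true
        System.out.println(writtenBy(znode, other)); // false
    }
}
```

Under these assumptions, a retry that hits BADVERSION would treat the earlier setData as successful only when writtenBy returns true for the data currently in the znode.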
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13482160#comment-13482160 ] ramkrishna.s.vasudevan commented on HBASE-7034: --- @Stack Does this happen in 0.94.2? Which RS failed to transition the node to OPENED: the new one, or the one that had already transitioned it to OPENED? Sorry for the questions; without the full logs I had these doubts. Thanks, Stack, for reporting this; it should be useful for us as well.
[jira] [Commented] (HBASE-7034) Bad version, failed OPENING to OPENED but master thinks it is open anyways
[ https://issues.apache.org/jira/browse/HBASE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13482739#comment-13482739 ] stack commented on HBASE-7034: -- Here is more from log Ram. Master side listing of all lines related to 1349052737638.9af7cfc9b15910a0b3d714bf40a3248f
{code}
2012-10-22 02:20:01,351 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:10302-0xb3a862e57a503ba Async create of unassigned node for 9af7cfc9b15910a0b3d714bf40a3248f with OFFLINE state
2012-10-22 02:20:24,577 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Set watcher on existing znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
2012-10-22 02:21:08,770 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback: rs=b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f. state=OFFLINE, ts=1350872401351, server=null, server=sv4r17s44,10304,1350872216778
2012-10-22 02:21:09,495 DEBUG org.apache.hadoop.hbase.master.AssignmentManager$ExistsUnassignedAsyncCallback: rs=b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f. state=OFFLINE, ts=1350872401351, server=null
2012-10-22 02:21:40,673 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Set watcher on existing znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
2012-10-22 02:23:47,089 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Set watcher on existing znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
2012-10-22 02:24:34,154 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:10302-0xb3a862e57a503ba Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
2012-10-22 02:24:34,176 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Retrieved 112 byte(s) of data from znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f and set watcher; region=b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f., origin=sv4r17s44,10304,1350872216778, state=RS_ZK_REGION_OPENED
2012-10-22 02:24:34,176 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=sv4r17s44,10304,1350872216778, region=9af7cfc9b15910a0b3d714bf40a3248f, which is more than 15 seconds late
2012-10-22 02:24:34,176 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f. from sv4r17s44,10304,1350872216778; deleting unassigned node
2012-10-22 02:24:34,176 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:10302-0xb3a862e57a503ba Deleting existing unassigned node for 9af7cfc9b15910a0b3d714bf40a3248f that is in expected state RS_ZK_REGION_OPENED
2012-10-22 02:24:34,221 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:10302-0xb3a862e57a503ba Retrieved 112 byte(s) of data from znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f; data=region=b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f., origin=sv4r17s44,10304,1350872216778, state=RS_ZK_REGION_OPENED
2012-10-22 02:24:34,239 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:10302-0xb3a862e57a503ba Successfully deleted unassigned node for region 9af7cfc9b15910a0b3d714bf40a3248f in expected state RS_ZK_REGION_OPENED
2012-10-22 02:27:09,169 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:10302-0xb3a862e57a503ba Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f
2012-10-22 02:27:09,169 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: The znode of region b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f. has been deleted.
2012-10-22 02:27:09,174 INFO org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the region b9,\xEE\xAE\x9BiQO\x89]+a\xE0\x7F\xB7'X?,1349052737638.9af7cfc9b15910a0b3d714bf40a3248f. that was online on sv4r17s44,10304,1350872216778
{code}
The above is being processed some minutes after the regionserver (see below) I guess because it is a bulk assign and there is a lot of other zk'ing going on at this time. Here is the regionserver side.
{code}
2012-10-22 02:21:34,778 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:10304-0xd3a863aa2ee011a Attempting to transition node 9af7cfc9b15910a0b3d714bf40a3248f from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
2012-10-22 02:21:34,801 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:10304-0xd3a863aa2ee011a Retrieved 112 byte(s) of data from znode /hbase/unassigned/9af7cfc9b15910a0b3d714bf40a3248f;