[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-10-26 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6070:
--


@stack 

Thanks. I want to get a second opinion from others. I guess it is better to 
do this by opening a separate jira. I have created HBASE-7058 for this purpose. 
If other people find no other potential problem, I can provide a patch.  

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.94.1, 0.96.0

 Attachments: HBASE-6070_0.92_1.patch, HBASE-6070_0.92.patch, 
 HBASE-6070_0.94_1.patch, HBASE-6070_0.94.patch, HBASE-6070_trunk_1.patch, 
 HBASE-6070_trunk.patch


 We tried to address the problems with Master restart and RS restart while a 
 region SPLIT is in progress as part of HBASE-5806.
 While looking further we found that one race condition still remains:
 1. Split has just started and the znode is in RS_SPLIT state.
 2. RS goes down.
 3. First callback for SSH comes.
 4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
 5. But now the nodeDeleted event comes for the SPLIT node, and there we try to 
 delete the region from RIT.
 6. After this, SSH checks whether any region is in RIT.  As we don't find the 
 region in RIT, the region is never assigned.
 When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so 
 we missed it.  Now we have found it. Will come up with a patch shortly.
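
 The ordering problem between step 5 and step 6 can be sketched with a small 
 toy model. This is only an illustration under assumptions: ToyMaster, 
 nodeDeleted, shutdownHandlerAssign and the region name are hypothetical 
 stand-ins, not the real AssignmentManager or ServerShutdownHandler code, and 
 the real callbacks run on separate threads rather than in a fixed order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the race described above; every name is hypothetical, not HBase code. */
public class SplitRaceSketch {

    /** Stand-in for the master's regions-in-transition (RIT) bookkeeping. */
    static final class ToyMaster {
        final Map<String, String> regionsInTransition = new HashMap<>();
        final List<String> assigned = new ArrayList<>();

        /** Models the nodeDeleted callback (step 5): clear the region from RIT. */
        void nodeDeleted(String region) {
            regionsInTransition.remove(region);
        }

        /** Models the late SSH check (step 6): reassign only regions still found in RIT. */
        void shutdownHandlerAssign(String region) {
            if (regionsInTransition.remove(region) != null) {
                assigned.add(region);
            }
        }
    }

    /** Runs both callbacks in the chosen order; returns true if the region was reassigned. */
    public static boolean regionGetsAssigned(boolean nodeDeletedFirst) {
        ToyMaster master = new ToyMaster();
        String region = "region-A";
        // Steps 1-4: the split has started, the RS died, and the region is known to be in RIT.
        master.regionsInTransition.put(region, "SPLITTING");
        if (nodeDeletedFirst) {
            master.nodeDeleted(region);           // step 5
            master.shutdownHandlerAssign(region); // step 6 finds nothing in RIT
        } else {
            master.shutdownHandlerAssign(region); // step 6 still sees the region
            master.nodeDeleted(region);           // step 5 is now harmless
        }
        return master.assigned.contains(region);
    }

    public static void main(String[] args) {
        System.out.println("step 6 before step 5: assigned = " + regionGetsAssigned(false));
        System.out.println("step 5 before step 6: assigned = " + regionGetsAssigned(true));
    }
}
```

 In this model the region is reassigned only when the SSH check runs before 
 nodeDeleted clears RIT, which matches the ordering seen while fixing 
 HBASE-5806; in the opposite order the region is left unassigned.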

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-10-24 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-6070:
--


@ram,  

I am reading the code related to region split. I feel that the code below in 
AssignmentManager seems to be dead code, because 1) I don't see any place that 
calls to update the regionState to State.SPLIT, and 2) for the scenario where 
the region has already been split and the RS crashed, ServerShutdownHandler 
should have already taken care of it.  Am I missing something here?  Thanks

if (rs.isSplit()) {
  LOG.debug("Ephemeral node deleted, regionserver crashed?, " +
    "clearing from RIT; rs=" + rs);
  regionOffline(rs.getRegion());
}


 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.94.1, 0.96.0

 Attachments: HBASE-6070_0.92_1.patch, HBASE-6070_0.92.patch, 
 HBASE-6070_0.94_1.patch, HBASE-6070_0.94.patch, HBASE-6070_trunk_1.patch, 
 HBASE-6070_trunk.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-27 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Hadoop Flags: Reviewed

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, 
 HBASE-6070_trunk_1.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, 
 HBASE-6070_trunk_1.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_0.94.patch

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.94.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_0.92.patch

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.94.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_trunk.patch

Uploaded patches for all branches.  Tested in cluster including scenarios for 
HBASE-5806.  Pls review and provide your comments.

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.94.patch, 
 HBASE-6070_trunk.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Status: Patch Available  (was: Open)

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.94.patch, 
 HBASE-6070_trunk.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Status: Open  (was: Patch Available)

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_trunk.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_0.92_1.patch

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_trunk.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_0.94_1.patch

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, 
 HBASE-6070_trunk_1.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Status: Patch Available  (was: Open)

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, 
 HBASE-6070_trunk_1.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_trunk_1.patch

Updated patches fixing the comments.  I tried running the failed testcase.  It 
passed every time.

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, 
 HBASE-6070_trunk_1.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: (was: HBASE-6070_trunk_1.patch)

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch


[jira] [Updated] (HBASE-6070) AM.nodeDeleted and SSH races creating problems for regions under SPLIT

2012-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-6070:
--

Attachment: HBASE-6070_trunk_1.patch

Just reattaching the patch.

 AM.nodeDeleted and SSH races creating problems for regions under SPLIT
 --

 Key: HBASE-6070
 URL: https://issues.apache.org/jira/browse/HBASE-6070
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-6070_0.92.patch, HBASE-6070_0.92_1.patch, 
 HBASE-6070_0.94.patch, HBASE-6070_0.94_1.patch, HBASE-6070_trunk.patch, 
 HBASE-6070_trunk_1.patch

