[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: HBASE-5270-92v11.patch

patchv11 for 0.92

> Handle potential data loss due to concurrent processing of processFaileOver
> and ServerShutdownHandler
> ---------------------------------------------------------------------------
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Assignee: chunhui shen
> Fix For: 0.92.2
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, HBASE-5270-92v11.patch, HBASE-5270v11.patch,
> hbase-5270.patch, hbase-5270v10.patch, hbase-5270v2.patch,
> hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch,
> hbase-5270v7.patch, hbase-5270v8.patch, hbase-5270v9.patch, sampletest.txt
>
>
> This JIRA continues the effort from HBASE-5179. Starting with Stack's
> comments about patches for 0.92 and TRUNK:
> Reviewing 0.92v17
> isDeadServerInProgress is a new public method in ServerManager but it does
> not seem to be used anywhere.
> Does isDeadRootServerInProgress need to be public? Ditto for meta version.
> This method's param names are not right: 'definitiveRootServer'; what is
> meant by definitive? Do they need this qualifier?
> Is there anything in place to stop us expiring a server twice if it's
> carrying root and meta?
> What is the difference between asking assignment manager isCarryingRoot and
> this variable that is passed in? Should be doc'd at least. Ditto for meta.
> I think I've asked for this a few times - onlineServers needs to be
> explained... either in javadoc or in comment. This is the param passed into
> joinCluster. How does it arise? I think I know but am unsure. God love the
> poor noob that comes awandering this code trying to make sense of it all.
> It looks like we get the list by trawling zk for regionserver znodes that
> have not checked in. Don't we do this operation earlier in master setup? Are
> we doing it again here?
> Though distributed split log is configured, we will do in-master,
> single-process splitting under some conditions with this patch. It's not
> explained in code why we would do this. Why do we think master log splitting
> is 'high priority' when it could very well be slower? Should we only go this
> route if distributed splitting is not going on? Do we know if concurrent
> distributed log splitting and master splitting works?
> Why would we have dead servers in progress here in master startup? Because a
> servershutdownhandler fired?
> This patch is different to the patch for 0.90. Should go into trunk first
> with tests, then 0.92. Should it be in this issue? This issue is really hard
> to follow now. Maybe this issue is for 0.90.x and a new issue for more work
> on this trunk patch?
> This patch needs to have the v18 differences applied.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
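One of the review questions quoted above asks whether anything stops a server from being expired twice when it carries both root and meta. A minimal sketch of one way to make such a guard idempotent follows. It is not taken from any of the attached patches: the class DeadServerTracker and the methods tryBeginExpire and finishExpire are hypothetical names, and only isDeadServerInProgress borrows its name from the review comment. The sketch assumes Java 8 or later for ConcurrentHashMap.newKeySet().

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical tracker (not HBase's ServerManager): remembers which servers
 * currently have shutdown processing under way.
 */
final class DeadServerTracker {
    // Concurrent set view backed by ConcurrentHashMap; add() is atomic,
    // so for a given server name exactly one caller sees "true".
    private final Set<String> inProgress = ConcurrentHashMap.newKeySet();

    /** Returns true only for the first caller that marks the server as expiring. */
    boolean tryBeginExpire(String serverName) {
        return inProgress.add(serverName);
    }

    /** Reports whether shutdown processing for the server has started and not finished. */
    boolean isDeadServerInProgress(String serverName) {
        return inProgress.contains(serverName);
    }

    /** Called when shutdown processing for the server has completed. */
    void finishExpire(String serverName) {
        inProgress.remove(serverName);
    }
}
```

With a guard of this shape, a caller that discovers the dead server through the root path and another that discovers it through the meta path would both call tryBeginExpire, and only the one that gets true back would go on to schedule shutdown handling.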
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5270:
-------------------------

    Status: Open  (was: Patch Available)

> Fix For: 0.92.2
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, HBASE-5270v11.patch, hbase-5270.patch,
> hbase-5270v10.patch, hbase-5270v2.patch, hbase-5270v4.patch,
> hbase-5270v5.patch, hbase-5270v6.patch, hbase-5270v7.patch,
> hbase-5270v8.patch, hbase-5270v9.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: HBASE-5270v11.patch

minor items addressed in patchv11
https://reviews.apache.org/r/4021/

> Fix For: 0.92.2
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, HBASE-5270v11.patch, hbase-5270.patch,
> hbase-5270v10.patch, hbase-5270v2.patch, hbase-5270v4.patch,
> hbase-5270v5.patch, hbase-5270v6.patch, hbase-5270v7.patch,
> hbase-5270v8.patch, hbase-5270v9.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5270:
------------------------------

    Status: Patch Available  (was: Open)

> Fix For: 0.92.2
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v10.patch,
> hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch,
> hbase-5270v6.patch, hbase-5270v7.patch, hbase-5270v8.patch,
> hbase-5270v9.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: hbase-5270v10.patch

> Fix For: 0.92.2
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v10.patch,
> hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch,
> hbase-5270v6.patch, hbase-5270v7.patch, hbase-5270v8.patch,
> hbase-5270v9.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: hbase-5270v9.patch

> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch,
> hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch,
> hbase-5270v7.patch, hbase-5270v8.patch, hbase-5270v9.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: hbase-5270v8.patch

> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch,
> hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch,
> hbase-5270v7.patch, hbase-5270v8.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: hbase-5270v7.patch

> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch,
> hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch,
> hbase-5270v7.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: hbase-5270v6.patch

I'm sorry for the mistake with ConcurrentHashSet.
Thanks, Ted.

> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch,
> hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

    Attachment: hbase-5270v5.patch

> Fix For: 0.92.1, 0.94.0
>
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch,
> 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch,
> 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch,
> hbase-5270v4.patch, hbase-5270v5.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5270:
    Comment: was deleted (was: @Chunhui: Since review board isn't used, do you mind highlighting the new changes in hbase-5270v4.patch ? Thanks)

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Assignee: chunhui shen
> Fix For: 0.92.1, 0.94.0
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, hbase-5270v4.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
    Attachment: hbase-5270v4.patch

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.92.1, 0.94.0
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, hbase-5270v4.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
    Attachment: 5270-90v3.patch

5270-90v3.patch takes Stack's comments into account.

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
    Attachment: 5270-90v2.patch
                5270-90-testcasev2.patch

Testcase and patch for the 0.90 version.

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
    Attachment: 5270-testcasev2.patch
                hbase-5270v2.patch

Optimized the testcase following Stack's sample. hbase-5270v2.patch is a patch that fixes the issue for trunk and includes the testcase.

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5270:
    Attachment: sampletest.txt

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90.patch, 5270-testcase.patch, hbase-5270.patch, sampletest.txt
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5270:
    Attachment: (was: mapreduce-3583-trunk-v2.txt)

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90.patch, 5270-testcase.patch, hbase-5270.patch
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5270:
    Attachment: mapreduce-3583-trunk-v2.txt

Reattaching patch v2 for TRUNK with --no-prefix

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90.patch, 5270-testcase.patch, hbase-5270.patch
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
    Attachment: 5270-90-testcase.patch
                5270-90.patch
                5270-testcase.patch
                hbase-5270.patch

I have written a testcase for this issue, in 5270-testcase.patch, that demonstrates the existing problem.
hbase-5270.patch combines the testcase with the latest patch from HBASE-5179.
5270-90.patch and 5270-90-testcase.patch are for the 0.90 version.
For now, the testcase may not cover all the situations that cause problems, and it runs slowly. I will optimize the testcase later.

> Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
> Issue Type: Sub-task
> Components: master
> Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
> Attachments: 5270-90-testcase.patch, 5270-90.patch, 5270-testcase.patch, hbase-5270.patch